CN109561311A - ρ-domain-based three-dimensional video coding rate control method and storage device - Google Patents
ρ-domain-based three-dimensional video coding rate control method and storage device
- Publication number
- CN109561311A (application number CN201811491526.9A)
- Authority
- CN
- China
- Prior art keywords
- view
- code rate
- image group
- frame
- current
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/146—Data rate or code amount at the encoder output
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/176—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/184—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being bits, e.g. of the compressed video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/50—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding
- H04N19/597—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding specially adapted for multi-view video sequence encoding
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention belongs to the technical field of three-dimensional video coding, and in particular relates to a ρ-domain-based three-dimensional video coding rate control method and a storage device. The ρ-domain-based three-dimensional video coding rate control method comprises the steps of: establishing a ρ-domain code rate model; performing inter-view bit allocation according to the image similarity between viewpoints; and performing frame-layer and basic-unit-layer bit allocation and rate control according to the frame rate, the target buffer capacity, the actual buffer size and the temporal activity complexity of the frame. By establishing the ρ-domain code rate model and, on the basis of high-efficiency multi-view video coding, performing inter-view bit allocation according to the image similarity between viewpoints (that is, using inter-view correlation analysis to give each viewpoint a reasonable bit allocation), and performing frame-layer and basic-unit-layer bit allocation and rate control according to the frame rate, the actual buffer size and the temporal activity complexity of the frame, the limitation of target bit allocation in the prior art is overcome.
Description
Technical Field
The invention belongs to the technical field of three-dimensional video coding, and particularly relates to a ρ-domain-based three-dimensional video coding rate control method and a storage device.
Background
Rate control has long been one of the most important techniques in video coding standards, and no international video compression standard can be put into practical use without it. Standard rate control models are specified for the common international video compression standards such as MPEG-4, H.264, HEVC (High Efficiency Video Coding) and MVC (Multi-view Video Coding), and their rate control technology is very mature. However, no effective rate control algorithm has yet been provided for Multiview High Efficiency Video Coding (MV-HEVC), one of the most recently published international 3D video coding standards.
At present, research on MV-HEVC rate control at home and abroad is far less extensive than research on earlier rate control techniques; most studies concern rate control for MVC, HEVC, H.264 and the like. Lim et al. in the United States proposed a binomial-model-based MVC rate control algorithm that divides all video frames into several different coding-type frames using the geometric relationship between disparity prediction and motion prediction; however, the disparity prediction characteristics between views differ greatly. Seanae Park et al. in Korea considered the influence of hierarchical B frames in the temporal coding layer of multi-view video coding, and their experimental results show that the algorithm maintains high coding efficiency; nevertheless, the average rate control error reported for the test sequences exceeds 1%, which can hardly meet the requirements of practical applications. Lei J et al. established a multi-view video coding rate control algorithm based on the R-λ model with very good experimental results, but the correlation between views is not considered. In the rate control algorithm for stereoscopic video coding proposed by Vizzotto B et al., only the two-view case is considered; when the algorithm is applied to multi-view video coding, as the number of coding types of the coded images increases, the accuracy of the TM5-based target bit allocation deteriorates, resulting in a very high rate control error and bit allocation that is difficult to control.
Many scholars at home and abroad are also engaged in research on HEVC-based multi-view video coding rate control. Shao F et al. proposed allocating bits between texture and depth using a fixed allocation ratio, but this method does not allow different sequences to achieve optimal coding efficiency. Xiao et al. proposed a scalable rate allocation algorithm for different bandwidths. Fang et al. proposed an analytical model for estimating virtual-view distortion in 3D video, estimating the virtual-view distortion caused by depth-map coding distortion with a joint frequency-domain and time-domain analysis method; the estimation model, although accurate, is highly complex.
Pan G et al. proposed a depth-based 3D-HEVC rate control algorithm that adopts a fixed texture-to-depth rate ratio of 4:1, but it cannot obtain optimal virtual viewpoint rendering quality. Xiao J M et al. proposed a depth and texture hierarchical rate control algorithm. Wang X et al. proposed a 3D-HEVC rate control algorithm based on a binomial R-D model, but because the rate control model of H.264 is directly adopted, its rate control accuracy is low.
Disclosure of Invention
Therefore, a multi-level rate control method for MV-HEVC multi-view video coding is needed to address the limitations of traditional rate control models and the inaccurate bit allocation of high-efficiency multi-view video coding.
The specific technical scheme is as follows:
a three-dimensional video coding rate control method based on rho domain comprises the following steps:
establishing a rho domain code rate model;
carrying out bit allocation among the viewpoints according to the image similarity among the viewpoints;
and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
Further, the establishing of the rho domain code rate model further comprises the following steps:
and calculating to obtain model parameters by adopting a multiple regression technology.
Further, the "obtaining model parameters by calculation using multiple regression technique" further includes the steps of:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
Further, the inter-view bit allocation is performed according to the image similarity between the views; according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame, the bit allocation and the code rate control of the frame layer and the basic unit layer are carried out, and the method also comprises the following steps:
Step 1: carrying out bit allocation and code rate control on the multi-view image groups, and calculating the number of bits to be allocated to the current multi-view image group;
Step 2: carrying out bit allocation and code rate control on the single-viewpoint image groups, calculating the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculating the number of bits to be allocated to the current single-viewpoint image group;
Step 3: carrying out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculating the target bit number of the current coding frame;
Step 4: carrying out bit allocation and code rate control on the macroblock layer, calculating the ρ value according to the ρ-domain code rate model, and then calculating the quantization parameter Q_mb of the current macroblock from the ρ value;
Step 5: encoding the current macroblock according to the quantization parameter Q_mb;
Step 6: judging whether all macroblocks in the current frame have been coded; if so, going to step 7; if not, repeating steps 4 to 5 until all macroblocks are coded, and then going to step 7;
Step 7: judging whether all frames in the current single-viewpoint image group have been coded; if so, going to step 8; if not, repeating steps 3 to 6 until all frames of the current image group are coded;
Step 8: judging whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, going to step 9; if not, repeating steps 2 to 7 until all single-viewpoint image groups in the current multi-view image group are coded;
Step 9: judging whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeating steps 1 to 8 until the coding of the whole multi-view sequence is finished.
Further, the step 2 further includes the steps of:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
In order to solve the above technical problem, a storage device is also provided. The specific technical scheme is as follows:
a storage device having stored therein a set of instructions for performing:
establishing a rho domain code rate model;
carrying out bit allocation among the viewpoints according to the image similarity among the viewpoints;
and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
Further, the set of instructions is further for performing:
the method for establishing the rho domain code rate model further comprises the following steps:
and calculating to obtain model parameters by adopting a multiple regression technology.
Further, the set of instructions is further for performing:
the method comprises the following steps of calculating model parameters by adopting a multiple regression technology, and further comprises the following steps:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
Further, the set of instructions is further for performing:
the 'inter-view bit allocation is carried out according to the image similarity among the views'; according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame, the bit allocation and the code rate control of the frame layer and the basic unit layer are carried out, and the method also comprises the following steps:
Step 1: carrying out bit allocation and code rate control on the multi-view image groups, and calculating the number of bits to be allocated to the current multi-view image group;
Step 2: carrying out bit allocation and code rate control on the single-viewpoint image groups, calculating the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculating the number of bits to be allocated to the current single-viewpoint image group;
Step 3: carrying out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculating the target bit number of the current coding frame;
Step 4: carrying out bit allocation and code rate control on the macroblock layer, calculating the ρ value according to the ρ-domain code rate model, and then calculating the quantization parameter Q_mb of the current macroblock from the ρ value;
Step 5: encoding the current macroblock according to the quantization parameter Q_mb;
Step 6: judging whether all macroblocks in the current frame have been coded; if so, going to step 7; if not, repeating steps 4 to 5 until all macroblocks are coded, and then going to step 7;
Step 7: judging whether all frames in the current single-viewpoint image group have been coded; if so, going to step 8; if not, repeating steps 3 to 6 until all frames of the current image group are coded;
Step 8: judging whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, going to step 9; if not, repeating steps 2 to 7 until all single-viewpoint image groups in the current multi-view image group are coded;
Step 9: judging whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeating steps 1 to 8 until the coding of the whole multi-view sequence is finished.
Further, the set of instructions is further for performing:
the step 2 further comprises the following steps:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
The invention has the following beneficial effects: by establishing a ρ-domain code rate model and, on the basis of high-efficiency multi-view video coding, allocating bits among viewpoints according to the image similarity between viewpoints (that is, using inter-view correlation analysis to give each viewpoint a reasonable bit allocation), the negative influence of buffer occupancy fluctuation on frame target bit allocation is effectively avoided; and by performing frame-layer and basic-unit-layer bit allocation and rate control according to the frame rate, the target buffer capacity, the actual buffer fullness and the temporal activity complexity of the frame, the limitation of target bit allocation in the prior art is overcome, so that the rate of multi-view video coding can be controlled effectively, the rate control accuracy reaches more than 99%, and the peak signal-to-noise ratio is improved by more than 0.38 dB on average.
Drawings
Fig. 1 is a flowchart of a three-dimensional video coding rate control method based on a ρ domain according to an embodiment;
FIG. 2 is a flowchart illustrating inter-view bit allocation based on inter-view image similarity according to an embodiment;
fig. 3 is a block diagram of a storage device according to an embodiment.
Description of reference numerals:
300. a storage device.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Referring to fig. 1 and fig. 2, in the present embodiment, the ρ-domain-based three-dimensional video coding rate control method can be applied to a storage device, which includes but is not limited to: personal computers, servers, general-purpose computers, special-purpose computers, network devices, embedded devices, programmable devices, intelligent mobile terminals, and the like. The specific implementation is as follows:
some terms in the present embodiment are explained first as follows:
code rate control: an optimization algorithm of coding is used for realizing the control of the size of a video code stream;
video code rate: the number of data bits transmitted per unit time during data transmission;
frame rate: the frequency with which bitmap images, called frames, appear continuously on the display;
MV-HEVC (Multiview High Efficiency Video Coding): high-efficiency multi-view video coding;
HEVC (High Efficiency Video Coding): high-efficiency video coding;
MV-GOP (Multi-View Group of Pictures): multi-view image group;
SV-GOP (Single-View Group of Pictures): single-view image group.
The specific implementation mode is as follows:
step S101: and establishing a rho domain code rate model. In this embodiment, the "establishing a ρ -domain code rate model" further includes: and calculating to obtain model parameters by adopting a multiple regression technology. The method comprises the following steps of calculating model parameters by adopting a multiple regression technology, and further comprises the following steps:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3    (1)
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)    (3)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
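The parameter estimation described above is an ordinary least-squares fit. The following is a minimal sketch of that multiple-regression step, assuming NumPy is available; the sample values are hypothetical placeholders, not data from the patent.

```python
import numpy as np

def fit_rho_model(rho_samples, rate_samples):
    """Fit R(rho) = theta1*(1-rho)^2 + theta2*(1-rho) by least squares.

    rho_samples: observed fractions of zero coefficients after quantization.
    rate_samples: corresponding texture coding bits.
    Returns the parameter vector N = [theta1, theta2].
    """
    rho = np.asarray(rho_samples, dtype=float)
    r = np.asarray(rate_samples, dtype=float)
    # Sample matrix K with columns x1 = (1-rho)^2 and x2 = (1-rho).
    K = np.column_stack(((1.0 - rho) ** 2, 1.0 - rho))
    # Closed-form multiple-regression solution N = (K^T K)^(-1) K^T R.
    return np.linalg.solve(K.T @ K, K.T @ r)

def predict_rate(theta, rho):
    """Evaluate the fitted R-rho model at a given rho."""
    theta1, theta2 = theta
    return theta1 * (1.0 - rho) ** 2 + theta2 * (1.0 - rho)

if __name__ == "__main__":
    # Hypothetical sample values (rho_i, r_i).
    rho_obs = [0.70, 0.78, 0.85, 0.90, 0.95]
    rate_obs = [52000, 36000, 21000, 12500, 5600]
    theta = fit_rho_model(rho_obs, rate_obs)
    print("theta1, theta2 =", theta)
    print("predicted rate at rho = 0.80:", predict_rate(theta, 0.80))
```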
Step S102: and carrying out inter-view bit allocation according to the image similarity among the views.
Step S103: and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
It should be noted that the difficulty of rate control for high-efficiency multi-view video coding lies in inter-view bit allocation: it is hard to allocate a reasonable number of bits to each view accurately. In step S102 and step S103, reasonable bit allocation to the different views is performed according to the image similarity between views and the coding information. In this embodiment, the weighting factor W_k represents the proportion of viewpoint k among all viewpoints; the larger W_k is, the more bits the viewpoint needs to be allocated, which indicates that the viewpoint is the main view serving as the reference view for the other views, and the quality of its coding is directly related to the result of the whole video coding.
Referring to fig. 2, step S102 and step S103 further include the following steps (a simplified control-flow sketch is given after the list):
Step S201: carry out bit allocation and code rate control on the multi-view image groups, and calculate the number of bits to be allocated to the current multi-view image group;
Step S202: carry out bit allocation and code rate control on the single-viewpoint image groups: calculate the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculate the number of bits to be allocated to the current single-viewpoint image group;
Step S203: carry out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculate the target bit number of the current coding frame;
Step S204: carry out bit allocation and code rate control on the macroblock layer: calculate the ρ value according to the ρ-domain code rate model, and then calculate the quantization parameter Q_mb of the current macroblock from the ρ value;
Step S205: encode the current macroblock according to the quantization parameter Q_mb;
Step S206: judge whether all macroblocks in the current frame have been coded; if so, go to step S207; if not, repeat steps S204 to S205 until all macroblocks are coded, then go to step S207;
Step S207: judge whether all frames in the current single-viewpoint image group have been coded; if so, go to step S208; if not, repeat steps S203 to S206 until all frames of the current image group are coded;
Step S208: judge whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, go to step S209; if not, repeat steps S202 to S207 until all single-viewpoint image groups in the current multi-view image group are coded;
Step S209: judge whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeat steps S201 to S208 until the coding of the whole multi-view sequence is finished.
Wherein the step S202 further includes the steps of:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k    (6)
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
After each MV-GOP image group has been coded, the related parameters need to be updated, and they can be refreshed continuously according to the already-coded information. A_GOP(n_i, 0) denotes the actual number of bits required to encode the i-th MV-GOP image group, A_GOP(n_(k-1), 0) denotes the actual number of bits required to encode the (k-1)-th GOP of the i-th MV-GOP image group, and w_(k-1) is given by formula (9).
In order to better predict the bits required by the current coding view, the already-coded information of the multi-view video coding, namely the previous MV-GOP group, is fully utilized, and the inter-view correlation is used to predict the coding weight of the current view. The linear prediction model of w_k is:
where the two coefficients are univariate regression coefficients whose initial values are set to 1 and 0, respectively, and which are refreshed in the post-coding stage after each MV-GOP image group has been coded.
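Since the similarity formula itself is not reproduced above, the following sketch only illustrates the general idea: a coarse histogram feature vector and a cosine similarity are assumed stand-ins for S(V_j, V_k), the base view is given S_1 = 1, the weights are normalized to sum to one (an assumed normalization), and the linear weight update is applied to the previous weight with the stated initial coefficients 1 and 0.

```python
import numpy as np

def feature_vector(img: np.ndarray) -> np.ndarray:
    # Placeholder feature: a 16-bin luminance histogram (assumption; the patent
    # only states that feature vectors of the two images are used).
    hist, _ = np.histogram(img, bins=16, range=(0, 255), density=True)
    return hist

def similarity(img_a: np.ndarray, img_b: np.ndarray) -> float:
    # Assumed similarity measure: cosine similarity of the feature vectors.
    fa, fb = feature_vector(img_a), feature_vector(img_b)
    return float(fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb) + 1e-12))

def initial_view_weights(view_images) -> np.ndarray:
    # View 0 is the base/main view (S_1 = 1); the other views are weighted by
    # their similarity to it, then normalized to sum to 1 (assumed form).
    sims = np.array([1.0] + [similarity(view_images[0], v) for v in view_images[1:]])
    return sims / sims.sum()

def update_weight(prev_weight: float, a: float = 1.0, b: float = 0.0) -> float:
    # Linear prediction w_k = a * w_(k-1) + b, coefficients initialized to 1 and 0
    # and refreshed after each MV-GOP (applied here to the previous weight, an assumption).
    return a * prev_weight + b

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    views = [rng.integers(0, 256, size=(48, 64)) for _ in range(3)]
    w = initial_view_weights(views)
    print("initial W_k:", w)
    print("updated  W_0:", update_weight(w[0]))
```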
In the present embodiment, all video frames are divided into several different coding-type frames by using the geometric relationship between disparity prediction and motion prediction. However, the disparity prediction characteristics between views differ greatly, so coded images with the same inter-view or temporal prediction relationship may still have different coding characteristics; in that case the target bit number cannot be calculated with the same model parameters, or a certain deviation occurs. Therefore, the following frame-layer target bit allocation algorithm is proposed:
In formula (11) above, T is the total number of bits consumed for encoding the M frames; MAD_a represents the average MAD of all frames; MAD_j represents the MAD of the j-th frame; C_j and C_m are the bits occupied by the header information of the j-th and m-th frames, respectively. It follows from formula (11) that the larger MAD_j and C_j of an image frame are, the more target bits it is allocated.
According to the HEVC frame-layer target bit allocation method, in MV-HEVC the target bits of the (j-1)-th frame are allocated as follows:
In the above formula, C_a represents the average number of bits consumed by the header information of the already-coded frames in the current GOP.
In general, in multi-view video coding, the more intense the motion of a video sequence is, the larger the temporal activity of each frame (that is, the change of the image content and scene), and the more bits are needed for coding; conversely, the smaller the temporal activity of each frame, the fewer bits are needed. To make MV-HEVC rate control more accurate, the rate control method of formula (12) above is further improved, and the target bits of the current frame are calculated by formula (13):
In the above formula, T_j denotes the bits consumed by the header information of the j-th frame, n represents the temporal layer in which the current frame is located, W(l) represents the complexity weight of each frame, and W_B(l) represents the weight of a B frame.
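Formulas (11) to (13) are not reproduced above, so the sketch below only illustrates the stated qualitative behaviour: frames with larger MAD_j and larger header cost C_j receive more target bits. The proportional form used here is an assumption for illustration, not the patent's formula, and the input values are hypothetical.

```python
def frame_target_bits(remaining_bits, mads, headers, j):
    """Illustrative target-bit share for frame j (assumed form, not formulas (11)-(13)).

    remaining_bits: bits still available for the frames not yet coded (T).
    mads:    predicted MAD of each remaining frame (MAD_j).
    headers: header bits of each remaining frame (C_j).
    """
    # More complex frames (larger MAD_j) receive a larger share of the remaining bits ...
    mad_share = remaining_bits * mads[j] / sum(mads)
    # ... and frames with above-average header cost C_j receive a small extra allowance.
    header_adj = headers[j] - sum(headers) / len(headers)
    return max(0.0, mad_share + header_adj)

if __name__ == "__main__":
    mads = [4.2, 9.7, 6.1, 3.3]      # hypothetical per-frame MAD values
    headers = [300, 420, 350, 280]   # hypothetical per-frame header bits C_j
    print(frame_target_bits(120000.0, mads, headers, j=1))
```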
In this embodiment, the base unit layer bit allocation may be as follows:
For compatibility with HEVC, the basic-unit-layer bit allocation algorithm of multi-view video coding is similar to that of HEVC. As known from the HEVC basic-unit-layer rate control algorithm, its basic-unit-layer bit allocation is simple: the bits allocated to each frame are evenly divided among the basic units of that frame, and then all macroblocks in the same basic unit are encoded with the same quantization parameter QP. In practice, however, even macroblocks in the same basic unit may differ greatly in the complexity of image content, texture, temporal activity, and so on. Therefore, for more accurate MV-HEVC rate control, different quantization values are adopted according to the complexity of the image content, texture and temporal activity, calculated by formula (14):
In the above formula, T_total,k and N represent the total number of remaining bits and the number of remaining basic units, respectively; T_head,k represents the bits consumed by the header information of the k-th macroblock, and FD(k) represents the temporal activity level of the k-th encoded macroblock.
Here X and Y are the numbers of pixels of the macroblock in the horizontal and vertical directions, respectively, x and y are the horizontal and vertical coordinates, and I_j(x, y) and I_(j-1)(x, y) are the luminance values at location (x, y) of the current and previous macroblocks, respectively.
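As a rough illustration of a temporal activity measure FD(k) built from these quantities, the sketch below uses the mean absolute luminance difference between co-located macroblocks; the exact formula is not reproduced above, so this particular form is an assumption.

```python
import numpy as np

def macroblock_activity(cur_mb: np.ndarray, prev_mb: np.ndarray) -> float:
    """Temporal activity FD(k) of a macroblock.

    cur_mb, prev_mb: X-by-Y luminance blocks I_j(x, y) and I_(j-1)(x, y).
    The mean absolute luminance difference used here is an assumed form of FD(k).
    """
    X, Y = cur_mb.shape
    return float(np.abs(cur_mb.astype(float) - prev_mb.astype(float)).sum() / (X * Y))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    prev = rng.integers(0, 256, size=(16, 16))                       # previous macroblock
    cur = np.clip(prev + rng.integers(-8, 9, size=(16, 16)), 0, 255) # slightly changed block
    print("FD(k) =", macroblock_activity(cur, prev))
```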
In the present embodiment, test sequences in two different formats are adopted: Vassar, Flamenco2, Exit and Ballroom are in VGA format, while PoznanHall2 and GT Fly are in HD format. The sequence resolutions are 640 × 480 pixels and 1920 × 1088 pixels. The test platform adopts the MV-HEVC system platform provided by JCT-3V.
The above table gives the experimental results of multi-view video coding rate control. As can be seen from the table, the rate control errors of the first three rate control algorithms are 2.95%, 2.51% and 1.98%, respectively, which are relatively large; compared with these three algorithms, the rate control algorithm provided by this embodiment achieves a more accurate rate and a smaller rate deviation, with an average rate error of less than 1%, and can meet the requirements of practical applications.
Referring to fig. 3, in the present embodiment, a storage device 300 is implemented as follows:
a storage device 300 having stored therein a set of instructions for performing:
establishing a rho domain code rate model;
carrying out bit allocation among the viewpoints according to the image similarity among the viewpoints;
and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
Further, the set of instructions is further for performing:
the method for establishing the rho domain code rate model further comprises the following steps:
and calculating to obtain model parameters by adopting a multiple regression technology.
Further, the set of instructions is further for performing:
the method comprises the following steps of calculating model parameters by adopting a multiple regression technology, and further comprises the following steps:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
Further, the set of instructions is further for performing:
the 'inter-view bit allocation is carried out according to the image similarity among the views'; according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame, the bit allocation and the code rate control of the frame layer and the basic unit layer are carried out, and the method also comprises the following steps:
Step 1: carrying out bit allocation and code rate control on the multi-view image groups, and calculating the number of bits to be allocated to the current multi-view image group;
Step 2: carrying out bit allocation and code rate control on the single-viewpoint image groups, calculating the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculating the number of bits to be allocated to the current single-viewpoint image group;
Step 3: carrying out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculating the target bit number of the current coding frame;
Step 4: carrying out bit allocation and code rate control on the macroblock layer, calculating the ρ value according to the ρ-domain code rate model, and then calculating the quantization parameter Q_mb of the current macroblock from the ρ value;
Step 5: encoding the current macroblock according to the quantization parameter Q_mb;
Step 6: judging whether all macroblocks in the current frame have been coded; if so, going to step 7; if not, repeating steps 4 to 5 until all macroblocks are coded, and then going to step 7;
Step 7: judging whether all frames in the current single-viewpoint image group have been coded; if so, going to step 8; if not, repeating steps 3 to 6 until all frames of the current image group are coded;
Step 8: judging whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, going to step 9; if not, repeating steps 2 to 7 until all single-viewpoint image groups in the current multi-view image group are coded;
Step 9: judging whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeating steps 1 to 8 until the coding of the whole multi-view sequence is finished.
Further, the set of instructions is further for performing:
the step 2 further comprises the following steps:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
The instruction set stored on the storage device 300 performs the above steps: by establishing a ρ-domain code rate model and, on the basis of high-efficiency multi-view video coding, allocating bits among viewpoints according to the image similarity between viewpoints (that is, using inter-view correlation analysis to give each viewpoint a reasonable bit allocation), the negative influence of buffer occupancy fluctuation on frame target bit allocation is effectively avoided; and by performing frame-layer and basic-unit-layer bit allocation and rate control according to the frame rate, the target buffer capacity, the actual buffer fullness and the temporal activity complexity of the frame, the limitation of target bit allocation in the prior art is overcome, so that the rate of multi-view video coding can be controlled effectively, the rate control accuracy reaches more than 99%, and the peak signal-to-noise ratio is improved by more than 0.38 dB on average.
It should be noted that, although the above embodiments have been described herein, the invention is not limited thereto. Therefore, based on the innovative concepts of the present invention, the technical solutions of the present invention can be directly or indirectly applied to other related technical fields by making changes and modifications to the embodiments described herein, or by using equivalent structures or equivalent processes performed in the content of the present specification and the attached drawings, which are included in the scope of the present invention.
Claims (10)
1. A three-dimensional video coding rate control method based on rho domain is characterized by comprising the following steps:
establishing a rho domain code rate model;
carrying out bit allocation among the viewpoints according to the image similarity among the viewpoints;
and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
2. The ρ-domain-based three-dimensional video coding rate control method according to claim 1, wherein
the method for establishing the rho domain code rate model further comprises the following steps:
and calculating to obtain model parameters by adopting a multiple regression technology.
3. The ρ-domain-based three-dimensional video coding rate control method according to claim 2, wherein
the method comprises the following steps of calculating model parameters by adopting a multiple regression technology, and further comprises the following steps:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
4. The ρ-domain-based three-dimensional video coding rate control method according to claim 1, wherein
the 'inter-view bit allocation is carried out according to the image similarity among the views'; according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame, the bit allocation and the code rate control of the frame layer and the basic unit layer are carried out, and the method also comprises the following steps:
Step 1: carrying out bit allocation and code rate control on the multi-view image groups, and calculating the number of bits to be allocated to the current multi-view image group;
Step 2: carrying out bit allocation and code rate control on the single-viewpoint image groups, calculating the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculating the number of bits to be allocated to the current single-viewpoint image group;
Step 3: carrying out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculating the target bit number of the current coding frame;
Step 4: carrying out bit allocation and code rate control on the macroblock layer, calculating the ρ value according to the ρ-domain code rate model, and then calculating the quantization parameter Q_mb of the current macroblock from the ρ value;
Step 5: encoding the current macroblock according to the quantization parameter Q_mb;
Step 6: judging whether all macroblocks in the current frame have been coded; if so, going to step 7; if not, repeating steps 4 to 5 until all macroblocks are coded, and then going to step 7;
Step 7: judging whether all frames in the current single-viewpoint image group have been coded; if so, going to step 8; if not, repeating steps 3 to 6 until all frames of the current image group are coded;
Step 8: judging whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, going to step 9; if not, repeating steps 2 to 7 until all single-viewpoint image groups in the current multi-view image group are coded;
Step 9: judging whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeating steps 1 to 8 until the coding of the whole multi-view sequence is finished.
5. The ρ-domain-based three-dimensional video coding rate control method according to claim 4, wherein
the step 2 further comprises the following steps:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
6. A storage device having a set of instructions stored therein, the set of instructions being operable to perform:
establishing a rho domain code rate model;
carrying out bit allocation among the viewpoints according to the image similarity among the viewpoints;
and performing bit allocation and code rate control on a frame layer and a basic unit layer according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame.
7. The storage device of claim 6, wherein the set of instructions is further configured to perform:
the method for establishing the rho domain code rate model further comprises the following steps:
and calculating to obtain model parameters by adopting a multiple regression technology.
8. The storage device of claim 7, wherein the set of instructions is further configured to perform:
the method comprises the following steps of calculating model parameters by adopting a multiple regression technology, and further comprises the following steps:
Let ρ have the following quadratic relation with the texture-part coding bit rate R(ρ):
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ) + θ_3
where ρ represents the percentage of zero coefficients among all coefficients after quantization of the transform coefficients, and θ_1, θ_2, θ_3 are univariate regression coefficients;
the following R-ρ model is then obtained:
R(ρ) = θ_1·(1-ρ)^2 + θ_2·(1-ρ)
where θ_1 and θ_2 are given by the following statistical analysis method: let x_1(ρ) = (1-ρ)^2 and x_2(ρ) = 1-ρ, let (k_11, k_21, r_1), (k_12, k_22, r_2), …, (k_1n, k_2n, r_n) be n existing sample values, let K be the n×2 sample matrix whose i-th row is (k_1i, k_2i), and let R be the sample vector (r_1, …, r_n)^T;
using the multiple regression technique, the model parameter vector N = (θ_1, θ_2)^T is calculated as N = (K^T·K)^(-1)·K^T·R,
where K^T is the transposed matrix of K and (K^T·K)^(-1) is the inverse matrix of K^T·K.
9. The storage device of claim 6, wherein the set of instructions is further configured to perform:
the 'inter-view bit allocation is carried out according to the image similarity among the views'; according to the frame rate, the capacity of the target buffer area, the size of the actual buffer area and the complexity of the active time domain of the frame, the bit allocation and the code rate control of the frame layer and the basic unit layer are carried out, and the method also comprises the following steps:
Step 1: carrying out bit allocation and code rate control on the multi-view image groups, and calculating the number of bits to be allocated to the current multi-view image group;
Step 2: carrying out bit allocation and code rate control on the single-viewpoint image groups, calculating the weight factor W_k of each single-viewpoint image group, and, according to the weight factor W_k, calculating the number of bits to be allocated to the current single-viewpoint image group;
Step 3: carrying out bit allocation and code rate control on the frame layer within a single-viewpoint image group, and calculating the target bit number of the current coding frame;
Step 4: carrying out bit allocation and code rate control on the macroblock layer, calculating the ρ value according to the ρ-domain code rate model, and then calculating the quantization parameter Q_mb of the current macroblock from the ρ value;
Step 5: encoding the current macroblock according to the quantization parameter Q_mb;
Step 6: judging whether all macroblocks in the current frame have been coded; if so, going to step 7; if not, repeating steps 4 to 5 until all macroblocks are coded, and then going to step 7;
Step 7: judging whether all frames in the current single-viewpoint image group have been coded; if so, going to step 8; if not, repeating steps 3 to 6 until all frames of the current image group are coded;
Step 8: judging whether all single-viewpoint image groups in the current multi-view image group have been coded; if so, going to step 9; if not, repeating steps 2 to 7 until all single-viewpoint image groups in the current multi-view image group are coded;
Step 9: judging whether the current multi-view image group is the last multi-view image group of the whole multi-view sequence; if so, the coding is finished; otherwise, repeating steps 1 to 8 until the coding of the whole multi-view sequence is finished.
10. The storage device of claim 9, wherein the set of instructions is further configured to perform:
the step 2 further comprises the following steps:
Let the total number of bits allocated to the current multi-view image group be T_MVGOP and the weight factor of each single-viewpoint image group be W_k; then the target bits allocated to the k-th view are:
T_SVGOP,k = T_MVGOP · W_k
where the initial value of W_k (k = 1, 2, …, N_view) is obtained by calculating the similarity between the viewpoints, S_1 corresponds to the base view or main view serving as the reference view for the other views, N_view denotes the number of coded views, and S(V_j, V_k) denotes the similarity between viewpoints V_j and V_k, computed from the feature vectors of the two images.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811491526.9A CN109561311A (en) | 2018-12-07 | 2018-12-07 | A kind of 3 d video encoding bit rate control method and storage equipment based on the domain ρ |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811491526.9A CN109561311A (en) | 2018-12-07 | 2018-12-07 | A kind of 3 d video encoding bit rate control method and storage equipment based on the domain ρ |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109561311A true CN109561311A (en) | 2019-04-02 |
Family
ID=65869158
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811491526.9A Pending CN109561311A (en) | 2018-12-07 | 2018-12-07 | A kind of 3 d video encoding bit rate control method and storage equipment based on the domain ρ |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109561311A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110139089A (en) * | 2019-05-09 | 2019-08-16 | 莆田学院 | A kind of the 3 d video encoding bit rate control method and storage equipment of combination scene detection |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101242532A (en) * | 2007-12-12 | 2008-08-13 | 浙江万里学院 | A code rate control method oriented to multi-view point video |
CN101287123A (en) * | 2008-05-23 | 2008-10-15 | 清华大学 | Code rate controlling method for video coding based on Rho domain |
WO2010005691A1 (en) * | 2008-06-16 | 2010-01-14 | Dolby Laboratories Licensing Corporation | Rate control model adaptation based on slice dependencies for video coding |
CN107220881A (en) * | 2017-05-27 | 2017-09-29 | 莆田学院 | A kind of method and apparatus of the electric business temperature ranking based on time and space |
-
2018
- 2018-12-07 CN CN201811491526.9A patent/CN109561311A/en active Pending
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101242532A (en) * | 2007-12-12 | 2008-08-13 | 浙江万里学院 | A code rate control method oriented to multi-view point video |
CN101287123A (en) * | 2008-05-23 | 2008-10-15 | 清华大学 | Code rate controlling method for video coding based on Rho domain |
WO2010005691A1 (en) * | 2008-06-16 | 2010-01-14 | Dolby Laboratories Licensing Corporation | Rate control model adaptation based on slice dependencies for video coding |
CN107220881A (en) * | 2017-05-27 | 2017-09-29 | 莆田学院 | A kind of method and apparatus of the electric business temperature ranking based on time and space |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110139089A (en) * | 2019-05-09 | 2019-08-16 | 莆田学院 | A kind of the 3 d video encoding bit rate control method and storage equipment of combination scene detection |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP5234586B2 (en) | Video encoding method and decoding method, apparatus thereof, program thereof, and storage medium storing program | |
CN108989815B9 (en) | Method of generating merge candidate list for multi-view video signal and decoding apparatus | |
US8228994B2 (en) | Multi-view video coding based on temporal and view decomposition | |
KR101354387B1 (en) | Depth map generation techniques for conversion of 2d video data to 3d video data | |
TWI448145B (en) | Image processing apparatus and method, and program | |
JP6446488B2 (en) | Video data decoding method and video data decoding apparatus | |
Shao et al. | Joint bit allocation and rate control for coding multi-view video plus depth based 3D video | |
KR20140089486A (en) | Motion compensation method and motion compensation apparatus for encoding and decoding of scalable video | |
Yuan et al. | Rate distortion optimized inter-view frame level bit allocation method for MV-HEVC | |
CN108200431B (en) | Bit allocation method for video coding code rate control frame layer | |
CN101674472A (en) | Multistage code rate control method of video code with a plurality of visual points | |
CN102685532A (en) | Coding method for free view point four-dimensional space video coding system | |
CN103428499A (en) | Coding unit partition method and multi-view video coding method using coding unit partition method | |
Li et al. | A bit allocation method based on inter-view dependency and spatio-temporal correlation for multi-view texture video coding | |
CN107864380A (en) | 3D HEVC fast intra-mode prediction decision-making techniques based on DCT | |
Xiao et al. | Scalable bit allocation between texture and depth views for 3-D video streaming over heterogeneous networks | |
CN104038769B (en) | Rate control method for intra-frame coding | |
Lei et al. | Region adaptive R-$\lambda $ model-based rate control for depth maps coding | |
CN104159095A (en) | Code rate control method for multi-view texture video and depth map coding | |
CN102387368A (en) | Fast selection method of inter-view prediction for multi-view video coding (MVC) | |
CN109561311A (en) | A kind of 3 d video encoding bit rate control method and storage equipment based on the domain ρ | |
Shao et al. | A novel rate control technique for asymmetric-quality stereoscopic video | |
Yea et al. | View synthesis prediction for rate-overhead reduction in ftv | |
Afonso et al. | Hardware-friendly unidirectional disparity-search algorithm for 3D-HEVC | |
Stefanoski et al. | Image quality vs rate optimized coding of warps for view synthesis in 3D video applications |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20190402 |
|
WD01 | Invention patent application deemed withdrawn after publication |