CN103124347A - Method for guiding multi-view video coding quantization process by visual perception characteristics - Google Patents
- Publication number
- CN103124347A (application CN201210402003A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Abstract
The invention relates to a method of using visual perception characteristics to guide the coding quantization process. The method comprises the following operating steps: (1) reading the luminance values of each frame of an input video sequence and establishing a frequency-domain just-noticeable-distortion threshold model; (2) performing intra-view and inter-view prediction for each frame of the input video sequence; (3) applying a discrete cosine transform to the residual data; (4) dynamically adjusting the quantization step of each macroblock in the current frame; (5) dynamically adjusting the Lagrange parameter in the rate-distortion optimization process; and (6) entropy-coding the quantized data to form a code stream and transmitting the code stream over a network. The method improves video compression efficiency while leaving subjective quality essentially unchanged, making the video well suited to network transmission.
Description
Technical field
The present invention relates to the field of multi-view video encoding and decoding, and in particular to a method of using visual perception characteristics to guide the quantization process in multi-view video coding. It is applicable to the encoding and decoding of high-definition 3D video signals.
Background technology
As audiovisual expectations have risen, viewers are no longer satisfied with conventional single-view two-dimensional video. The growing demand for a stereoscopic experience, from a fixed viewing angle to arbitrary angles, has driven the development of multi-view coding technology. Multi-view video, however, greatly increases the amount of data, so effectively improving video compression efficiency has become a research hotspot. Current video compression technology concentrates on removing three kinds of redundancy: spatial, temporal, and statistical. Although the video experts have released a new generation of video compression technology (HEVC), expected to double compression efficiency again relative to H.264, the perceptual redundancy that arises from the characteristics of the human visual system (HVS) itself is still not removed. As research into human visual characteristics has deepened, video researchers have proposed the just-noticeable-distortion (JND) model to remove this redundancy: the JND threshold measures the size of the perceptual redundancy, and changes below this threshold are not perceived by the human eye.
Current research on JND falls into two broad classes: pixel-domain JND models and frequency-domain JND models. The JND model proposed in document [1] is the classical pixel-domain model; it studies luminance masking, texture masking, and temporal masking respectively. The frequency-domain JND model proposed in document [2] studies, in addition to these three characteristics, the sensitivity of the human eye to different frequency bands, which makes the frequency-domain model conform more closely to the visual characteristics of the eye.
The JND model proposed in document [2] is at present the most complete DCT-domain JND model. Besides pixel luminance masking and texture masking, it adds the effect of the spatial contrast sensitivity function. The spatial sensitivity function reflects the band-pass characteristic of the human eye: by removing frequency components the eye cannot perceive, it removes perceptual frequency redundancy. Its temporal masking term models the smooth-pursuit eye movement effect, covering not only the magnitude of motion but also its direction. Researchers have combined this model with multi-view video and applied it to the DCT (discrete cosine transform) of the residual, greatly improving compression efficiency. It has not, however, been applied to other parts of the coding process such as quantization, so its removal of visual redundancy is not thorough.
Document [3] proposes using a JND model to guide the quantization process. However, the JND model it establishes is a pixel-domain one; it lacks the step of removing the eye's frequency redundancy, so its guidance of quantization is not accurate enough. Secondly, to preserve subjective quality, the quantization value needs to be raised only where the eye is insensitive, while quantization values in other areas remain unchanged; and when the quantization parameter is adjusted, the Lagrange parameter must be adjusted correspondingly.
The present patent application is the first to propose applying a DCT-domain JND model to the quantization process in multi-view video coding, further improving video compression efficiency while keeping subjective quality unchanged.
Document [1]: X. Yang, W. Lin, and Z. Lu, "Motion-compensated residue preprocessing in video coding based on just-noticeable-distortion profile," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 6, pp. 742-752, 2005.
Document [2]: Zhenyu Wei and King N. Ngan, "Spatio-temporal just noticeable distortion profile for grey scale image/video in DCT domain," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, March 2009.
Document [3]: Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC video coding based on foveated just-noticeable-distortion model," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 6, pp. 806-819, Jun. 2010.
Summary of the invention
The object of the invention is to overcome the defects of the prior art by providing a method of using visual perception characteristics to guide the quantization process in multi-view video coding. While keeping subjective video quality unchanged, the method uses a frequency-domain JND model to guide multi-view quantization, raising the quantization step in regions where the human eye is insensitive and thereby improving video compression efficiency. When the step size is adjusted, the Lagrange parameter of the rate-distortion optimization function is adjusted dynamically as well, further improving coding efficiency.
To achieve the above object, the present invention adopts the following technical scheme:
A method of using visual perception characteristics to guide the quantization process in multi-view video coding, characterized by the following operating steps:
(1) read the luminance values of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction for each frame of the input video sequence;
(3) apply the discrete cosine transform (DCT) to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data and transmit the generated code stream over the network.
Compared with the prior art, the method of the invention for guiding the multi-view video coding quantization process by visual perception characteristics has the following evident substantive distinguishing features and significant technological progress:
1) while keeping reconstructed video quality unchanged, this multi-view coding method reduces the bit rate through the quantization subroutine alone; in tests the maximum bit-rate saving reaches 12.35%;
2) reconstructed quality is assessed by the average difference of subjective scores: the closer this difference is to 0, the closer the subjective quality of the two methods; the average score difference of this method is 0.03, so the subjective quality of the invention is comparable to that of the multi-view reference codec JMVC;
3) this multi-view coding method adds no especially complex coding step, improving video compression efficiency at low complexity.
Description of drawings
Fig. 1 is the block diagram of the method of the invention for guiding the multi-view video coding quantization process by visual perception characteristics.
Fig. 2 is the block diagram of the frequency-domain just-noticeable-distortion model.
Fig. 3 is the block diagram of intra-view/inter-view prediction.
Fig. 4 is the block diagram of the DCT transform.
Fig. 5 is the block diagram of the dynamic adjustment of the quantization step.
Fig. 6 is the block diagram of the dynamic adjustment of the Lagrange parameter in the rate-distortion cost function.
Fig. 7 is the block diagram of entropy coding and output.
Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the original JMVC method.
Fig. 8b is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the method of the invention.
Fig. 9 compares bit rate, PSNR and the difference of subjective quality scores of the reconstructed video (DMOS) for the sequence ballroom coded with the original JMVC method and the method of the invention at different QPs and viewpoints.
Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the original JMVC method.
Fig. 10b is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the method of the invention.
Fig. 11 compares bit rate, PSNR and DMOS for the sequence race1 coded with the original JMVC method and the method of the invention at different QPs and viewpoints.
Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the original JMVC method.
Fig. 12b is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the method of the invention.
Fig. 13 compares bit rate, PSNR and the average difference of subjective scores (DMOS) for the sequence Crowd coded with the original JMVC method and the method of the invention at different QPs and viewpoints.
Embodiment
The preferred embodiments of the present invention are described in further detail below with reference to the accompanying drawings:
Embodiment one:
This embodiment uses visual perception characteristics to guide the quantization process in multi-view video coding and, referring to Fig. 1, comprises the following steps:
(1) read the luminance values of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction for each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data and transmit the generated code stream over the network.
Embodiment two: this embodiment is basically identical to embodiment one; its special features are as follows:
Establishing the frequency-domain JND model in the above step (1) involves four factor models, referring to Fig. 2:
(1-1) The spatial contrast sensitivity function model follows the band-pass characteristic of the human eye. For a given spatial frequency, the basic JND threshold is a function of the coordinate position within the discrete cosine transform block, the dimension of the DCT block, and the horizontal and vertical viewing angles, which are generally taken to be equal. Because visual sensitivity is directional, greater for the horizontal and vertical orientations than for oblique ones, a directional modulation factor is added, parameterized by the angle of the frequency vector represented by the DCT coefficient and by the DCT normalization factor. Finally a control parameter s is added to form the final spatial-sensitivity modulation factor. Because multi-view coding uses both 8x8 and 4x4 DCTs, the parameters differ with transform size. In the experiments, for the 8x8 DCT coded format, r is 0.6, a is 1.33, b is 0.11 and c is 0.18; for the 4x4 DCT coded format, r is 0.6, a is 0.8, b is 0.035 and c is 0.008.
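The spatial-sensitivity threshold described above can be sketched as follows. This is a minimal illustration that assumes the functional form of the DCT-domain model in document [2] (an exponential band-pass term divided by a directional factor and the DCT normalization factors); the parameters r, a, b and c are the 8x8 values given above, while the control parameter `s` and the per-pixel visual angle `theta` are hypothetical placeholder values, not quoted from the patent.

```python
import math

def dct_norm(m, N):
    # DCT normalization factor: sqrt(1/N) for index 0, sqrt(2/N) otherwise.
    return math.sqrt(1.0 / N) if m == 0 else math.sqrt(2.0 / N)

def csf_threshold(i, j, N=8, theta=0.0235,
                  r=0.6, a=1.33, b=0.11, c=0.18, s=0.25):
    """Base JND threshold for DCT coefficient (i, j) of an NxN block.

    theta (visual angle per pixel, degrees) and s (control parameter)
    are illustrative values, not taken from the patent text.
    """
    wi = i / (2.0 * N * theta)          # vertical spatial frequency (cyc/deg)
    wj = j / (2.0 * N * theta)          # horizontal spatial frequency (cyc/deg)
    w = math.hypot(wi, wj)              # radial spatial frequency
    if w == 0.0:                        # DC coefficient: no direction
        sin2 = 0.0
        w = 1e-6
    else:
        sin2 = (2.0 * wi * wj / (w * w)) ** 2
    direction = r + (1.0 - r) * (1.0 - sin2)   # oblique-effect modulation
    bandpass = math.exp(c * w) / (a + b * w)   # band-pass sensitivity term
    return s * bandpass / (dct_norm(i, N) * dct_norm(j, N) * direction)
```

Under these assumed parameters the threshold for a high-frequency coefficient such as (7, 7) is larger than for a mid-frequency one such as (1, 1), matching the band-pass behaviour described above.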
(1-2) The luminance masking model: experiments show that human visual sensitivity is greater in regions of intermediate grey level than in dark or bright background regions. A luminance masking curve is fitted accordingly, expressed as a function of the average pixel value of the current coding block.
(1-3) The texture masking model divides the image, according to its texture, into three kinds of region: edge, smooth, and texture, to which the human eye is successively less sensitive. The regions are usually separated with the Canny operator: the edge pixel density obtained from the Canny edge map is used to classify each image block as smooth, texture, or edge. For texture regions the eye is insensitive to distortion of the low-frequency part, while the high-frequency part should be partly preserved; this yields the contrast-masking estimation factor. Combining it with the overlapping effects of the spatial contrast sensitivity function and the luminance effect gives the final masking factor, which depends on the frame index of the input video sequence, the DCT coefficients, the threshold of the spatial contrast sensitivity function, and the luminance masking modulation factor.
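A minimal sketch of the edge-density classification just described. The thresholds 0.1/0.2 and the low/high-frequency weights 2.25/1.25 are assumed following document [2] rather than quoted from the patent, and `edge_map` stands in for the binary output of a Canny detector.

```python
def edge_density(edge_map):
    # edge_map: 2-D list of 0/1 values from an edge detector (e.g. Canny).
    total = sum(sum(row) for row in edge_map)
    return total / float(len(edge_map) * len(edge_map[0]))

def classify_block(rho, alpha=0.1, beta=0.2):
    # Thresholds alpha/beta are assumed values: low density means a smooth
    # block, intermediate density an edge block, high density a texture block.
    if rho <= alpha:
        return "smooth"
    if rho <= beta:
        return "edge"
    return "texture"

def contrast_masking_weight(block_type, i, j):
    # Texture blocks tolerate more low-frequency distortion; high-frequency
    # coefficients are partly preserved (weights assumed from document [2]).
    if block_type == "texture":
        return 2.25 if (i * i + j * j) <= 16 else 1.25
    return 1.0
```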
(1-4) The temporal contrast sensitivity function model: the modulation factor of the temporal masking effect, determined experimentally, is a function of the temporal frequency and the spatial frequency. The temporal frequency is generally computed from the horizontal and vertical components of the spatial frequency and the velocity of object motion on the retina. The horizontal and vertical components of the spatial frequency are determined by the pixel's horizontal and vertical visual angles, the DCT dimension, and the coordinate position within the DCT block. The retinal velocity depends on the object's velocity in the image plane and on the compensating eye movement: the smooth-pursuit gain is taken as 0.98 in the experiments, the minimum eye velocity caused by drift motion has an empirical value of 0.15 deg/s, and the maximum eye velocity corresponding to saccades is usually taken as 80 deg/s. The object velocity in the image plane is obtained from the frame rate of the video sequence, the motion vector of each block, and the visual angle per pixel.
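The eye-movement compensation above can be sketched as follows. The gains and bounds (0.98, 0.15 deg/s, 80 deg/s) are the values stated in the text; the linear temporal-frequency combination and the piecewise modulation factor are assumed to follow document [2] and are illustrative, not quoted from the patent.

```python
def retinal_velocity(v_image, g=0.98, v_min=0.15, v_max=80.0):
    # Smooth pursuit tracks the object with gain g, bounded below by
    # drift motion (v_min) and above by saccades (v_max); the residual
    # is the velocity of the object on the retina.
    v_eye = min(g * v_image + v_min, v_max)
    return abs(v_image - v_eye)

def temporal_frequency(fsx, fsy, vx, vy):
    # f_t combines the horizontal/vertical spatial-frequency components
    # with the corresponding retinal velocity components.
    return fsx * vx + fsy * vy

def temporal_factor(f_s, f_t):
    # Piecewise modulation assumed from document [2]: thresholds are
    # unaffected at low spatio-temporal frequency and rise (reduced
    # sensitivity) as the temporal frequency grows.
    if f_s < 5.0:
        return 1.0 if f_t < 10.0 else 1.07 ** (f_t - 10.0)
    return 1.07 ** f_t
```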
(1-5) The weighted product of the four factors, namely the threshold of the spatial contrast sensitivity function, the luminance masking modulation factor, the contrast masking modulation factor, and the temporal masking modulation factor, constitutes the just-noticeable-distortion threshold of the current coded frame.
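The combination in (1-5) is a per-coefficient product; a minimal sketch, with illustrative array names:

```python
def jnd_map(base, lum, contrast, temporal):
    # Per-coefficient JND threshold: product of the base CSF threshold
    # with the luminance factor (a scalar per block) and the contrast
    # and temporal masking factors (NxN arrays).
    N = len(base)
    return [[base[i][j] * lum * contrast[i][j] * temporal[i][j]
             for j in range(N)] for i in range(N)]
```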
The above step (2) performs inter-view/intra-view prediction on the input video sequence; referring to Fig. 3, the concrete steps are as follows:
(2-1) Intra-view inter/intra prediction: inter prediction within a viewpoint removes the temporal redundancy of the current frame, and intra prediction within a viewpoint removes its spatial redundancy. Among the intra and inter candidates, the prediction mode that minimizes the rate-distortion optimization function is selected. The rate-distortion function is J = D + λ·R, where D is the distortion, R is the number of bits needed to encode under the given coding mode, and λ is the adjusted Lagrange parameter.
(2-2) Inter-view prediction: because the method codes multiple viewpoints, the current frame is predicted from the corresponding frame of another viewpoint, which removes the redundant information between viewpoints.
(2-3) The coding costs of intra-view and inter-view prediction are compared, and the prediction mode with the minimum rate-distortion cost is selected as the optimum mode. Fully accounting for the redundancy both within and between viewpoints and selecting the appropriate prediction mode further improves video compression efficiency.
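The mode decision above reduces to picking the minimum Lagrangian cost J = D + λ·R over the candidate modes; a minimal sketch, with illustrative mode names:

```python
def rd_cost(distortion, bits, lam):
    # J = D + lambda * R, the rate-distortion cost from step (2-1).
    return distortion + lam * bits

def best_mode(candidates, lam):
    # candidates: (name, distortion, bits) triples for intra, inter and
    # inter-view prediction; choose the mode with minimal cost J.
    return min(candidates, key=lambda m: rd_cost(m[1], m[2], lam))[0]
```

Note that the winner depends on λ: a larger multiplier penalizes rate more heavily, which is exactly why step (5) re-weights λ after the quantization step changes.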
The above step (3) applies the discrete cosine transform to the residual data; referring to Fig. 4, the concrete steps are as follows:
(3-1) Decision of the coded block size: the multi-view coding method supports seven block sizes (16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4); the first four are grouped with the 8x8 transform block and the last three with the 4x4 transform block.
(3-2) The corresponding DCT: the 8x8 DCT is applied to 8x8 transform blocks and the 4x4 DCT to 4x4 transform blocks.
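The transform of step (3) can be sketched as a separable matrix DCT. This shows the ideal floating-point transform for illustration; JMVC/H.264 encoders actually use scaled integer approximations.

```python
import math

def dct_matrix(N):
    # Orthonormal DCT-II basis matrix C, so the 2-D transform of a block
    # x is X = C x C^T.
    return [[(math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N))
             * math.cos((2 * n + 1) * k * math.pi / (2 * N))
             for n in range(N)] for k in range(N)]

def dct2(block):
    # Forward 2-D DCT of an NxN residual block.
    N = len(block)
    C = dct_matrix(N)
    tmp = [[sum(C[k][n] * block[n][j] for n in range(N)) for j in range(N)]
           for k in range(N)]
    return [[sum(tmp[i][n] * C[l][n] for n in range(N)) for l in range(N)]
            for i in range(N)]
```

For a constant 4x4 block all the energy lands in the DC coefficient, which is what makes residual blocks with little detail cheap to code after quantization.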
The above step (4) dynamically adjusts the quantization step of each macroblock of the current frame; referring to Fig. 5, the concrete steps are as follows:
(4-1) The established JND model yields the average JND value of the current frame: the average just-noticeable-distortion threshold is computed over all pixel coordinates of the frame, whose height and width are those of the image frame.
(4-2) The JND average of the current macroblock: the average JND threshold of the M-th macroblock is computed in the same way over the macroblock.
(4-3) Dynamic adjustment of the quantization step of the current macroblock: the just-noticeable-distortion threshold reflects the differing sensitivity of the human eye to the various parts of an image, so the quantization step of each macroblock can be adjusted dynamically according to its threshold. Where the eye is insensitive, the quantization step is enlarged appropriately; otherwise the quantization value is left unchanged. The proposed adjusted quantization parameter is the original step size of the coding framework multiplied by a regulatory factor.
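The step-size adjustment above can be sketched as follows. The patent's exact regulatory-factor formula is not reproduced here; this sketch assumes a simple clamped ratio of the macroblock JND average to the frame JND average, enlarging the step only where the eye is insensitive, as described in (4-3).

```python
def mean_jnd(jnd_values):
    # Average JND threshold over a frame or a macroblock (steps 4-1/4-2).
    return sum(jnd_values) / float(len(jnd_values))

def adjust_qstep(qstep, jnd_mb, jnd_frame, max_scale=1.5):
    # Macroblocks whose JND average exceeds the frame average tolerate
    # more distortion, so their step is enlarged (up to an assumed cap);
    # sensitive macroblocks keep the original step unchanged.
    factor = jnd_mb / jnd_frame
    if factor <= 1.0:
        return qstep
    return qstep * min(factor, max_scale)
```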
The above step (5) dynamically adjusts the Lagrange parameter of the rate-distortion optimization process; referring to Fig. 6, the concrete operating steps are as follows:
(5-1) Compute and compare the JND average of the current frame and the JND average of the current coding macroblock, providing the basis for the weighting of the Lagrange parameter in the next step.
(5-2) Adjust the Lagrange parameter: since the quantization parameter has been regulated in the previous step, the distortion value and bit rate in the Lagrangian rate-distortion optimization change, and the original Lagrange parameter no longer guarantees an optimal solution. Weighting the Lagrange parameter correspondingly makes the cost function optimal again; the adjusted parameter is expressed in terms of the quantization parameter generated by the multi-view coding method and the adjusted quantization parameter of the macroblock.
(5-3) The adjusted Lagrange parameter is substituted into the rate-distortion cost function J = D + λ'·R, where D is the distortion, R is the number of bits under the given coding mode, and λ' is the adjusted Lagrange parameter. Thus when the quantization parameter changes, the Lagrange parameter changes correspondingly and the rate-distortion optimization function still attains an optimal solution.
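The re-weighting above can be sketched under a common assumption. The patent's exact weighting formula is not reproduced; this illustrative stand-in uses the H.264-style relation that the Lagrange multiplier grows roughly as the square of the quantization step, so scaling the step by a factor f scales λ by f².

```python
def adjust_lambda(lam, qstep_orig, qstep_adj):
    # Assumed lambda ~ Qstep^2 relation: after the per-macroblock step
    # change of step (4), rescale the multiplier so that the cost
    # J = D + lambda * R stays balanced between distortion and rate.
    f = qstep_adj / qstep_orig
    return lam * f * f
```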
The above step (6) entropy-codes the quantized data and transmits the generated code stream over the network; referring to Fig. 7, the concrete steps are as follows:
(6-1) The quantized data are entropy-coded, so that they are represented by a binary code stream in the most efficient way, removing the statistical redundancy of the quantized data.
(6-2) The code stream formed by entropy coding is transmitted over the network to realize video transmission. Because the coding method processed by visual perception characteristics occupies little bandwidth, it adapts well to network transmission.
A large number of simulation experiments were carried out to assess the performance of the proposed multi-view coding method based on visual characteristics. On a PC configured with an Intel Pentium 4 CPU at 3.00 GHz, 512 MB of memory, an Intel 8254G Express chipset and the Windows XP operating system, the first 48 frames of the multi-view sequences ballroom, race1 and crowd were encoded and decoded, with the basic QP set to 20, 24, 28 and 32. The experimental platform was the multi-view reference software JMVC, the coding prediction structure was HHI-IBBBP, and inter-view prediction used the bi-directional mode.
The experimental results for the sequence ballroom are shown in Figs. 8a-8b and Fig. 9. Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 coded with the original JMVC method at quantization parameter QP=24 (PSNR = 40.31 dB). Fig. 8b is the corresponding reconstructed image coded with the method of the invention (PSNR = 40.10 dB). Fig. 9 reports, for both methods at different QPs and viewpoints, the bit rate, PSNR, bit-rate saving percentage, difference of subjective quality scores of the reconstructed video (DMOS), and average bit-rate saving. Under different QPs, the method of the invention saves 7.47%-9.16% of the bit rate relative to the original JMVC method, while the difference of subjective quality scores between the two methods is 0.03-0.07, so subjective quality can be considered unchanged.
The experimental results for the sequence race1 are shown in Figs. 10a-10b and Fig. 11. Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 coded with the original JMVC method at QP=24 (PSNR = 41.15 dB). Fig. 10b is the corresponding reconstructed image coded with the method of the invention (PSNR = 40.51 dB). Fig. 11 reports, for both methods at different QPs and viewpoints, the bit rate, PSNR, bit-rate saving percentage, DMOS, and average bit-rate saving. Under different QPs, the method of the invention saves 10.77%-12.35% of the bit rate relative to the original JMVC method, while the difference of subjective quality scores is 0.06-0.09, so subjective quality can be considered unchanged.
The experimental results for the sequence crowd are shown in Figs. 12a-12b and Fig. 13. Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 coded with the original JMVC method at QP=35 (PSNR = 33.77 dB). Fig. 12b is the corresponding reconstructed image coded with the method of the invention (PSNR = 33.12 dB). Fig. 13 reports, for both methods at different QPs and viewpoints, the bit rate, PSNR, bit-rate saving percentage, DMOS, and average bit-rate saving. Under different QPs, the method of the invention saves 8.95%-9.83% of the bit rate relative to the original JMVC method, while the difference of subjective quality scores is 0.03-0.08, so subjective quality can be considered unchanged.
Taken together, the charts above show that by establishing a DCT-domain JND model and applying it to the quantization and rate-distortion optimization processes of the multi-view coding framework, the invention substantially reduces the multi-view coding bit rate and improves the compression efficiency of multi-view video coding while keeping subjective quality unchanged.
Claims (7)
1. A method of using visual perception characteristics to guide the quantization process in multi-view video coding, characterized by the following operating steps:
(1) read the luminance values of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction for each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data and transmit the generated code stream over the network.
2. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that described step (1) reads the brightness value size of each frame of input video sequence, the operating procedure that just can distinguish the distortion threshold model of setting up frequency domain is as follows:
1. obtain the spatial sensitivity factors of the 4x4 and 8x8 DCT according to the dimension of the transform, where s is the control parameter, the directional term depends on the angle of the frequency vector represented by the DCT coefficient and on the DCT normalization factor, and the band-pass term depends on the spatial frequency; the parameters r, a, b and c differ according to the size of the DCT: for the 8x8 DCT coded format, r is 0.6, a is 1.33, b is 0.11 and c is 0.18; for the 4x4 DCT coded format, r is 0.6, a is 0.8, b is 0.035 and c is 0.008;
2. determine experimentally the luminance masking of the human eye under different background illumination conditions; the luminance masking curve is expressed as a function of the average pixel value of the current coding block;
3. detect the texture features of the current coding block with an edge detector to obtain the texture masking factor, expressed in terms of the transverse and longitudinal coordinates of the transform block, the contrast-masking estimation factor, the spatial sensitivity factor, and the DCT coefficients of the n-th coding block of the current frame;
4. determine experimentally the temporal masking factor according to the velocity of object motion in each frame of the video sequence;
5. the weighted product of the four factors obtained in steps 1-4 constitutes the just-noticeable-distortion threshold of the current coded frame.
3. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that each frame of described step (2) input video sequence through in viewpoint and the operating procedure of the prediction between viewpoint as follows:
1. carry out interframe and infra-frame prediction in viewpoint, predicted value and the current frame that will encode are compared, choose the less a kind of coded system of coding cost;
2. carry out the prediction between viewpoint, the current encoded frame of current view point predicts according to the corresponding frame of reference view, and the corresponding frame of predicted value and reference view is compared, and tries to achieve the coding cost of interview prediction;
3. compare between viewpoint and the coding cost in viewpoint, select the sort of predictive mode than the lower Item cost.
4. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure that described step (3) carries out discrete cosine transform to residual error data is as follows:
1. the judgement of coded block size less than 8, classifies as the 4x4 transform block when arbitrary length of side of encoding block, otherwise, be the 8x8 transform block;
2. when being the 4x4 transform block, select the 4x4 dct transform, when being the 8x8 transform block, select the 8x8DCT conversion.
5. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the quantization step of each macro block in described step (4) dynamic adjustments present frame is as follows:
That 1. calculates present frame just can distinguish the mean value of distortion threshold;
That 2. calculates current coding macro block just can distinguish distortion threshold mean value;
3. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold, the quantization step of dynamic adjustments current macro, its expression formula of the quantization step after adjusting is as follows:
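The adjusted-quantization-step formula itself is not reproduced on this page; the sketch below only illustrates the stated idea, scaling the macroblock step by the ratio of its mean JND threshold to the frame mean (the linear scaling is an assumption):

```python
def adjust_qstep(q_base, jnd_mb_mean, jnd_frame_mean):
    """Step 3 above, under an assumed linear scaling: a macroblock
    whose mean JND threshold exceeds the frame mean tolerates more
    distortion and so gets a coarser (larger) quantization step."""
    return q_base * (jnd_mb_mean / jnd_frame_mean)
```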
6. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the LaGrange parameter in described step (5) dynamic adjustments rate-distortion optimization process is as follows:
1. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold;
2. adjust LaGrange parameter, its expression formula of the LaGrange parameter after adjustment is:
Wherein
Be regulatory factor,
Be the quantization step after adjusting,
The original quantization step of presentation code framework,
What represent current macro just can distinguish the average of distortion threshold,
What represent present frame just can distinguish the average of distortion threshold;
3. the encode optimization of cost function, the dynamic adjustments LaGrange parameter makes the rate-distortion optimization function in the situation that quantization step changes, and regains optimal solution; Its expression formula is:
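The claim's exact Lagrange expression is likewise not reproduced here; a hedged sketch, using the rate-distortion relation λ ∝ Q² familiar from the H.264 reference software, rescales the multiplier with the squared step ratio:

```python
def adjust_lambda(lambda_orig, q_orig, q_adjusted):
    """Rescale the Lagrange multiplier after the quantization step of
    the current macroblock changes, assuming lambda ~ Q^2 so that the
    rate-distortion cost J = D + lambda * R stays balanced."""
    return lambda_orig * (q_adjusted / q_orig) ** 2
```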
7. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that described step (6) carries out the entropy coding to the data that quantize, and generated code stream is as follows by the operating procedure of Internet Transmission:
1. the data after quantizing are carried out the entropy coding, make the data formation binary code stream after quantification;
2. encoding code stream passes through Internet Transmission.
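The claim does not name a specific entropy coder; as one concrete illustration of turning quantized values into a binary code stream, here is an order-0 Exp-Golomb encoder of the kind used in H.264/MVC bitstreams:

```python
def exp_golomb(value):
    """Encode a non-negative integer as an order-0 Exp-Golomb
    codeword: a prefix of leading zeros followed by the binary form
    of value + 1."""
    code = value + 1
    prefix_len = code.bit_length() - 1
    return "0" * prefix_len + format(code, "b")
```

For example, the values 0, 1, and 2 encode to "1", "010", and "011".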
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210402003.9A CN103124347B (en) | 2012-10-22 | 2012-10-22 | Vision perception characteristic is utilized to instruct the method for multiple view video coding quantizing process |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103124347A true CN103124347A (en) | 2013-05-29 |
CN103124347B CN103124347B (en) | 2016-04-27 |
Family
ID=48455183
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210402003.9A Expired - Fee Related CN103124347B (en) | 2012-10-22 | 2012-10-22 | Vision perception characteristic is utilized to instruct the method for multiple view video coding quantizing process |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103124347B (en) |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103475875A (en) * | 2013-06-27 | 2013-12-25 | 上海大学 | Image adaptive measuring method based on compressed sensing |
CN103716623A (en) * | 2013-12-17 | 2014-04-09 | 北京大学深圳研究生院 | Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification |
CN104219526A (en) * | 2014-09-01 | 2014-12-17 | 国家广播电影电视总局广播科学研究院 | HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion |
CN104349167A (en) * | 2014-11-17 | 2015-02-11 | 电子科技大学 | Adjustment method of video code rate distortion optimization |
CN104469386A (en) * | 2014-12-15 | 2015-03-25 | 西安电子科技大学 | Stereoscopic video perception and coding method for just-noticeable error model based on DOF |
CN104488266A (en) * | 2013-06-27 | 2015-04-01 | 北京大学深圳研究生院 | AVS video compressing and coding method, and coder |
CN105245890A (en) * | 2015-10-16 | 2016-01-13 | 北京工业大学 | Efficient video encoding method based on vision attention priority |
CN105704497A (en) * | 2016-01-30 | 2016-06-22 | 上海大学 | Fast select algorithm for coding unit size facing 3D-HEVC |
CN105850123A (en) * | 2013-12-19 | 2016-08-10 | 汤姆逊许可公司 | Method and device for encoding a high-dynamic range image |
CN106454386A (en) * | 2016-10-26 | 2017-02-22 | 广东电网有限责任公司电力科学研究院 | JND (Just-noticeable difference) based video encoding method and device |
CN107027031A (en) * | 2016-01-31 | 2017-08-08 | 西安电子科技大学 | A kind of coding method and device for video image |
CN107094251A (en) * | 2017-03-31 | 2017-08-25 | 浙江大学 | A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality |
CN107197266A (en) * | 2017-06-26 | 2017-09-22 | 杭州当虹科技有限公司 | A kind of HDR method for video coding |
CN108111852A (en) * | 2018-01-12 | 2018-06-01 | 东华大学 | Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding |
CN108574841A (en) * | 2017-03-07 | 2018-09-25 | 北京金山云网络技术有限公司 | A kind of coding method and device based on adaptive quantizing parameter |
CN110024382A (en) * | 2017-07-19 | 2019-07-16 | 联发科技股份有限公司 | The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image |
CN110113606A (en) * | 2019-03-12 | 2019-08-09 | 佛山市顺德区中山大学研究院 | A kind of method, apparatus and equipment of removal human eye perception redundant video coding |
CN113489983A (en) * | 2021-06-11 | 2021-10-08 | 浙江智慧视频安防创新中心有限公司 | Method and device for determining block coding parameters based on correlation comparison |
CN114747214A (en) * | 2019-10-21 | 2022-07-12 | 弗劳恩霍夫应用研究促进协会 | Weighted PSNR quality metric for video encoded data |
CN115967806A (en) * | 2023-03-13 | 2023-04-14 | 阿里巴巴(中国)有限公司 | Data frame coding control method and system and electronic equipment |
WO2023155445A1 (en) * | 2022-02-21 | 2023-08-24 | 翱捷科技股份有限公司 | Rate distortion optimization method and apparatus based on motion detection |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1968419A (en) * | 2005-11-16 | 2007-05-23 | 三星电子株式会社 | Image encoding method and apparatus and image decoding method and apparatus using characteristics of the human visual system |
CN101710995A (en) * | 2009-12-10 | 2010-05-19 | 武汉大学 | Video coding system based on vision characteristic |
CN101854555A (en) * | 2010-06-18 | 2010-10-06 | 上海交通大学 | Video coding system based on prediction residual self-adaptation regulation |
CN102420988A (en) * | 2011-12-02 | 2012-04-18 | 上海大学 | Multi-view video coding system utilizing visual characteristics |
CN102724525A (en) * | 2012-06-01 | 2012-10-10 | 宁波大学 | Depth video coding method on basis of foveal JND (just noticeable distortion) model |
Cited By (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103475875A (en) * | 2013-06-27 | 2013-12-25 | 上海大学 | Image adaptive measuring method based on compressed sensing |
CN104488266A (en) * | 2013-06-27 | 2015-04-01 | 北京大学深圳研究生院 | AVS video compressing and coding method, and coder |
CN103475875B (en) * | 2013-06-27 | 2017-02-08 | 上海大学 | Image adaptive measuring method based on compressed sensing |
CN104488266B (en) * | 2013-06-27 | 2018-07-06 | 北京大学深圳研究生院 | AVS video compressing and encoding methods and encoder |
CN103716623A (en) * | 2013-12-17 | 2014-04-09 | 北京大学深圳研究生院 | Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification |
CN103716623B (en) * | 2013-12-17 | 2017-02-15 | 北京大学深圳研究生院 | Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification |
CN105850123A (en) * | 2013-12-19 | 2016-08-10 | 汤姆逊许可公司 | Method and device for encoding a high-dynamic range image |
CN105850123B (en) * | 2013-12-19 | 2019-06-18 | 汤姆逊许可公司 | The method and apparatus that high dynamic range images are encoded |
US10574987B2 (en) | 2013-12-19 | 2020-02-25 | Interdigital Vc Holdings, Inc. | Method and device for encoding a high-dynamic range image |
CN104219526A (en) * | 2014-09-01 | 2014-12-17 | 国家广播电影电视总局广播科学研究院 | HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion |
CN104219526B (en) * | 2014-09-01 | 2017-05-24 | 国家广播电影电视总局广播科学研究院 | HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion |
CN104349167A (en) * | 2014-11-17 | 2015-02-11 | 电子科技大学 | Adjustment method of video code rate distortion optimization |
CN104349167B (en) * | 2014-11-17 | 2018-01-19 | 电子科技大学 | A kind of method of adjustment of Video coding rate-distortion optimization |
CN104469386B (en) * | 2014-12-15 | 2017-07-04 | 西安电子科技大学 | A kind of perception method for encoding stereo video of the proper appreciable error model based on DOF |
CN104469386A (en) * | 2014-12-15 | 2015-03-25 | 西安电子科技大学 | Stereoscopic video perception and coding method for just-noticeable error model based on DOF |
CN105245890B (en) * | 2015-10-16 | 2018-01-19 | 北京工业大学 | A kind of efficient video coding method of view-based access control model attention rate priority |
CN105245890A (en) * | 2015-10-16 | 2016-01-13 | 北京工业大学 | Efficient video encoding method based on vision attention priority |
CN105704497A (en) * | 2016-01-30 | 2016-06-22 | 上海大学 | Fast select algorithm for coding unit size facing 3D-HEVC |
CN105704497B (en) * | 2016-01-30 | 2018-08-17 | 上海大学 | Coding unit size fast selection algorithm towards 3D-HEVC |
CN107027031A (en) * | 2016-01-31 | 2017-08-08 | 西安电子科技大学 | A kind of coding method and device for video image |
CN106454386B (en) * | 2016-10-26 | 2019-07-05 | 广东电网有限责任公司电力科学研究院 | A kind of method and apparatus of the Video coding based on JND |
CN106454386A (en) * | 2016-10-26 | 2017-02-22 | 广东电网有限责任公司电力科学研究院 | JND (Just-noticeable difference) based video encoding method and device |
CN108574841A (en) * | 2017-03-07 | 2018-09-25 | 北京金山云网络技术有限公司 | A kind of coding method and device based on adaptive quantizing parameter |
CN107094251A (en) * | 2017-03-31 | 2017-08-25 | 浙江大学 | A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality |
CN107197266A (en) * | 2017-06-26 | 2017-09-22 | 杭州当虹科技有限公司 | A kind of HDR method for video coding |
CN110024382A (en) * | 2017-07-19 | 2019-07-16 | 联发科技股份有限公司 | The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image |
US11049314B2 (en) | 2017-07-19 | 2021-06-29 | Mediatek Inc | Method and apparatus for reduction of artifacts at discontinuous boundaries in coded virtual-reality images |
CN110024382B (en) * | 2017-07-19 | 2022-04-12 | 联发科技股份有限公司 | Method and device for processing 360-degree virtual reality image |
CN108111852A (en) * | 2018-01-12 | 2018-06-01 | 东华大学 | Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding |
CN108111852B (en) * | 2018-01-12 | 2020-05-29 | 东华大学 | Double-measurement-parameter rate-distortion control method for quantization block compressed sensing coding |
CN110113606A (en) * | 2019-03-12 | 2019-08-09 | 佛山市顺德区中山大学研究院 | A kind of method, apparatus and equipment of removal human eye perception redundant video coding |
CN114747214A (en) * | 2019-10-21 | 2022-07-12 | 弗劳恩霍夫应用研究促进协会 | Weighted PSNR quality metric for video encoded data |
CN113489983A (en) * | 2021-06-11 | 2021-10-08 | 浙江智慧视频安防创新中心有限公司 | Method and device for determining block coding parameters based on correlation comparison |
WO2023155445A1 (en) * | 2022-02-21 | 2023-08-24 | 翱捷科技股份有限公司 | Rate distortion optimization method and apparatus based on motion detection |
CN115967806A (en) * | 2023-03-13 | 2023-04-14 | 阿里巴巴(中国)有限公司 | Data frame coding control method and system and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN103124347B (en) | 2016-04-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103124347B (en) | Vision perception characteristic is utilized to instruct the method for multiple view video coding quantizing process | |
CN102420988B (en) | Multi-view video coding system utilizing visual characteristics | |
CN101416511B (en) | Quantization adjustments based on grain | |
CN101931815B (en) | Quantization adjustment based on texture level | |
KR20190117651A (en) | Image processing and video compression methods | |
CN106534862B (en) | Video coding method | |
CN103079063B (en) | A kind of method for video coding of vision attention region under low bit rate | |
CN100464585C (en) | Video-frequency compression method | |
CN103051901B (en) | Video data coding device and method for coding video data | |
CN101257630B (en) | Video frequency coding method and device combining with three-dimensional filtering | |
CN101325711A (en) | Method for controlling self-adaption code rate based on space-time shielding effect | |
CN111083477B (en) | HEVC (high efficiency video coding) optimization algorithm based on visual saliency | |
CN101710993A (en) | Block-based self-adaptive super-resolution video processing method and system | |
CN102186070A (en) | Method for realizing rapid video coding by adopting hierarchical structure anticipation | |
CN103179394A (en) | I frame rate control method based on stable area video quality | |
CN107211145A (en) | The almost video recompression of virtually lossless | |
CN101188755A (en) | A method for VBR code rate control in AVX decoding of real time video signals | |
CN112825557B (en) | Self-adaptive sensing time-space domain quantization method aiming at video coding | |
CN108924554A (en) | A kind of panorama video code Rate-distortion optimization method of spherical shape weighting structures similarity | |
CN102984541B (en) | Video quality assessment method based on pixel domain distortion factor estimation | |
CN105657433A (en) | Image complexity based signal source real-time coding method and system | |
CN102984540A (en) | Video quality assessment method estimated on basis of macroblock domain distortion degree | |
CN107580217A (en) | Coding method and its device | |
KR970073152A (en) | An improved image encoding system having an adaptive quantization control function and a quantization control method thereof (IMPROVED IMAGE CODING SYSTEM USING ADAPTIVE QUANTIZATION TECHNIQUE AND ADAPTIVE QUANTIZATION CONTROL METHOD THEREOF) | |
CN104702959A (en) | Intra-frame prediction method and system of video coding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160427 Termination date: 20211022 |