CN103124347A - Method for guiding multi-view video coding quantization process by visual perception characteristics - Google Patents

Method for guiding multi-view video coding quantization process by visual perception characteristics

Info

Publication number
CN103124347A
Authority
CN
China
Prior art keywords
frame
distinguish
viewpoint
encoding
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104020039A
Other languages
Chinese (zh)
Other versions
CN103124347B (en
Inventor
王永芳
商习武
刘静
宋允东
张兆杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN201210402003.9A priority Critical patent/CN103124347B/en
Publication of CN103124347A publication Critical patent/CN103124347A/en
Application granted granted Critical
Publication of CN103124347B publication Critical patent/CN103124347B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention relates to a method for guiding the coding quantization process by visual perception characteristics. The method comprises the following operating steps: (1) the luminance value of each frame of an input video sequence is read, and a just-noticeable-distortion (JND) threshold model in the frequency domain is established; (2) intra-view and inter-view prediction is performed on each frame of the input video sequence; (3) the residual data are subjected to a discrete cosine transform; (4) the quantization step size of each macroblock in the current frame is dynamically adjusted; (5) the Lagrange parameter in the rate-distortion optimization process is dynamically adjusted; and (6) the quantized data are entropy coded to form a bitstream, which is transmitted over the network. With this method, video compression efficiency is improved while the subjective quality remains essentially unchanged, making the video well suited to network transmission.

Description

Method for guiding the multi-view video coding quantization process using visual perception characteristics
Technical field
The present invention relates to the field of multi-view video coding and decoding, and in particular to a method for guiding the multi-view video coding quantization process using visual perception characteristics. It is applicable to the encoding and decoding of high-definition 3D video signals.
Background technology
As audiovisual expectations rise, viewers are no longer content with single-view two-dimensional video. The demand for depth perception keeps growing, from stereoscopic viewing at a fixed angle to free-viewpoint viewing at any angle, which has driven the development of multi-view coding technology. However, multi-view video greatly increases the amount of data, so how to improve video compression efficiency effectively has become a research hotspot. At present, video compression techniques mainly focus on removing three kinds of redundancy: spatial, temporal and statistical. Although video experts have released a new-generation video compression standard (HEVC), expected to double compression efficiency again over H.264, the perceptual redundancy rooted in the characteristics of the human visual system (HVS) itself is still not removed. As research on human visual characteristics has deepened, video researchers have proposed the just-noticeable-distortion (JND) model for removing this redundancy: the JND threshold measures the size of the perceptual redundancy, and changes below this threshold are not perceived by the human eye.
Current JND research falls into two broad classes: pixel-domain JND models and frequency-domain JND models. The JND model proposed in document [1] is a classical pixel-domain model, which studies luminance masking, texture masking and temporal masking respectively. The frequency-domain JND model proposed in document [2] additionally studies the sensitivity of the human eye to different frequency bands, so that it better matches the visual characteristics of the human eye.
The JND model proposed in document [2] is currently the most complete DCT-domain JND model. Besides the pixel-level luminance masking and texture masking characteristics, it adds the effect of the spatial contrast sensitivity function. The spatial sensitivity function reflects the band-pass characteristic of the human eye, and removes perceptual frequency redundancy by discarding frequency components the eye cannot perceive. Its temporal masking term models the smooth-pursuit eye-movement effect, covering not only the magnitude but also the direction of motion. Researchers have combined this model with multi-view video to act on the DCT (discrete cosine transform) of the residual, greatly improving compression efficiency. However, the model has not been applied to other coding stages such as quantization, so its removal of visual redundancy is incomplete.
Document [3] does propose using a JND model to guide the quantization process. However, the JND model it builds is a pixel-domain one, lacking the removal of the eye's frequency redundancy, so its guidance of quantization is not accurate enough. Moreover, to preserve subjective quality, the quantization value should be raised only where the eye is insensitive and kept unchanged elsewhere, and the Lagrange parameter should be adjusted correspondingly whenever the quantization parameter is adjusted.
The present patent application is the first to apply a DCT-domain JND model to the quantization process in multi-view video coding, further improving video compression efficiency while keeping subjective quality unchanged.
Document [1]: X. Yang, W. Lin, and Z. Lu, "Motion-compensated residue preprocessing in video coding based on just-noticeable-distortion profile," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 6, pp. 742-752, 2005.
Document [2]: Zhenyu Wei and King N. Ngan, "Spatio-temporal just noticeable distortion profile for grey scale image/video in DCT domain," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, March 2009.
Document [3]: Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC video coding based on foveated just-noticeable-distortion model," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 6, pp. 806-819, Jun. 2010.
Summary of the invention
The object of the present invention is to overcome the defects of the prior art by providing a method that uses visual perception characteristics to guide the multi-view video coding quantization process. While keeping subjective video quality unchanged, the method uses a frequency-domain JND model to guide multi-view quantization, raising the quantization step in regions where the human eye is insensitive and thereby improving video compression efficiency. When the step size is adjusted, the Lagrange parameter of the rate-distortion optimization function is dynamically adjusted as well, further improving coding efficiency.
For achieving the above object, the present invention adopts following technical scheme:
A method for guiding the multi-view video coding quantization process using visual perception characteristics, characterized by the following operating steps:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction on each frame of the input video sequence;
(3) apply the discrete cosine transform (DCT) to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to form a bitstream and transmit it over the network.
Compared with the prior art, the method of the present invention for guiding the multi-view video coding quantization process by visual perception characteristics has the following evident substantive features and significant technical progress:
1) While keeping the reconstructed video quality unchanged, this multi-view coding method reduces the coding bit rate merely through the quantization subroutine; in tests the maximum bit-rate saving reached 12.35%.
2) Subjective quality is assessed with the mean subjective score difference: the closer the difference is to 0, the closer the subjective quality of the two methods. The mean subjective score difference of this method is 0.03, so its subjective quality is comparable to that of the multi-view video codec reference software JMVC.
3) This multi-view coding method adds no especially complex coding stage, improving video compression efficiency at low complexity.
Description of drawings
Fig. 1 is the block diagram of the method of the present invention for guiding the multi-view video coding quantization process by visual perception characteristics.
Fig. 2 is the block diagram of the frequency-domain just-noticeable-distortion model.
Fig. 3 is the block diagram of intra-view/inter-view prediction.
Fig. 4 is the block diagram of the DCT.
Fig. 5 is the block diagram of the dynamic adjustment of the quantization step.
Fig. 6 is the block diagram of the dynamic adjustment of the Lagrange parameter in the rate-distortion cost function.
Fig. 7 is the block diagram of entropy coding and output.
Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the original JMVC method.
Fig. 8b is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the method of the invention.
Fig. 9 compares the bit rate, PSNR and subjective quality score difference (DMOS) of the original JMVC method and the method of the invention on the sequence ballroom for different QP values and viewpoints.
Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the original JMVC method.
Fig. 10b is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the method of the invention.
Fig. 11 compares the bit rate, PSNR and subjective quality score difference (DMOS) of the original JMVC method and the method of the invention on the sequence race1 for different QP values and viewpoints.
Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the original JMVC method.
Fig. 12b is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the method of the invention.
Fig. 13 compares the bit rate, PSNR and mean subjective score difference (DMOS) of the original JMVC method and the method of the invention on the sequence Crowd for different QP values and viewpoints.
Embodiment
The preferred embodiments of the present invention are described below in further detail with reference to the accompanying drawings.
Embodiment one:
This embodiment guides the multi-view video coding quantization process by visual perception characteristics and, referring to Fig. 1, comprises the following steps:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction on each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to form a bitstream and transmit it over the network.
Embodiment two: this embodiment is basically identical to embodiment one; its special features are as follows.
The frequency-domain JND model established in step (1) above comprises four sub-models, referring to Fig. 2:
(1-1) The spatial contrast sensitivity function model follows the band-pass characteristic of the human eye. For a particular spatial frequency ω_ij, the basic JND threshold of the DCT coefficient at position (i, j) is (the equation images of the original are transcribed here in text form, following the model of document [2]):

T_basic(i, j) = exp(c·ω_ij) / (φ_i · φ_j · (a + b·ω_ij))

The spatial frequency ω_ij is computed as:

ω_ij = (1/(2N)) · sqrt((i/θ_x)² + (j/θ_y)²)

where i and j denote the coordinate position within the discrete cosine transform block, N is the dimension of the block, and θ_x and θ_y denote the horizontal and vertical visual angles of a pixel. The horizontal visual angle is generally considered equal to the vertical one, and is expressed as:

θ = 2·arctan(1/(2·R_d·H))

where R_d is the ratio of viewing distance to picture height and H is the picture height in pixels.

Because human visual sensitivity is directional, higher for the horizontal and vertical directions and relatively lower for the others, a directional modulation factor is added:

F_dir(i, j) = r + (1 - r)·cos²(φ_ij)

where φ_ij = arcsin(2·ω_i0·ω_0j / ω_ij²) is the angle of the frequency vector represented by the DCT coefficient, and φ_i, φ_j are the DCT normalization factors:

φ_m = sqrt(1/N) for m = 0, and φ_m = sqrt(2/N) for m > 0.

Finally the control parameter s is added, forming the complete modulation factor of the spatial sensitivity function:

T_spatial(i, j) = s · exp(c·ω_ij) / (φ_i · φ_j · (a + b·ω_ij) · (r + (1 - r)·cos²(φ_ij)))

In the multi-view coding process both 8 x 8 and 4 x 4 DCTs occur, so the parameters differ. In the experiments, for the 8 x 8 DCT format, r is 0.6, a is 1.33, b is 0.11 and c is 0.18; for the 4 x 4 DCT format, r is 0.6, a is 0.8, b is 0.035 and c is 0.008.
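The spatial-CSF threshold described above can be sketched in code as follows. This is a minimal illustration following the model of document [2]; the value of the control parameter s (0.25 here) and the per-pixel visual angle theta are assumptions, since the source gives them only as equation images.

```python
import math

def dct_norm_factor(m, N):
    """phi_m: DCT normalization factor, sqrt(1/N) for the DC index and
    sqrt(2/N) otherwise."""
    return math.sqrt(1.0 / N) if m == 0 else math.sqrt(2.0 / N)

def spatial_freq(i, j, N, theta):
    """omega_ij in cycles per degree for DCT position (i, j); theta is the
    visual angle (degrees) subtended by one pixel (assumed value)."""
    return math.sqrt((i / theta) ** 2 + (j / theta) ** 2) / (2.0 * N)

def basic_threshold(i, j, N, theta, s=0.25, r=0.6, a=1.33, b=0.11, c=0.18):
    """Basic JND threshold with directional modulation; the defaults r, a,
    b, c are the 8x8 parameter values quoted in the text."""
    w = spatial_freq(i, j, N, theta)
    if w == 0.0:  # DC coefficient: no directional term applies
        return s / (dct_norm_factor(i, N) * dct_norm_factor(j, N) * a)
    wi = spatial_freq(i, 0, N, theta)
    wj = spatial_freq(0, j, N, theta)
    # angle of the DCT frequency vector; clamp for floating-point safety
    phi_ij = math.asin(min(1.0, 2.0 * wi * wj / (w * w)))
    dir_mod = r + (1.0 - r) * math.cos(phi_ij) ** 2
    return (s * math.exp(c * w)
            / (dct_norm_factor(i, N) * dct_norm_factor(j, N) * (a + b * w) * dir_mod))
```

For example, the threshold at the diagonal position (1, 1) comes out larger than at (1, 0), reflecting the eye's lower sensitivity to oblique frequencies.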
(1-2) The luminance masking model: according to experiment, visual sensitivity is highest in regions of intermediate grey level and lower against dark and very bright backgrounds. Fitting the luminance masking curve gives (the original expression is an equation image; the piecewise fit of document [2] is):

F_lum = (60 - I)/150 + 1, if I <= 60
F_lum = 1, if 60 < I < 170
F_lum = (I - 170)/425 + 1, if I >= 170

where I is the average luminance value of the current coding block.
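In code, the luminance masking factor is a simple piecewise function. The curve itself is an equation image in the source, so this sketch assumes the piecewise fit cited from document [2]:

```python
def luminance_factor(avg_lum):
    """Luminance-masking modulation factor for a block with average
    luminance avg_lum (0..255); mid-grey regions get no elevation,
    dark and bright regions tolerate more distortion."""
    if avg_lum <= 60:
        return (60.0 - avg_lum) / 150.0 + 1.0
    if avg_lum >= 170:
        return (avg_lum - 170.0) / 425.0 + 1.0
    return 1.0
```

A fully black block (average 0) thus gets a factor of 1.4, a mid-grey block 1.0, and a fully white block 1.2.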
(1-3) The texture masking model: according to differences in image texture, an image can be divided into three region types: edge regions, smooth regions and texture regions, to which the sensitivity of the eye decreases in that order. The Canny operator is usually used to separate the regions of the image.

The edge pixel density obtained with the Canny operator is:

ρ = Σ_edge / N_b²

where Σ_edge is the total number of edge pixels of the block, obtained by the Canny edge detector, and N_b is the block dimension.

Using the edge pixel density ρ, an image block is classified into smooth, texture and edge blocks. The classification formula (an equation image in the original; the thresholds follow document [2]) is: smooth block if ρ <= 0.1; edge block if 0.1 < ρ <= 0.2; texture block if ρ > 0.2.

For texture regions the eye is insensitive to distortion of the low-frequency part, while the high-frequency part should be suitably preserved. The resulting contrast-masking elevation factor is:

ψ = 2.25 for texture blocks with i² + j² <= 16; ψ = 1.25 for texture blocks with i² + j² > 16; ψ = 1 otherwise,

where (i, j) is the DCT coefficient index.
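The block classification and elevation factor can be sketched as follows. The cut-offs 0.1 and 0.2 and the ψ values are assumptions taken from document [2], since the source shows them only as equation images; obtaining the edge-pixel count itself would require a Canny detector, which is outside this sketch.

```python
def classify_block(edge_pixels, block_size=16):
    """Classify a block by edge-pixel density rho = edges / size^2,
    as produced by a Canny edge map (edge_pixels is that count)."""
    rho = edge_pixels / float(block_size * block_size)
    if rho <= 0.1:
        return "smooth"
    if rho <= 0.2:
        return "edge"
    return "texture"

def elevation_factor(block_type, i, j):
    """psi: contrast-masking elevation for DCT index (i, j); texture
    blocks tolerate more distortion, especially at low frequencies
    (i^2 + j^2 <= 16)."""
    if block_type == "texture":
        return 2.25 if (i * i + j * j) <= 16 else 1.25
    return 1.0
```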
Taking into account the overlapping of the spatial contrast sensitivity effect and the luminance effect, the final contrast-masking factor (transcribed from the equation image following document [2]) is:

F_contrast(n, i, j) = ψ, for edge and texture blocks with i² + j² <= 16
F_contrast(n, i, j) = ψ · min(4, max(1, (|C(n, i, j)| / (T_spatial(i, j) · F_lum(n)))^0.36)), otherwise

where n denotes the n-th frame of the input video sequence, C(n, i, j) is the DCT coefficient, T_spatial is the threshold of the spatial contrast sensitivity function, and F_lum is the luminance masking modulation factor.
(1-4) The temporal contrast sensitivity function model: the experimentally measured temporal modulation factor is (transcribed from the equation image following document [2]):

F_T = 1, if f_s < 5 cpd and f_t < 10 Hz
F_T = 1.07^(f_t - 10), if f_s < 5 cpd and f_t >= 10 Hz
F_T = 1.07^(f_t), if f_s >= 5 cpd

where f_t denotes the temporal frequency and f_s the spatial frequency. The temporal frequency f_t is generally computed as:

f_t = f_sx · v_x + f_sy · v_y

where f_sx and f_sy are the horizontal and vertical components of the spatial frequency and v_x, v_y are the components of the speed of object motion on the retina. The spatial-frequency components are computed as:

f_sx = i / (2·N·θ_x), f_sy = j / (2·N·θ_y)

where θ_x and θ_y denote the horizontal and vertical visual angles of a pixel, N is the DCT dimension, and i and j denote the coordinate position within the discrete cosine transform block.

The speed of the image on the retina, v_R, is computed as follows:

v_R = v_I - v_E, with v_E = min(g_sp · v_I + v_min, v_max)

where g_sp is the smooth-pursuit eye-movement gain, taken as 0.98 in the experiments; v_I is the speed of the object in the image plane; v_min is the minimum eye-movement speed caused by drift, with empirical value 0.15 deg/s; and v_max is the maximum eye speed corresponding to saccades, usually taken as 80 deg/s. The image-plane speed is obtained from the motion vectors:

v_I = f_r · MV · θ

where f_r is the frame rate of the video sequence, MV is the motion vector of each block, and θ is the visual angle of a pixel.
(1-5) The weighted product of the four factors constitutes the just-noticeable-distortion threshold of the current coded frame, expressed as:

T_JND(n, i, j) = T_spatial(i, j) · F_lum(n) · F_contrast(n, i, j) · F_T(n, i, j)

where T_spatial is the threshold of the spatial contrast sensitivity function, F_lum is the luminance masking modulation factor, F_contrast is the contrast-masking modulation factor, and F_T is the temporal masking modulation factor.
Step (2) above performs inter-view/intra-view prediction on the input video sequence; referring to Fig. 3, the concrete steps are as follows:

(2-1) Intra-view inter/intra prediction: the temporal redundancy of the current frame is removed by inter-frame prediction within the viewpoint, and its spatial redundancy by intra-frame prediction within the viewpoint. Between intra and inter prediction, the mode minimizing the rate-distortion optimization function is selected, the function being:

J = D + λ_adj · R

where D is the distortion, R is the number of bits to code under the given coding mode, and λ_adj is the adjusted Lagrange parameter.

(2-2) Inter-view prediction: since the method codes multiple viewpoints, the current frame is predicted from the corresponding frame of another viewpoint, removing the redundant information between viewpoints.

(2-3) The coding costs of inter-view and intra-view prediction are compared, and the best of the intra-view and inter-view prediction modes is selected: the mode minimizing the rate-distortion cost function is the optimal prediction mode. Fully accounting for the redundancy both within and between viewpoints and selecting the appropriate prediction mode further improves video compression efficiency.
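The mode decision of steps (2-1) to (2-3) amounts to minimizing J = D + λ·R over all candidate modes. A minimal sketch (the mode names and cost values below are illustrative, not JMVC API):

```python
def best_mode(candidates, lam):
    """Pick the prediction mode minimizing J = D + lam * R.
    `candidates` maps a mode name to its (distortion, bits) pair."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

# Hypothetical costs for one macroblock: intra, temporal inter, inter-view.
modes = {"intra": (100.0, 50), "inter_temporal": (40.0, 80), "inter_view": (60.0, 60)}
```

With a small λ the rate term matters less and the low-distortion temporal mode wins; with a large λ the cheap-to-code intra mode wins instead.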
Step (3) above applies the discrete cosine transform to the residual data; referring to Fig. 4, the concrete steps are as follows:

(3-1) Decision of the coding block size. The multi-view coding method has seven block sizes, 16 x 16, 16 x 8, 8 x 16, 8 x 8, 8 x 4, 4 x 8 and 4 x 4; the first four are handled with 8 x 8 transform blocks and the last three with 4 x 4 transform blocks.

(3-2) The corresponding DCT is then applied: the 8 x 8 DCT for 8 x 8 transform blocks and the 4 x 4 DCT for 4 x 4 transform blocks.
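For reference, the plain floating-point 2-D DCT-II applied to a residual block can be written as below. This is illustrative only; the actual codec uses integer approximations of the 4 x 4 and 8 x 8 transforms.

```python
import math

def dct2(block):
    """Plain 2-D DCT-II of an N x N residual block (N = 4 or 8)."""
    n = len(block)
    def alpha(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out
```

A constant 4 x 4 residual of value 4 transforms to a single DC coefficient of 16 with all AC coefficients zero, as expected of an orthonormal DCT.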
Step (4) above dynamically adjusts the quantization step of each macroblock in the current frame; referring to Fig. 5, the concrete steps are as follows:

(4-1) The average JND value of the current frame is obtained from the established JND model; the average JND threshold is:

T_avg = (1 / (H · W)) · Σ_x Σ_y T_JND(x, y)

where H and W denote the height and width of the image frame, T_JND denotes the just-noticeable-distortion threshold of the current frame, and (x, y) denotes the pixel coordinates.

(4-2) The JND average of the current macroblock: the average JND threshold of the M-th macroblock, T_avg(M), is computed in the same way over the pixels of that macroblock.

(4-3) Dynamic adjustment of the quantization step of the current macroblock. The just-noticeable-distortion threshold reflects the differing sensitivity of the human eye to the various parts of an image, so the quantization step of each macroblock can be adjusted dynamically according to its threshold: where the eye is insensitive, the quantization step is suitably enlarged; otherwise the quantization value is kept unchanged. The proposed quantization step adjustment is:

Qstep_new(M) = Qstep_org · f(M)

where Qstep_org is the original step of the coding framework and f(M) is a regulatory factor derived from the ratio of the macroblock JND average T_avg(M) to the frame JND average T_avg (the exact expression is given as an equation image in the original).
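The per-macroblock step adjustment can be sketched as below. The exact regulatory factor f(M) is an equation image in the source; the clipped JND ratio used here (with an assumed cap f_max = 1.5) is only a stand-in that matches the stated behaviour: enlarge the step where the block's JND average exceeds the frame average, and leave it unchanged elsewhere.

```python
def adjust_qstep(qstep_org, jnd_mb, jnd_frame, f_max=1.5):
    """Scale the macroblock quantization step up where the block's average
    JND exceeds the frame average (the eye is less sensitive there);
    otherwise keep the original step. f_max caps the enlargement."""
    if jnd_mb <= jnd_frame:
        return qstep_org
    return qstep_org * min(jnd_mb / jnd_frame, f_max)
```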
Step (5) above dynamically adjusts the Lagrange parameter in the rate-distortion optimization process; referring to Fig. 6, the concrete operating steps are as follows:

(5-1) The JND average of the current frame and that of the current coded macroblock are computed and compared, providing the basis for the subsequent weighting of the Lagrange parameter.

(5-2) The Lagrange parameter is adjusted. Since the quantization parameter has been modified in the preceding step, the distortion and the rate in the Lagrangian rate-distortion optimization change, and keeping the original Lagrange parameter value no longer guarantees an optimal solution. Weighting the Lagrange parameter correspondingly restores the optimality of the cost function. The adjusted parameter λ_adj(M) is obtained by weighting the original parameter with the quantization parameter generated by the multi-view coding method and the adjusted quantization parameter value of the M-th macroblock (the weighting expression is given as an equation image in the original).

(5-3) The adjusted Lagrange parameter is substituted into the rate-distortion optimization cost function:

J = D + λ_adj · R

where D is the distortion, R is the number of bits to code under the given coding mode, and λ_adj is the adjusted Lagrange parameter. In this way, when the quantization parameter changes, the Lagrange parameter changes correspondingly, so that the rate-distortion optimization function still attains the optimal solution.
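A sketch of the reweighting in step (5-2). The patent's own weighting expression is an equation image; here we assume the usual H.264-style proportionality of λ to the square of the quantization step, so λ is scaled by the squared ratio of the adjusted step to the original step.

```python
def adjust_lambda(lam_org, qstep_org, qstep_adj):
    """Reweight the Lagrange multiplier after a quantizer change, assuming
    lambda ~ Qstep^2 so that J = D + lambda * R stays balanced."""
    return lam_org * (qstep_adj / qstep_org) ** 2
```

Enlarging the step by 1.5x thus raises λ by 2.25x; an unchanged step leaves λ unchanged.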
Step (6) above entropy-codes the quantized data and transmits the generated bitstream over the network; referring to Fig. 7, the concrete steps are as follows:

(6-1) The quantized data are entropy coded, so that they are represented most efficiently by a binary bitstream, removing the statistical redundancy of the quantized data.

(6-2) The bitstream formed by entropy coding is transmitted over the network, realizing the transmission of the video. Because the coding method processed by visual perception characteristics occupies little bandwidth, it adapts better to network transmission.
A large number of simulation experiments were carried out below to assess the performance of the proposed multi-view video coding method exploiting visual characteristics. The first 48 frames of the multi-view sequences ballroom, race1 and crowd were encoded and decoded on a PC with an Intel Pentium 4 CPU at 3.00 GHz, 512 MB of memory, the Intel 8254G Express chipset family and the Windows XP operating system. The basic QP was set to 20, 24, 28 and 32; the experimental platform was the multi-view video codec reference software JMVC, the coding prediction structure was HHI-IBBBP, and inter-view prediction used the bi-directional prediction mode.
The experimental results for the sequence ballroom are shown in Fig. 8a, Fig. 8b and Fig. 9. Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 coded with the original JMVC method at QP = 24, with PSNR = 40.31 dB; Fig. 8b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 40.10 dB. Fig. 9 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, subjective quality score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 7.47% to 9.16% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.03 to 0.07, so the subjective quality can be considered unchanged.
The experimental results for the sequence race1 are shown in Fig. 10a, Fig. 10b and Fig. 11. Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 coded with the original JMVC method at QP = 24, with PSNR = 41.15 dB; Fig. 10b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 40.51 dB. Fig. 11 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, subjective quality score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 10.77% to 12.35% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.06 to 0.09, so the subjective quality can be considered unchanged.
The experimental results for the sequence crowd are shown in Fig. 12a, Fig. 12b and Fig. 13. Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 coded with the original JMVC method at QP = 35, with PSNR = 33.77 dB; Fig. 12b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 33.12 dB. Fig. 13 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, mean subjective score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 8.95% to 9.83% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.03 to 0.08, so the subjective quality can be considered unchanged.
As the charts above show, by establishing a DCT-domain JND model and applying it to the quantization and rate-distortion optimization processes of the multi-view video coding framework, the present invention substantially reduces the multi-view coding bit rate while keeping subjective quality unchanged, improving the compression efficiency of multi-view video coding.

Claims (7)

1. A method of guiding the multi-view video coding quantization process by visual perception characteristics, characterized in that the operating steps are as follows:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction for each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to generate a code stream and transmit the code stream over the network.
2. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that described step (1) reads the brightness value size of each frame of input video sequence, the operating procedure that just can distinguish the distortion threshold model of setting up frequency domain is as follows:
1. obtain respectively the spatial sensitivity factor of 4x4 and 8x8DCT conversion according to the dimension of dct transform
Figure DEST_PATH_IMAGE002
, its formula is:
Figure DEST_PATH_IMAGE004
Wherein s is the control parameter,
Figure DEST_PATH_IMAGE006
Be the angle of the frequency of DCT coefficient vector representative,
Figure DEST_PATH_IMAGE008
Figure DEST_PATH_IMAGE010
Be DCT coefficient normalization factor,
Figure DEST_PATH_IMAGE012
Be spatial frequency, parameter r, a, b and c are according to varying in size of dct transform and difference: for the DCT coded format of 8 * 8 sizes, Be 0.6,
Figure DEST_PATH_IMAGE016
Be 1.33,
Figure DEST_PATH_IMAGE018
Be 0.11, Be 0.18; For the DCT coded format of 4 * 4 sizes,
Figure 615976DEST_PATH_IMAGE014
Be 0.6,
Figure 131271DEST_PATH_IMAGE016
Be 0.8,
Figure 178861DEST_PATH_IMAGE018
Be 0.035, Be 0.008;
2. record human eye under the Different background illumination condition according to experiment, the brightness shielding effect Curve is expressed as follows:
Figure DEST_PATH_IMAGE024
Wherein, Average pixel value for the present encoding piece;
3. utilize edge detector to detect the texture features of present encoding piece, obtain texture and cover the factor
Figure DEST_PATH_IMAGE028
, its expression formula is as follows:
Figure DEST_PATH_IMAGE030
Wherein,
Figure DEST_PATH_IMAGE032
The transverse and longitudinal coordinate coefficient of expression transform block,
Figure DEST_PATH_IMAGE034
The estimation factor is covered in the expression contrast, Be the spatial sensitivity factor,
Figure DEST_PATH_IMAGE036
Dct transform coefficient for n encoding block of present frame;
4. the speed of object of which movement in frame every according to video sequence, test and record the time domain shielding effect factor
Figure DEST_PATH_IMAGE038
Expression formula is:
Wherein,
Figure DEST_PATH_IMAGE042
Be spatial frequency,
Figure DEST_PATH_IMAGE044
Be temporal frequency;
5. described step 1. ~ weighted product of four kinds of factors of 4. trying to achieve namely consist of current encoded frame just can distinguish distortion threshold.
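The parameter values a, b, c and r in claim 2 match the well-known Wei-Ngan DCT-domain JND model, so steps ①, ②, ④ and ⑤ can be sketched along its lines. Every constant below beyond a, b, c and r (the control parameter s, the index-to-cycles/degree conversion, the knees of the luminance and temporal curves) is an assumption, since the patent gives its formulas only as images.

```python
import math

def base_threshold(i, j, n=8, s=0.25, r=0.6, a=1.33, b=0.11, c=0.18,
                   cpd=0.35):
    """Step 1: spatial sensitivity factor for DCT position (i, j) of an
    n x n block; cpd is an assumed index-to-cycles/degree conversion."""
    if i == 0 and j == 0:
        i = j = 1  # treat DC like the lowest AC band (assumption)
    w_i, w_j = i * cpd, j * cpd
    w = math.hypot(w_i, w_j)                               # spatial frequency
    theta = math.asin(min(1.0, 2.0 * w_i * w_j / w ** 2))  # orientation angle
    phi_i = math.sqrt((1.0 if i == 0 else 2.0) / n)        # DCT normalization
    phi_j = math.sqrt((1.0 if j == 0 else 2.0) / n)
    return (s / (phi_i * phi_j)) * math.exp(c * w) / (a + b * w) \
        / (r + (1.0 - r) * math.cos(theta) ** 2)

def luminance_adaptation(mu):
    """Step 2: brightness masking factor for mean block intensity mu
    (standard piecewise curve; the patent's own curve is an image)."""
    if mu <= 60:
        return (60.0 - mu) / 150.0 + 1.0
    if mu >= 170:
        return (mu - 170.0) / 425.0 + 1.0
    return 1.0

def temporal_factor(w_s, w_t):
    """Step 4: temporal masking from spatial frequency w_s (cycles/degree)
    and temporal frequency w_t (Hz)."""
    if w_s < 5.0 and w_t < 10.0:
        return 1.0
    if w_s < 5.0:
        return 1.07 ** (w_t - 10.0)
    return 1.07 ** w_t

def jnd_threshold(i, j, mu, w_s, w_t, contrast_masking=1.0):
    """Step 5: the JND threshold is the product of the four factors; the
    texture masking factor of step 3 is passed in precomputed."""
    return (base_threshold(i, j) * luminance_adaptation(mu)
            * contrast_masking * temporal_factor(w_s, w_t))
```

Mid-grey, slowly moving, smooth content yields the smallest thresholds (least masking); very dark or very bright, fast-moving, textured content raises the threshold and therefore tolerates a coarser quantizer.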
3. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that each frame of described step (2) input video sequence through in viewpoint and the operating procedure of the prediction between viewpoint as follows:
1. carry out interframe and infra-frame prediction in viewpoint, predicted value and the current frame that will encode are compared, choose the less a kind of coded system of coding cost;
2. carry out the prediction between viewpoint, the current encoded frame of current view point predicts according to the corresponding frame of reference view, and the corresponding frame of predicted value and reference view is compared, and tries to achieve the coding cost of interview prediction;
3. compare between viewpoint and the coding cost in viewpoint, select the sort of predictive mode than the lower Item cost.
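The mode decision of claim 3 reduces to taking the minimum coding cost over the candidate predictions. A minimal sketch with hypothetical cost values follows; in JMVC each cost would be a Lagrangian rate-distortion cost rather than a bare number.

```python
def select_prediction_mode(cost_intra, cost_inter_temporal, cost_inter_view):
    """Pick the prediction mode with the smallest coding cost: intra-frame
    and inter-frame prediction within the viewpoint (step 1) versus
    prediction from the reference viewpoint (steps 2-3)."""
    candidates = {
        "intra-frame": cost_intra,
        "inter-frame (temporal)": cost_inter_temporal,
        "inter-view": cost_inter_view,
    }
    return min(candidates, key=candidates.get)

# Hypothetical costs: temporal prediction wins here.
print(select_prediction_mode(120.0, 85.0, 90.0))  # -> inter-frame (temporal)
```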
4. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure that described step (3) carries out discrete cosine transform to residual error data is as follows:
1. the judgement of coded block size less than 8, classifies as the 4x4 transform block when arbitrary length of side of encoding block, otherwise, be the 8x8 transform block;
2. when being the 4x4 transform block, select the 4x4 dct transform, when being the 8x8 transform block, select the 8x8DCT conversion.
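The size decision of claim 4 plus a reference floating-point 2-D DCT can be sketched as below. The orthonormal DCT-II here is only a stand-in: JMVC's actual integer transforms are not reproduced.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses the smaller scale factor
    return c

def transform_residual(block):
    """Claim 4: any side shorter than 8 -> 4x4 DCT, otherwise 8x8 DCT.
    The residual block is assumed to tile evenly into the chosen size."""
    size = 4 if min(block.shape) < 8 else 8
    c = dct_matrix(size)
    out = np.empty_like(block, dtype=float)
    for y in range(0, block.shape[0], size):
        for x in range(0, block.shape[1], size):
            tile = block[y:y + size, x:x + size]
            out[y:y + size, x:x + size] = c @ tile @ c.T  # 2-D separable DCT
    return size, out
```

A constant residual block produces a single DC coefficient and zero AC energy, which is what makes flat regions cheap to code.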
5. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the quantization step of each macro block in described step (4) dynamic adjustments present frame is as follows:
That 1. calculates present frame just can distinguish the mean value of distortion threshold;
That 2. calculates current coding macro block just can distinguish distortion threshold mean value;
3. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold, the quantization step of dynamic adjustments current macro, its expression formula of the quantization step after adjusting is as follows:
Figure DEST_PATH_IMAGE046
Wherein, The original quantization step of presentation code framework,
Figure DEST_PATH_IMAGE050
What represent current macro just can distinguish the average of distortion threshold,
Figure DEST_PATH_IMAGE052
What represent present frame just can distinguish the average of distortion threshold, Be regulatory factor.
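Claim 5's adjustment formula is given only as an image in the source; the sketch below uses one plausible form consistent with the quantities listed — the ratio of macroblock to frame JND mean raised to a regulatory factor δ, which is an assumption, not the patent's exact expression. The intent is coarser quantization where the macroblock tolerates more distortion than the frame average.

```python
def adjust_quant_step(q_orig, jnd_mb_mean, jnd_frame_mean, delta=0.5):
    """Dynamically adjust the macroblock quantization step from the ratio
    of the macroblock's mean JND threshold to the frame's mean JND
    threshold (assumed power-law form with regulatory factor delta)."""
    return q_orig * (jnd_mb_mean / jnd_frame_mean) ** delta
```

Macroblocks whose JND mean exceeds the frame mean (strong masking) get a larger step and spend fewer bits; macroblocks below the frame mean get a finer step, which is how perceptual quality is preserved while the bit rate drops.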
6. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the LaGrange parameter in described step (5) dynamic adjustments rate-distortion optimization process is as follows:
1. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold;
2. adjust LaGrange parameter, its expression formula of the LaGrange parameter after adjustment is:
Figure DEST_PATH_IMAGE056
Wherein Be regulatory factor,
Figure DEST_PATH_IMAGE058
Be the quantization step after adjusting,
Figure DEST_PATH_IMAGE060
The original quantization step of presentation code framework,
Figure 708924DEST_PATH_IMAGE050
What represent current macro just can distinguish the average of distortion threshold, What represent present frame just can distinguish the average of distortion threshold;
3. the encode optimization of cost function, the dynamic adjustments LaGrange parameter makes the rate-distortion optimization function in the situation that quantization step changes, and regains optimal solution; Its expression formula is:
Figure DEST_PATH_IMAGE062
Wherein
Figure DEST_PATH_IMAGE064
Be distorted signal,
Figure DEST_PATH_IMAGE066
Be the bit number of encoding under the different coding pattern,
Figure DEST_PATH_IMAGE068
It is the LaGrange parameter after adjusting.
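In H.264-family coders the Lagrange multiplier is conventionally proportional to the square of the quantization step, so one plausible sketch of claim 6 is the quadratic re-scaling below, followed by a minimum-cost mode decision over J = D + λ'·R. The quadratic relation is an assumption; the patent's own expression is given only as an image.

```python
def adjust_lambda(lam_orig, q_orig, q_adjusted):
    """Re-scale the Lagrange parameter after the JND-driven step change,
    assuming the usual lambda ~ Qstep**2 proportionality."""
    return lam_orig * (q_adjusted / q_orig) ** 2

def rd_cost(distortion, bits, lam):
    """Rate-distortion cost J = D + lambda * R (claim 6, step 3)."""
    return distortion + lam * bits

def best_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, bits); return the
    entry minimizing the rate-distortion cost."""
    return min(candidates, key=lambda m: rd_cost(m[1], m[2], lam))
```

A larger λ' (coarser quantizer) penalizes rate more heavily and pushes the decision toward cheap modes such as skip; a smaller λ' favors low-distortion modes.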
7. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that described step (6) carries out the entropy coding to the data that quantize, and generated code stream is as follows by the operating procedure of Internet Transmission:
1. the data after quantizing are carried out the entropy coding, make the data formation binary code stream after quantification;
2. encoding code stream passes through Internet Transmission.
CN201210402003.9A 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics Expired - Fee Related CN103124347B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210402003.9A CN103124347B (en) 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210402003.9A CN103124347B (en) 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics

Publications (2)

Publication Number Publication Date
CN103124347A true CN103124347A (en) 2013-05-29
CN103124347B CN103124347B (en) 2016-04-27

Family

ID=48455183

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210402003.9A Expired - Fee Related CN103124347B (en) 2012-10-22 2012-10-22 Vision perception characteristic is utilized to instruct the method for multiple view video coding quantizing process

Country Status (1)

Country Link
CN (1) CN103124347B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475875A (en) * 2013-06-27 2013-12-25 上海大学 Image adaptive measuring method based on compressed sensing
CN103716623A (en) * 2013-12-17 2014-04-09 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN104219526A (en) * 2014-09-01 2014-12-17 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104349167A (en) * 2014-11-17 2015-02-11 电子科技大学 Adjustment method of video code rate distortion optimization
CN104469386A (en) * 2014-12-15 2015-03-25 西安电子科技大学 Stereoscopic video perception and coding method for just-noticeable error model based on DOF
CN104488266A (en) * 2013-06-27 2015-04-01 北京大学深圳研究生院 AVS video compressing and coding method, and coder
CN105245890A (en) * 2015-10-16 2016-01-13 北京工业大学 Efficient video encoding method based on vision attention priority
CN105704497A (en) * 2016-01-30 2016-06-22 上海大学 Fast select algorithm for coding unit size facing 3D-HEVC
CN105850123A (en) * 2013-12-19 2016-08-10 汤姆逊许可公司 Method and device for encoding a high-dynamic range image
CN106454386A (en) * 2016-10-26 2017-02-22 广东电网有限责任公司电力科学研究院 JND (Just-noticeable difference) based video encoding method and device
CN107027031A (en) * 2016-01-31 2017-08-08 西安电子科技大学 A kind of coding method and device for video image
CN107094251A (en) * 2017-03-31 2017-08-25 浙江大学 A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality
CN107197266A (en) * 2017-06-26 2017-09-22 杭州当虹科技有限公司 A kind of HDR method for video coding
CN108111852A (en) * 2018-01-12 2018-06-01 东华大学 Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding
CN108574841A (en) * 2017-03-07 2018-09-25 北京金山云网络技术有限公司 A kind of coding method and device based on adaptive quantizing parameter
CN110024382A (en) * 2017-07-19 2019-07-16 联发科技股份有限公司 The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image
CN110113606A (en) * 2019-03-12 2019-08-09 佛山市顺德区中山大学研究院 A kind of method, apparatus and equipment of removal human eye perception redundant video coding
CN113489983A (en) * 2021-06-11 2021-10-08 浙江智慧视频安防创新中心有限公司 Method and device for determining block coding parameters based on correlation comparison
CN114747214A (en) * 2019-10-21 2022-07-12 弗劳恩霍夫应用研究促进协会 Weighted PSNR quality metric for video encoded data
CN115967806A (en) * 2023-03-13 2023-04-14 阿里巴巴(中国)有限公司 Data frame coding control method and system and electronic equipment
WO2023155445A1 (en) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 Rate distortion optimization method and apparatus based on motion detection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968419A (en) * 2005-11-16 2007-05-23 三星电子株式会社 Image encoding method and apparatus and image decoding method and apparatus using characteristics of the human visual system
CN101710995A (en) * 2009-12-10 2010-05-19 武汉大学 Video coding system based on vision characteristic
CN101854555A (en) * 2010-06-18 2010-10-06 上海交通大学 Video coding system based on prediction residual self-adaptation regulation
CN102420988A (en) * 2011-12-02 2012-04-18 上海大学 Multi-view video coding system utilizing visual characteristics
CN102724525A (en) * 2012-06-01 2012-10-10 宁波大学 Depth video coding method on basis of foveal JND (just noticeable distortion) model


Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475875A (en) * 2013-06-27 2013-12-25 上海大学 Image adaptive measuring method based on compressed sensing
CN104488266A (en) * 2013-06-27 2015-04-01 北京大学深圳研究生院 AVS video compressing and coding method, and coder
CN103475875B (en) * 2013-06-27 2017-02-08 上海大学 Image adaptive measuring method based on compressed sensing
CN104488266B (en) * 2013-06-27 2018-07-06 北京大学深圳研究生院 AVS video compressing and encoding methods and encoder
CN103716623A (en) * 2013-12-17 2014-04-09 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN103716623B (en) * 2013-12-17 2017-02-15 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN105850123A (en) * 2013-12-19 2016-08-10 汤姆逊许可公司 Method and device for encoding a high-dynamic range image
CN105850123B (en) * 2013-12-19 2019-06-18 汤姆逊许可公司 The method and apparatus that high dynamic range images are encoded
US10574987B2 (en) 2013-12-19 2020-02-25 Interdigital Vc Holdings, Inc. Method and device for encoding a high-dynamic range image
CN104219526A (en) * 2014-09-01 2014-12-17 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104219526B (en) * 2014-09-01 2017-05-24 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104349167A (en) * 2014-11-17 2015-02-11 电子科技大学 Adjustment method of video code rate distortion optimization
CN104349167B (en) * 2014-11-17 2018-01-19 电子科技大学 A kind of method of adjustment of Video coding rate-distortion optimization
CN104469386B (en) * 2014-12-15 2017-07-04 西安电子科技大学 A kind of perception method for encoding stereo video of the proper appreciable error model based on DOF
CN104469386A (en) * 2014-12-15 2015-03-25 西安电子科技大学 Stereoscopic video perception and coding method for just-noticeable error model based on DOF
CN105245890B (en) * 2015-10-16 2018-01-19 北京工业大学 A kind of efficient video coding method of view-based access control model attention rate priority
CN105245890A (en) * 2015-10-16 2016-01-13 北京工业大学 Efficient video encoding method based on vision attention priority
CN105704497A (en) * 2016-01-30 2016-06-22 上海大学 Fast select algorithm for coding unit size facing 3D-HEVC
CN105704497B (en) * 2016-01-30 2018-08-17 上海大学 Coding unit size fast selection algorithm towards 3D-HEVC
CN107027031A (en) * 2016-01-31 2017-08-08 西安电子科技大学 A kind of coding method and device for video image
CN106454386B (en) * 2016-10-26 2019-07-05 广东电网有限责任公司电力科学研究院 A kind of method and apparatus of the Video coding based on JND
CN106454386A (en) * 2016-10-26 2017-02-22 广东电网有限责任公司电力科学研究院 JND (Just-noticeable difference) based video encoding method and device
CN108574841A (en) * 2017-03-07 2018-09-25 北京金山云网络技术有限公司 A kind of coding method and device based on adaptive quantizing parameter
CN107094251A (en) * 2017-03-31 2017-08-25 浙江大学 A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality
CN107197266A (en) * 2017-06-26 2017-09-22 杭州当虹科技有限公司 A kind of HDR method for video coding
CN110024382A (en) * 2017-07-19 2019-07-16 联发科技股份有限公司 The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image
US11049314B2 (en) 2017-07-19 2021-06-29 Mediatek Inc Method and apparatus for reduction of artifacts at discontinuous boundaries in coded virtual-reality images
CN110024382B (en) * 2017-07-19 2022-04-12 联发科技股份有限公司 Method and device for processing 360-degree virtual reality image
CN108111852A (en) * 2018-01-12 2018-06-01 东华大学 Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding
CN108111852B (en) * 2018-01-12 2020-05-29 东华大学 Double-measurement-parameter rate-distortion control method for quantization block compressed sensing coding
CN110113606A (en) * 2019-03-12 2019-08-09 佛山市顺德区中山大学研究院 A kind of method, apparatus and equipment of removal human eye perception redundant video coding
CN114747214A (en) * 2019-10-21 2022-07-12 弗劳恩霍夫应用研究促进协会 Weighted PSNR quality metric for video encoded data
CN113489983A (en) * 2021-06-11 2021-10-08 浙江智慧视频安防创新中心有限公司 Method and device for determining block coding parameters based on correlation comparison
WO2023155445A1 (en) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 Rate distortion optimization method and apparatus based on motion detection
CN115967806A (en) * 2023-03-13 2023-04-14 阿里巴巴(中国)有限公司 Data frame coding control method and system and electronic equipment

Also Published As

Publication number Publication date
CN103124347B (en) 2016-04-27

Similar Documents

Publication Publication Date Title
CN103124347B (en) Method for guiding multi-view video coding quantization process by visual perception characteristics
CN102420988B (en) Multi-view video coding system utilizing visual characteristics
CN101416511B (en) Quantization adjustments based on grain
CN101931815B (en) Quantization adjustment based on texture level
KR20190117651A (en) Image processing and video compression methods
CN106534862B (en) Video coding method
CN103079063B (en) A kind of method for video coding of vision attention region under low bit rate
CN100464585C (en) Video-frequency compression method
CN103051901B (en) Video data coding device and method for coding video data
CN101257630B (en) Video frequency coding method and device combining with three-dimensional filtering
CN101325711A (en) Method for controlling self-adaption code rate based on space-time shielding effect
CN111083477B (en) HEVC (high efficiency video coding) optimization algorithm based on visual saliency
CN101710993A (en) Block-based self-adaptive super-resolution video processing method and system
CN102186070A (en) Method for realizing rapid video coding by adopting hierarchical structure anticipation
CN103179394A (en) I frame rate control method based on stable area video quality
CN107211145A (en) The almost video recompression of virtually lossless
CN101188755A (en) A method for VBR code rate control in AVX decoding of real time video signals
CN112825557B (en) Self-adaptive sensing time-space domain quantization method aiming at video coding
CN108924554A (en) A kind of panorama video code Rate-distortion optimization method of spherical shape weighting structures similarity
CN102984541B (en) Video quality assessment method based on pixel domain distortion factor estimation
CN105657433A (en) Image complexity based signal source real-time coding method and system
CN102984540A (en) Video quality assessment method estimated on basis of macroblock domain distortion degree
CN107580217A (en) Coding method and its device
KR970073152A (en) An improved image encoding system having an adaptive quantization control function and a quantization control method thereof (IMPROVED IMAGE CODING SYSTEM USING ADAPTIVE QUANTIZATION TECHNIQUE AND ADAPTIVE QUANTIZATION CONTROL METHOD THEREOF)
CN104702959A (en) Intra-frame prediction method and system of video coding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160427

Termination date: 20211022