CN103124347A - Method for guiding multi-view video coding quantization process by visual perception characteristics - Google Patents

Method for guiding multi-view video coding quantization process by visual perception characteristics

Info

Publication number
CN103124347A
Authority
CN
China
Prior art keywords
frame
distinguish
viewpoint
encoding
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2012104020039A
Other languages
Chinese (zh)
Other versions
CN103124347B (en
Inventor
王永芳
商习武
刘静
宋允东
张兆杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN201210402003.9A priority Critical patent/CN103124347B/en
Publication of CN103124347A publication Critical patent/CN103124347A/en
Application granted granted Critical
Publication of CN103124347B publication Critical patent/CN103124347B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention relates to a method for guiding the coding quantization process by visual perception characteristics. The method comprises the following operating steps: (1) the luminance value of each frame of an input video sequence is read, and a just-noticeable-distortion (JND) threshold model in the frequency domain is established; (2) intra-view and inter-view prediction is performed on each frame of the input video sequence; (3) the residual data are subjected to a discrete cosine transform; (4) the quantization step size of each macroblock in the current frame is dynamically adjusted; (5) the Lagrange parameter in the rate-distortion optimization process is dynamically adjusted; and (6) the quantized data are entropy coded to form a bitstream, which is transmitted over the network. With this method, video compression efficiency is improved while the subjective quality remains essentially unchanged, making the video well suited to network transmission.

Description

Method for guiding the multi-view video coding quantization process using visual perception characteristics
Technical field
The present invention relates to the field of multi-view video coding and decoding, and in particular to a method for guiding the multi-view video coding quantization process using visual perception characteristics. It is applicable to the encoding and decoding of high-definition 3D video signals.
Background technology
As audiovisual expectations rise, viewers are no longer content with single-view two-dimensional video. The demand for depth perception keeps growing, from stereoscopic viewing at a fixed angle to free-viewpoint viewing at any angle, which has driven the development of multi-view coding technology. However, multi-view video greatly increases the amount of data, so how to improve video compression efficiency effectively has become a research hotspot. At present, video compression techniques mainly focus on removing three kinds of redundancy: spatial, temporal and statistical. Although video experts have released a new-generation video compression standard (HEVC), expected to double compression efficiency again over H.264, the perceptual redundancy rooted in the characteristics of the human visual system (HVS) itself is still not removed. As research on human visual characteristics has deepened, video researchers have proposed the just-noticeable-distortion (JND) model for removing this redundancy: the JND threshold measures the size of the perceptual redundancy, and changes below this threshold are not perceived by the human eye.
Current JND research falls into two broad classes: pixel-domain JND models and frequency-domain JND models. The JND model proposed in document [1] is a classical pixel-domain model, which studies luminance masking, texture masking and temporal masking respectively. The frequency-domain JND model proposed in document [2] additionally studies the sensitivity of the human eye to different frequency bands, so that it better matches the visual characteristics of the human eye.
The JND model proposed in document [2] is currently the most complete DCT-domain JND model. Besides the pixel-level luminance masking and texture masking characteristics, it adds the effect of the spatial contrast sensitivity function. The spatial sensitivity function reflects the band-pass characteristic of the human eye, and removes perceptual frequency redundancy by discarding frequency components the eye cannot perceive. Its temporal masking term models the smooth-pursuit eye-movement effect, covering not only the magnitude but also the direction of motion. Researchers have combined this model with multi-view video to act on the DCT (discrete cosine transform) of the residual, greatly improving compression efficiency. However, the model has not been applied to other coding stages such as quantization, so its removal of visual redundancy is incomplete.
Document [3] does propose using a JND model to guide the quantization process. However, the JND model it builds is a pixel-domain one, lacking the removal of the eye's frequency redundancy, so its guidance of quantization is not accurate enough. Moreover, to preserve subjective quality, the quantization value should be raised only where the eye is insensitive and kept unchanged elsewhere, and the Lagrange parameter should be adjusted correspondingly whenever the quantization parameter is adjusted.
The present patent application is the first to apply a DCT-domain JND model to the quantization process in multi-view video coding, further improving video compression efficiency while keeping subjective quality unchanged.
Document [1]: X. Yang, W. Lin, and Z. Lu, "Motion-compensated residue preprocessing in video coding based on just-noticeable-distortion profile," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 6, pp. 742-752, 2005.
Document [2]: Zhenyu Wei and King N. Ngan, "Spatio-temporal just noticeable distortion profile for grey scale image/video in DCT domain," IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 3, March 2009.
Document [3]: Z. Chen and C. Guillemot, "Perceptually-friendly H.264/AVC video coding based on foveated just-noticeable-distortion model," IEEE Trans. Circuits Syst. Video Technol., vol. 20, no. 6, pp. 806-819, Jun. 2010.
Summary of the invention
The object of the present invention is to overcome the defects of the prior art by providing a method that uses visual perception characteristics to guide the multi-view video coding quantization process. While keeping subjective video quality unchanged, the method uses a frequency-domain JND model to guide multi-view quantization, raising the quantization step in regions where the human eye is insensitive and thereby improving video compression efficiency. When the step size is adjusted, the Lagrange parameter of the rate-distortion optimization function is dynamically adjusted as well, further improving coding efficiency.
For achieving the above object, the present invention adopts following technical scheme:
A method for guiding the multi-view video coding quantization process using visual perception characteristics, characterized by the following operating steps:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction on each frame of the input video sequence;
(3) apply the discrete cosine transform (DCT) to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to form a bitstream and transmit it over the network.
Compared with the prior art, the method of the present invention for guiding the multi-view video coding quantization process by visual perception characteristics has the following evident substantive features and significant technical progress:
1) While keeping the reconstructed video quality unchanged, this multi-view coding method reduces the coding bit rate merely through the quantization subroutine; in tests the maximum bit-rate saving reached 12.35%.
2) Subjective quality is assessed with the mean subjective score difference: the closer the difference is to 0, the closer the subjective quality of the two methods. The mean subjective score difference of this method is 0.03, so its subjective quality is comparable to that of the multi-view video codec reference software JMVC.
3) This multi-view coding method adds no especially complex coding stage, improving video compression efficiency at low complexity.
Description of drawings
Fig. 1 is the block diagram of the method of the present invention for guiding the multi-view video coding quantization process by visual perception characteristics.
Fig. 2 is the block diagram of the frequency-domain just-noticeable-distortion model.
Fig. 3 is the block diagram of intra-view/inter-view prediction.
Fig. 4 is the block diagram of the DCT.
Fig. 5 is the block diagram of the dynamic adjustment of the quantization step.
Fig. 6 is the block diagram of the dynamic adjustment of the Lagrange parameter in the rate-distortion cost function.
Fig. 7 is the block diagram of entropy coding and output.
Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the original JMVC method.
Fig. 8b is the reconstructed image of the 15th frame of viewpoint 0 of the sequence ballroom coded with the method of the invention.
Fig. 9 compares the bit rate, PSNR and subjective quality score difference (DMOS) of the original JMVC method and the method of the invention on the sequence ballroom for different QP values and viewpoints.
Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the original JMVC method.
Fig. 10b is the reconstructed image of the 35th frame of viewpoint 1 of the sequence race1 coded with the method of the invention.
Fig. 11 compares the bit rate, PSNR and subjective quality score difference (DMOS) of the original JMVC method and the method of the invention on the sequence race1 for different QP values and viewpoints.
Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the original JMVC method.
Fig. 12b is the reconstructed image of the 45th frame of viewpoint 2 of the sequence Crowd coded with the method of the invention.
Fig. 13 compares the bit rate, PSNR and mean subjective score difference (DMOS) of the original JMVC method and the method of the invention on the sequence Crowd for different QP values and viewpoints.
Embodiment
The preferred embodiments of the present invention are described below in further detail with reference to the accompanying drawings.
Embodiment one:
This embodiment guides the multi-view video coding quantization process by visual perception characteristics and, referring to Fig. 1, comprises the following steps:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction on each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to form a bitstream and transmit it over the network.
Embodiment two: this embodiment is basically identical to embodiment one; its special features are as follows.
The frequency-domain JND model established in step (1) above comprises four sub-models, referring to Fig. 2:
(1-1) The spatial contrast sensitivity function model follows the band-pass characteristic of the human eye. For a particular spatial frequency ω_ij, the basic JND threshold of the DCT coefficient at position (i, j) is (the equation images of the original are transcribed here in text form, following the model of document [2]):

T_basic(i, j) = exp(c·ω_ij) / (φ_i · φ_j · (a + b·ω_ij))

The spatial frequency ω_ij is computed as:

ω_ij = (1/(2N)) · sqrt((i/θ_x)² + (j/θ_y)²)

where i and j denote the coordinate position within the discrete cosine transform block, N is the dimension of the block, and θ_x and θ_y denote the horizontal and vertical visual angles of a pixel. The horizontal visual angle is generally considered equal to the vertical one, and is expressed as:

θ = 2·arctan(1/(2·R_d·H))

where R_d is the ratio of viewing distance to picture height and H is the picture height in pixels.

Because human visual sensitivity is directional, higher for the horizontal and vertical directions and relatively lower for the others, a directional modulation factor is added:

F_dir(i, j) = r + (1 - r)·cos²(φ_ij)

where φ_ij = arcsin(2·ω_i0·ω_0j / ω_ij²) is the angle of the frequency vector represented by the DCT coefficient, and φ_i, φ_j are the DCT normalization factors:

φ_m = sqrt(1/N) for m = 0, and φ_m = sqrt(2/N) for m > 0.

Finally the control parameter s is added, forming the complete modulation factor of the spatial sensitivity function:

T_spatial(i, j) = s · exp(c·ω_ij) / (φ_i · φ_j · (a + b·ω_ij) · (r + (1 - r)·cos²(φ_ij)))

In the multi-view coding process both 8 x 8 and 4 x 4 DCTs occur, so the parameters differ. In the experiments, for the 8 x 8 DCT format, r is 0.6, a is 1.33, b is 0.11 and c is 0.18; for the 4 x 4 DCT format, r is 0.6, a is 0.8, b is 0.035 and c is 0.008.
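The spatial-CSF threshold described above can be sketched in code as follows. This is a minimal illustration following the model of document [2]; the value of the control parameter s (0.25 here) and the per-pixel visual angle theta are assumptions, since the source gives them only as equation images.

```python
import math

def dct_norm_factor(m, N):
    """phi_m: DCT normalization factor, sqrt(1/N) for the DC index and
    sqrt(2/N) otherwise."""
    return math.sqrt(1.0 / N) if m == 0 else math.sqrt(2.0 / N)

def spatial_freq(i, j, N, theta):
    """omega_ij in cycles per degree for DCT position (i, j); theta is the
    visual angle (degrees) subtended by one pixel (assumed value)."""
    return math.sqrt((i / theta) ** 2 + (j / theta) ** 2) / (2.0 * N)

def basic_threshold(i, j, N, theta, s=0.25, r=0.6, a=1.33, b=0.11, c=0.18):
    """Basic JND threshold with directional modulation; the defaults r, a,
    b, c are the 8x8 parameter values quoted in the text."""
    w = spatial_freq(i, j, N, theta)
    if w == 0.0:  # DC coefficient: no directional term applies
        return s / (dct_norm_factor(i, N) * dct_norm_factor(j, N) * a)
    wi = spatial_freq(i, 0, N, theta)
    wj = spatial_freq(0, j, N, theta)
    # angle of the DCT frequency vector; clamp for floating-point safety
    phi_ij = math.asin(min(1.0, 2.0 * wi * wj / (w * w)))
    dir_mod = r + (1.0 - r) * math.cos(phi_ij) ** 2
    return (s * math.exp(c * w)
            / (dct_norm_factor(i, N) * dct_norm_factor(j, N) * (a + b * w) * dir_mod))
```

For example, the threshold at the diagonal position (1, 1) comes out larger than at (1, 0), reflecting the eye's lower sensitivity to oblique frequencies.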
(1-2) The luminance masking model: according to experiment, visual sensitivity is highest in regions of intermediate grey level and lower against dark and very bright backgrounds. Fitting the luminance masking curve gives (the original expression is an equation image; the piecewise fit of document [2] is):

F_lum = (60 - I)/150 + 1, if I <= 60
F_lum = 1, if 60 < I < 170
F_lum = (I - 170)/425 + 1, if I >= 170

where I is the average luminance value of the current coding block.
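In code, the luminance masking factor is a simple piecewise function. The curve itself is an equation image in the source, so this sketch assumes the piecewise fit cited from document [2]:

```python
def luminance_factor(avg_lum):
    """Luminance-masking modulation factor for a block with average
    luminance avg_lum (0..255); mid-grey regions get no elevation,
    dark and bright regions tolerate more distortion."""
    if avg_lum <= 60:
        return (60.0 - avg_lum) / 150.0 + 1.0
    if avg_lum >= 170:
        return (avg_lum - 170.0) / 425.0 + 1.0
    return 1.0
```

A fully black block (average 0) thus gets a factor of 1.4, a mid-grey block 1.0, and a fully white block 1.2.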
(1-3) The texture masking model: according to differences in image texture, an image can be divided into three region types: edge regions, smooth regions and texture regions, to which the sensitivity of the eye decreases in that order. The Canny operator is usually used to separate the regions of the image.

The edge pixel density obtained with the Canny operator is:

ρ = Σ_edge / N_b²

where Σ_edge is the total number of edge pixels of the block, obtained by the Canny edge detector, and N_b is the block dimension.

Using the edge pixel density ρ, an image block is classified into smooth, texture and edge blocks. The classification formula (an equation image in the original; the thresholds follow document [2]) is: smooth block if ρ <= 0.1; edge block if 0.1 < ρ <= 0.2; texture block if ρ > 0.2.

For texture regions the eye is insensitive to distortion of the low-frequency part, while the high-frequency part should be suitably preserved. The resulting contrast-masking elevation factor is:

ψ = 2.25 for texture blocks with i² + j² <= 16; ψ = 1.25 for texture blocks with i² + j² > 16; ψ = 1 otherwise,

where (i, j) is the DCT coefficient index.
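The block classification and elevation factor can be sketched as follows. The cut-offs 0.1 and 0.2 and the ψ values are assumptions taken from document [2], since the source shows them only as equation images; obtaining the edge-pixel count itself would require a Canny detector, which is outside this sketch.

```python
def classify_block(edge_pixels, block_size=16):
    """Classify a block by edge-pixel density rho = edges / size^2,
    as produced by a Canny edge map (edge_pixels is that count)."""
    rho = edge_pixels / float(block_size * block_size)
    if rho <= 0.1:
        return "smooth"
    if rho <= 0.2:
        return "edge"
    return "texture"

def elevation_factor(block_type, i, j):
    """psi: contrast-masking elevation for DCT index (i, j); texture
    blocks tolerate more distortion, especially at low frequencies
    (i^2 + j^2 <= 16)."""
    if block_type == "texture":
        return 2.25 if (i * i + j * j) <= 16 else 1.25
    return 1.0
```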
Taking into account the overlapping of the spatial contrast sensitivity effect and the luminance effect, the final contrast-masking factor (transcribed from the equation image following document [2]) is:

F_contrast(n, i, j) = ψ, for edge and texture blocks with i² + j² <= 16
F_contrast(n, i, j) = ψ · min(4, max(1, (|C(n, i, j)| / (T_spatial(i, j) · F_lum(n)))^0.36)), otherwise

where n denotes the n-th frame of the input video sequence, C(n, i, j) is the DCT coefficient, T_spatial is the threshold of the spatial contrast sensitivity function, and F_lum is the luminance masking modulation factor.
(1-4) The temporal contrast sensitivity function model: the experimentally measured temporal modulation factor is (transcribed from the equation image following document [2]):

F_T = 1, if f_s < 5 cpd and f_t < 10 Hz
F_T = 1.07^(f_t - 10), if f_s < 5 cpd and f_t >= 10 Hz
F_T = 1.07^(f_t), if f_s >= 5 cpd

where f_t denotes the temporal frequency and f_s the spatial frequency. The temporal frequency f_t is generally computed as:

f_t = f_sx · v_x + f_sy · v_y

where f_sx and f_sy are the horizontal and vertical components of the spatial frequency and v_x, v_y are the components of the speed of object motion on the retina. The spatial-frequency components are computed as:

f_sx = i / (2·N·θ_x), f_sy = j / (2·N·θ_y)

where θ_x and θ_y denote the horizontal and vertical visual angles of a pixel, N is the DCT dimension, and i and j denote the coordinate position within the discrete cosine transform block.

The speed of the image on the retina, v_R, is computed as follows:

v_R = v_I - v_E, with v_E = min(g_sp · v_I + v_min, v_max)

where g_sp is the smooth-pursuit eye-movement gain, taken as 0.98 in the experiments; v_I is the speed of the object in the image plane; v_min is the minimum eye-movement speed caused by drift, with empirical value 0.15 deg/s; and v_max is the maximum eye speed corresponding to saccades, usually taken as 80 deg/s. The image-plane speed is obtained from the motion vectors:

v_I = f_r · MV · θ

where f_r is the frame rate of the video sequence, MV is the motion vector of each block, and θ is the visual angle of a pixel.
(1-5) The weighted product of the four factors constitutes the just-noticeable-distortion threshold of the current coded frame, expressed as:

T_JND(n, i, j) = T_spatial(i, j) · F_lum(n) · F_contrast(n, i, j) · F_T(n, i, j)

where T_spatial is the threshold of the spatial contrast sensitivity function, F_lum is the luminance masking modulation factor, F_contrast is the contrast-masking modulation factor, and F_T is the temporal masking modulation factor.
Step (2) above performs inter-view/intra-view prediction on the input video sequence; referring to Fig. 3, the concrete steps are as follows:

(2-1) Intra-view inter/intra prediction: the temporal redundancy of the current frame is removed by inter-frame prediction within the viewpoint, and its spatial redundancy by intra-frame prediction within the viewpoint. Between intra and inter prediction, the mode minimizing the rate-distortion optimization function is selected, the function being:

J = D + λ_adj · R

where D is the distortion, R is the number of bits to code under the given coding mode, and λ_adj is the adjusted Lagrange parameter.

(2-2) Inter-view prediction: since the method codes multiple viewpoints, the current frame is predicted from the corresponding frame of another viewpoint, removing the redundant information between viewpoints.

(2-3) The coding costs of inter-view and intra-view prediction are compared, and the best of the intra-view and inter-view prediction modes is selected: the mode minimizing the rate-distortion cost function is the optimal prediction mode. Fully accounting for the redundancy both within and between viewpoints and selecting the appropriate prediction mode further improves video compression efficiency.
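The mode decision of steps (2-1) to (2-3) amounts to minimizing J = D + λ·R over all candidate modes. A minimal sketch (the mode names and cost values below are illustrative, not JMVC API):

```python
def best_mode(candidates, lam):
    """Pick the prediction mode minimizing J = D + lam * R.
    `candidates` maps a mode name to its (distortion, bits) pair."""
    return min(candidates, key=lambda m: candidates[m][0] + lam * candidates[m][1])

# Hypothetical costs for one macroblock: intra, temporal inter, inter-view.
modes = {"intra": (100.0, 50), "inter_temporal": (40.0, 80), "inter_view": (60.0, 60)}
```

With a small λ the rate term matters less and the low-distortion temporal mode wins; with a large λ the cheap-to-code intra mode wins instead.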
Step (3) above applies the discrete cosine transform to the residual data; referring to Fig. 4, the concrete steps are as follows:

(3-1) Decision of the coding block size. The multi-view coding method has seven block sizes, 16 x 16, 16 x 8, 8 x 16, 8 x 8, 8 x 4, 4 x 8 and 4 x 4; the first four are handled with 8 x 8 transform blocks and the last three with 4 x 4 transform blocks.

(3-2) The corresponding DCT is then applied: the 8 x 8 DCT for 8 x 8 transform blocks and the 4 x 4 DCT for 4 x 4 transform blocks.
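For reference, the plain floating-point 2-D DCT-II applied to a residual block can be written as below. This is illustrative only; the actual codec uses integer approximations of the 4 x 4 and 8 x 8 transforms.

```python
import math

def dct2(block):
    """Plain 2-D DCT-II of an N x N residual block (N = 4 or 8)."""
    n = len(block)
    def alpha(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            out[u][v] = alpha(u) * alpha(v) * s
    return out
```

A constant 4 x 4 residual of value 4 transforms to a single DC coefficient of 16 with all AC coefficients zero, as expected of an orthonormal DCT.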
Step (4) above dynamically adjusts the quantization step of each macroblock in the current frame; referring to Fig. 5, the concrete steps are as follows:

(4-1) The average JND value of the current frame is obtained from the established JND model; the average JND threshold is:

T_avg = (1 / (H · W)) · Σ_x Σ_y T_JND(x, y)

where H and W denote the height and width of the image frame, T_JND denotes the just-noticeable-distortion threshold of the current frame, and (x, y) denotes the pixel coordinates.

(4-2) The JND average of the current macroblock: the average JND threshold of the M-th macroblock, T_avg(M), is computed in the same way over the pixels of that macroblock.

(4-3) Dynamic adjustment of the quantization step of the current macroblock. The just-noticeable-distortion threshold reflects the differing sensitivity of the human eye to the various parts of an image, so the quantization step of each macroblock can be adjusted dynamically according to its threshold: where the eye is insensitive, the quantization step is suitably enlarged; otherwise the quantization value is kept unchanged. The proposed quantization step adjustment is:

Qstep_new(M) = Qstep_org · f(M)

where Qstep_org is the original step of the coding framework and f(M) is a regulatory factor derived from the ratio of the macroblock JND average T_avg(M) to the frame JND average T_avg (the exact expression is given as an equation image in the original).
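The per-macroblock step adjustment can be sketched as below. The exact regulatory factor f(M) is an equation image in the source; the clipped JND ratio used here (with an assumed cap f_max = 1.5) is only a stand-in that matches the stated behaviour: enlarge the step where the block's JND average exceeds the frame average, and leave it unchanged elsewhere.

```python
def adjust_qstep(qstep_org, jnd_mb, jnd_frame, f_max=1.5):
    """Scale the macroblock quantization step up where the block's average
    JND exceeds the frame average (the eye is less sensitive there);
    otherwise keep the original step. f_max caps the enlargement."""
    if jnd_mb <= jnd_frame:
        return qstep_org
    return qstep_org * min(jnd_mb / jnd_frame, f_max)
```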
Step (5) above dynamically adjusts the Lagrange parameter in the rate-distortion optimization process; referring to Fig. 6, the concrete operating steps are as follows:

(5-1) The JND average of the current frame and that of the current coded macroblock are computed and compared, providing the basis for the subsequent weighting of the Lagrange parameter.

(5-2) The Lagrange parameter is adjusted. Since the quantization parameter has been modified in the preceding step, the distortion and the rate in the Lagrangian rate-distortion optimization change, and keeping the original Lagrange parameter value no longer guarantees an optimal solution. Weighting the Lagrange parameter correspondingly restores the optimality of the cost function. The adjusted parameter λ_adj(M) is obtained by weighting the original parameter with the quantization parameter generated by the multi-view coding method and the adjusted quantization parameter value of the M-th macroblock (the weighting expression is given as an equation image in the original).

(5-3) The adjusted Lagrange parameter is substituted into the rate-distortion optimization cost function:

J = D + λ_adj · R

where D is the distortion, R is the number of bits to code under the given coding mode, and λ_adj is the adjusted Lagrange parameter. In this way, when the quantization parameter changes, the Lagrange parameter changes correspondingly, so that the rate-distortion optimization function still attains the optimal solution.
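A sketch of the reweighting in step (5-2). The patent's own weighting expression is an equation image; here we assume the usual H.264-style proportionality of λ to the square of the quantization step, so λ is scaled by the squared ratio of the adjusted step to the original step.

```python
def adjust_lambda(lam_org, qstep_org, qstep_adj):
    """Reweight the Lagrange multiplier after a quantizer change, assuming
    lambda ~ Qstep^2 so that J = D + lambda * R stays balanced."""
    return lam_org * (qstep_adj / qstep_org) ** 2
```

Enlarging the step by 1.5x thus raises λ by 2.25x; an unchanged step leaves λ unchanged.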
Step (6) above entropy-codes the quantized data and transmits the generated bitstream over the network; referring to Fig. 7, the concrete steps are as follows:

(6-1) The quantized data are entropy coded, so that they are represented most efficiently by a binary bitstream, removing the statistical redundancy of the quantized data.

(6-2) The bitstream formed by entropy coding is transmitted over the network, realizing the transmission of the video. Because the coding method processed by visual perception characteristics occupies little bandwidth, it adapts better to network transmission.
A large number of simulation experiments were carried out below to assess the performance of the proposed multi-view video coding method exploiting visual characteristics. The first 48 frames of the multi-view sequences ballroom, race1 and crowd were encoded and decoded on a PC with an Intel Pentium 4 CPU at 3.00 GHz, 512 MB of memory, the Intel 8254G Express chipset family and the Windows XP operating system. The basic QP was set to 20, 24, 28 and 32; the experimental platform was the multi-view video codec reference software JMVC, the coding prediction structure was HHI-IBBBP, and inter-view prediction used the bi-directional prediction mode.
The experimental results for the sequence ballroom are shown in Fig. 8a, Fig. 8b and Fig. 9. Fig. 8a is the reconstructed image of the 15th frame of viewpoint 0 coded with the original JMVC method at QP = 24, with PSNR = 40.31 dB; Fig. 8b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 40.10 dB. Fig. 9 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, subjective quality score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 7.47% to 9.16% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.03 to 0.07, so the subjective quality can be considered unchanged.
The experimental results for the sequence race1 are shown in Fig. 10a, Fig. 10b and Fig. 11. Fig. 10a is the reconstructed image of the 35th frame of viewpoint 1 coded with the original JMVC method at QP = 24, with PSNR = 41.15 dB; Fig. 10b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 40.51 dB. Fig. 11 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, subjective quality score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 10.77% to 12.35% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.06 to 0.09, so the subjective quality can be considered unchanged.
The experimental results for the sequence crowd are shown in Fig. 12a, Fig. 12b and Fig. 13. Fig. 12a is the reconstructed image of the 45th frame of viewpoint 2 coded with the original JMVC method at QP = 35, with PSNR = 33.77 dB; Fig. 12b is the reconstructed image of the same frame coded with the method of the invention, with PSNR = 33.12 dB. Fig. 13 gives, for the original JMVC method and the method of the invention under different QP values and viewpoints, the statistics of bit rate, PSNR, bit-rate saving percentage, mean subjective score difference (DMOS) and average bit-rate saving. It can be seen that, over the different QP values, the method of the invention saves 8.95% to 9.83% of the coding bit rate relative to the original JMVC method, while the subjective quality score difference between the two methods is 0.03 to 0.08, so the subjective quality can be considered unchanged.
As the charts above show, by establishing a DCT-domain JND model and applying it to the quantization and rate-distortion optimization processes of the multi-view video coding framework, the present invention substantially reduces the multi-view coding bit rate while keeping subjective quality unchanged, improving the compression efficiency of multi-view video coding.

Claims (7)

1. A method of guiding the multi-view video coding quantization process by visual perception characteristics, characterized in that the operating steps are as follows:
(1) read the luminance value of each frame of the input video sequence and establish a frequency-domain just-noticeable-distortion threshold model;
(2) perform intra-view and inter-view prediction for each frame of the input video sequence;
(3) apply the discrete cosine transform to the residual data;
(4) dynamically adjust the quantization step of each macroblock in the current frame;
(5) dynamically adjust the Lagrange parameter in the rate-distortion optimization process;
(6) entropy-code the quantized data to generate a code stream and transmit the code stream over the network.
2. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that described step (1) reads the brightness value size of each frame of input video sequence, the operating procedure that just can distinguish the distortion threshold model of setting up frequency domain is as follows:
1. obtain respectively the spatial sensitivity factor of 4x4 and 8x8DCT conversion according to the dimension of dct transform
Figure DEST_PATH_IMAGE002
, its formula is:
Figure DEST_PATH_IMAGE004
Wherein s is the control parameter,
Figure DEST_PATH_IMAGE006
Be the angle of the frequency of DCT coefficient vector representative,
Figure DEST_PATH_IMAGE008
Figure DEST_PATH_IMAGE010
Be DCT coefficient normalization factor,
Figure DEST_PATH_IMAGE012
Be spatial frequency, parameter r, a, b and c are according to varying in size of dct transform and difference: for the DCT coded format of 8 * 8 sizes, Be 0.6,
Figure DEST_PATH_IMAGE016
Be 1.33,
Figure DEST_PATH_IMAGE018
Be 0.11, Be 0.18; For the DCT coded format of 4 * 4 sizes,
Figure 615976DEST_PATH_IMAGE014
Be 0.6,
Figure 131271DEST_PATH_IMAGE016
Be 0.8,
Figure 178861DEST_PATH_IMAGE018
Be 0.035, Be 0.008;
2. record human eye under the Different background illumination condition according to experiment, the brightness shielding effect Curve is expressed as follows:
Figure DEST_PATH_IMAGE024
Wherein, Average pixel value for the present encoding piece;
3. utilize edge detector to detect the texture features of present encoding piece, obtain texture and cover the factor
Figure DEST_PATH_IMAGE028
, its expression formula is as follows:
Figure DEST_PATH_IMAGE030
Wherein,
Figure DEST_PATH_IMAGE032
The transverse and longitudinal coordinate coefficient of expression transform block,
Figure DEST_PATH_IMAGE034
The estimation factor is covered in the expression contrast, Be the spatial sensitivity factor,
Figure DEST_PATH_IMAGE036
Dct transform coefficient for n encoding block of present frame;
4. the speed of object of which movement in frame every according to video sequence, test and record the time domain shielding effect factor
Figure DEST_PATH_IMAGE038
Expression formula is:
Wherein,
Figure DEST_PATH_IMAGE042
Be spatial frequency,
Figure DEST_PATH_IMAGE044
Be temporal frequency;
5. described step 1. ~ weighted product of four kinds of factors of 4. trying to achieve namely consist of current encoded frame just can distinguish distortion threshold.
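The parameter values a, b, c and r in claim 2 match the well-known Wei-Ngan DCT-domain JND model, so steps ①, ②, ④ and ⑤ can be sketched along its lines. Every constant below beyond a, b, c and r (the control parameter s, the index-to-cycles/degree conversion, the knees of the luminance and temporal curves) is an assumption, since the patent gives its formulas only as images.

```python
import math

def base_threshold(i, j, n=8, s=0.25, r=0.6, a=1.33, b=0.11, c=0.18,
                   cpd=0.35):
    """Step 1: spatial sensitivity factor for DCT position (i, j) of an
    n x n block; cpd is an assumed index-to-cycles/degree conversion."""
    if i == 0 and j == 0:
        i = j = 1  # treat DC like the lowest AC band (assumption)
    w_i, w_j = i * cpd, j * cpd
    w = math.hypot(w_i, w_j)                               # spatial frequency
    theta = math.asin(min(1.0, 2.0 * w_i * w_j / w ** 2))  # orientation angle
    phi_i = math.sqrt((1.0 if i == 0 else 2.0) / n)        # DCT normalization
    phi_j = math.sqrt((1.0 if j == 0 else 2.0) / n)
    return (s / (phi_i * phi_j)) * math.exp(c * w) / (a + b * w) \
        / (r + (1.0 - r) * math.cos(theta) ** 2)

def luminance_adaptation(mu):
    """Step 2: brightness masking factor for mean block intensity mu
    (standard piecewise curve; the patent's own curve is an image)."""
    if mu <= 60:
        return (60.0 - mu) / 150.0 + 1.0
    if mu >= 170:
        return (mu - 170.0) / 425.0 + 1.0
    return 1.0

def temporal_factor(w_s, w_t):
    """Step 4: temporal masking from spatial frequency w_s (cycles/degree)
    and temporal frequency w_t (Hz)."""
    if w_s < 5.0 and w_t < 10.0:
        return 1.0
    if w_s < 5.0:
        return 1.07 ** (w_t - 10.0)
    return 1.07 ** w_t

def jnd_threshold(i, j, mu, w_s, w_t, contrast_masking=1.0):
    """Step 5: the JND threshold is the product of the four factors; the
    texture masking factor of step 3 is passed in precomputed."""
    return (base_threshold(i, j) * luminance_adaptation(mu)
            * contrast_masking * temporal_factor(w_s, w_t))
```

Mid-grey, slowly moving, smooth content yields the smallest thresholds (least masking); very dark or very bright, fast-moving, textured content raises the threshold and therefore tolerates a coarser quantizer.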
3. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1, it is characterized in that each frame of described step (2) input video sequence through in viewpoint and the operating procedure of the prediction between viewpoint as follows:
1. carry out interframe and infra-frame prediction in viewpoint, predicted value and the current frame that will encode are compared, choose the less a kind of coded system of coding cost;
2. carry out the prediction between viewpoint, the current encoded frame of current view point predicts according to the corresponding frame of reference view, and the corresponding frame of predicted value and reference view is compared, and tries to achieve the coding cost of interview prediction;
3. compare between viewpoint and the coding cost in viewpoint, select the sort of predictive mode than the lower Item cost.
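The mode decision of claim 3 reduces to taking the minimum coding cost over the candidate predictions. A minimal sketch with hypothetical cost values follows; in JMVC each cost would be a Lagrangian rate-distortion cost rather than a bare number.

```python
def select_prediction_mode(cost_intra, cost_inter_temporal, cost_inter_view):
    """Pick the prediction mode with the smallest coding cost: intra-frame
    and inter-frame prediction within the viewpoint (step 1) versus
    prediction from the reference viewpoint (steps 2-3)."""
    candidates = {
        "intra-frame": cost_intra,
        "inter-frame (temporal)": cost_inter_temporal,
        "inter-view": cost_inter_view,
    }
    return min(candidates, key=candidates.get)

# Hypothetical costs: temporal prediction wins here.
print(select_prediction_mode(120.0, 85.0, 90.0))  # -> inter-frame (temporal)
```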
4. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure that described step (3) carries out discrete cosine transform to residual error data is as follows:
1. the judgement of coded block size less than 8, classifies as the 4x4 transform block when arbitrary length of side of encoding block, otherwise, be the 8x8 transform block;
2. when being the 4x4 transform block, select the 4x4 dct transform, when being the 8x8 transform block, select the 8x8DCT conversion.
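The size decision of claim 4 plus a reference floating-point 2-D DCT can be sketched as below. The orthonormal DCT-II here is only a stand-in: JMVC's actual integer transforms are not reproduced.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)
    c = np.sqrt(2.0 / n) * np.cos(
        np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)  # DC row uses the smaller scale factor
    return c

def transform_residual(block):
    """Claim 4: any side shorter than 8 -> 4x4 DCT, otherwise 8x8 DCT.
    The residual block is assumed to tile evenly into the chosen size."""
    size = 4 if min(block.shape) < 8 else 8
    c = dct_matrix(size)
    out = np.empty_like(block, dtype=float)
    for y in range(0, block.shape[0], size):
        for x in range(0, block.shape[1], size):
            tile = block[y:y + size, x:x + size]
            out[y:y + size, x:x + size] = c @ tile @ c.T  # 2-D separable DCT
    return size, out
```

A constant residual block produces a single DC coefficient and zero AC energy, which is what makes flat regions cheap to code.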
5. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the quantization step of each macro block in described step (4) dynamic adjustments present frame is as follows:
That 1. calculates present frame just can distinguish the mean value of distortion threshold;
That 2. calculates current coding macro block just can distinguish distortion threshold mean value;
3. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold, the quantization step of dynamic adjustments current macro, its expression formula of the quantization step after adjusting is as follows:
Figure DEST_PATH_IMAGE046
Wherein, The original quantization step of presentation code framework,
Figure DEST_PATH_IMAGE050
What represent current macro just can distinguish the average of distortion threshold,
Figure DEST_PATH_IMAGE052
What represent present frame just can distinguish the average of distortion threshold, Be regulatory factor.
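Claim 5's adjustment formula is given only as an image in the source; the sketch below uses one plausible form consistent with the quantities listed — the ratio of macroblock to frame JND mean raised to a regulatory factor δ, which is an assumption, not the patent's exact expression. The intent is coarser quantization where the macroblock tolerates more distortion than the frame average.

```python
def adjust_quant_step(q_orig, jnd_mb_mean, jnd_frame_mean, delta=0.5):
    """Dynamically adjust the macroblock quantization step from the ratio
    of the macroblock's mean JND threshold to the frame's mean JND
    threshold (assumed power-law form with regulatory factor delta)."""
    return q_orig * (jnd_mb_mean / jnd_frame_mean) ** delta
```

Macroblocks whose JND mean exceeds the frame mean (strong masking) get a larger step and spend fewer bits; macroblocks below the frame mean get a finer step, which is how perceptual quality is preserved while the bit rate drops.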
6. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that the operating procedure of the LaGrange parameter in described step (5) dynamic adjustments rate-distortion optimization process is as follows:
1. more every frame just can distinguish distortion threshold average and current macro just can distinguish the average of distortion threshold;
2. adjust LaGrange parameter, its expression formula of the LaGrange parameter after adjustment is:
Figure DEST_PATH_IMAGE056
Wherein Be regulatory factor,
Figure DEST_PATH_IMAGE058
Be the quantization step after adjusting,
Figure DEST_PATH_IMAGE060
The original quantization step of presentation code framework,
Figure 708924DEST_PATH_IMAGE050
What represent current macro just can distinguish the average of distortion threshold, What represent present frame just can distinguish the average of distortion threshold;
3. the encode optimization of cost function, the dynamic adjustments LaGrange parameter makes the rate-distortion optimization function in the situation that quantization step changes, and regains optimal solution; Its expression formula is:
Figure DEST_PATH_IMAGE062
Wherein
Figure DEST_PATH_IMAGE064
Be distorted signal,
Figure DEST_PATH_IMAGE066
Be the bit number of encoding under the different coding pattern,
Figure DEST_PATH_IMAGE068
It is the LaGrange parameter after adjusting.
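In H.264-family coders the Lagrange multiplier is conventionally proportional to the square of the quantization step, so one plausible sketch of claim 6 is the quadratic re-scaling below, followed by a minimum-cost mode decision over J = D + λ'·R. The quadratic relation is an assumption; the patent's own expression is given only as an image.

```python
def adjust_lambda(lam_orig, q_orig, q_adjusted):
    """Re-scale the Lagrange parameter after the JND-driven step change,
    assuming the usual lambda ~ Qstep**2 proportionality."""
    return lam_orig * (q_adjusted / q_orig) ** 2

def rd_cost(distortion, bits, lam):
    """Rate-distortion cost J = D + lambda * R (claim 6, step 3)."""
    return distortion + lam * bits

def best_mode(candidates, lam):
    """candidates: iterable of (mode_name, distortion, bits); return the
    entry minimizing the rate-distortion cost."""
    return min(candidates, key=lambda m: rd_cost(m[1], m[2], lam))
```

A larger λ' (coarser quantizer) penalizes rate more heavily and pushes the decision toward cheap modes such as skip; a smaller λ' favors low-distortion modes.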
7. the method for utilizing vision perception characteristic to instruct the multi-vision-point encoding quantizing process according to claim 1 is characterized in that described step (6) carries out the entropy coding to the data that quantize, and generated code stream is as follows by the operating procedure of Internet Transmission:
1. the data after quantizing are carried out the entropy coding, make the data formation binary code stream after quantification;
2. encoding code stream passes through Internet Transmission.
CN201210402003.9A 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics Expired - Fee Related CN103124347B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210402003.9A CN103124347B (en) 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210402003.9A CN103124347B (en) 2012-10-22 2012-10-22 Method for guiding multi-view video coding quantization process by visual perception characteristics

Publications (2)

Publication Number Publication Date
CN103124347A true CN103124347A (en) 2013-05-29
CN103124347B CN103124347B (en) 2016-04-27

Family

ID=48455183

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210402003.9A Expired - Fee Related CN103124347B (en) 2012-10-22 2012-10-22 Vision perception characteristic is utilized to instruct the method for multiple view video coding quantizing process

Country Status (1)

Country Link
CN (1) CN103124347B (en)

Cited By (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475875A (en) * 2013-06-27 2013-12-25 上海大学 Image adaptive measuring method based on compressed sensing
CN103716623A (en) * 2013-12-17 2014-04-09 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN104219526A (en) * 2014-09-01 2014-12-17 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104349167A (en) * 2014-11-17 2015-02-11 电子科技大学 Adjustment method of video code rate distortion optimization
CN104469386A (en) * 2014-12-15 2015-03-25 西安电子科技大学 Stereoscopic video perception and coding method for just-noticeable error model based on DOF
CN104488266A (en) * 2013-06-27 2015-04-01 北京大学深圳研究生院 AVS video compressing and coding method, and coder
CN105245890A (en) * 2015-10-16 2016-01-13 北京工业大学 Efficient video encoding method based on vision attention priority
CN105704497A (en) * 2016-01-30 2016-06-22 上海大学 Fast select algorithm for coding unit size facing 3D-HEVC
CN105850123A (en) * 2013-12-19 2016-08-10 汤姆逊许可公司 Method and device for encoding a high-dynamic range image
CN106454386A (en) * 2016-10-26 2017-02-22 广东电网有限责任公司电力科学研究院 JND (Just-noticeable difference) based video encoding method and device
CN107027031A (en) * 2016-01-31 2017-08-08 西安电子科技大学 A kind of coding method and device for video image
CN107094251A (en) * 2017-03-31 2017-08-25 浙江大学 A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality
CN107197266A (en) * 2017-06-26 2017-09-22 杭州当虹科技有限公司 A kind of HDR method for video coding
CN108111852A (en) * 2018-01-12 2018-06-01 东华大学 Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding
CN108574841A (en) * 2017-03-07 2018-09-25 北京金山云网络技术有限公司 A kind of coding method and device based on adaptive quantizing parameter
CN110024382A (en) * 2017-07-19 2019-07-16 联发科技股份有限公司 The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image
CN110113606A (en) * 2019-03-12 2019-08-09 佛山市顺德区中山大学研究院 A kind of method, apparatus and equipment of removal human eye perception redundant video coding
CN113489983A (en) * 2021-06-11 2021-10-08 浙江智慧视频安防创新中心有限公司 Method and device for determining block coding parameters based on correlation comparison
CN114747214A (en) * 2019-10-21 2022-07-12 弗劳恩霍夫应用研究促进协会 Weighted PSNR quality metric for video encoded data
CN115967806A (en) * 2023-03-13 2023-04-14 阿里巴巴(中国)有限公司 Data frame coding control method and system and electronic equipment
WO2023155445A1 (en) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 Rate distortion optimization method and apparatus based on motion detection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968419A (en) * 2005-11-16 2007-05-23 三星电子株式会社 Image encoding method and apparatus and image decoding method and apparatus using characteristics of the human visual system
CN101710995A (en) * 2009-12-10 2010-05-19 武汉大学 Video coding system based on vision characteristic
CN101854555A (en) * 2010-06-18 2010-10-06 上海交通大学 Video coding system based on prediction residual self-adaptation regulation
CN102420988A (en) * 2011-12-02 2012-04-18 上海大学 Multi-view video coding system utilizing visual characteristics
CN102724525A (en) * 2012-06-01 2012-10-10 宁波大学 Depth video coding method on basis of foveal JND (just noticeable distortion) model


Cited By (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475875A (en) * 2013-06-27 2013-12-25 上海大学 Image adaptive measuring method based on compressed sensing
CN104488266A (en) * 2013-06-27 2015-04-01 北京大学深圳研究生院 AVS video compressing and coding method, and coder
CN103475875B (en) * 2013-06-27 2017-02-08 上海大学 Image adaptive measuring method based on compressed sensing
CN104488266B (en) * 2013-06-27 2018-07-06 北京大学深圳研究生院 AVS video compressing and encoding methods and encoder
CN103716623A (en) * 2013-12-17 2014-04-09 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN103716623B (en) * 2013-12-17 2017-02-15 北京大学深圳研究生院 Video compression encoding-and-decoding method and encoder-decoder on the basis of weighting quantification
CN105850123A (en) * 2013-12-19 2016-08-10 汤姆逊许可公司 Method and device for encoding a high-dynamic range image
CN105850123B (en) * 2013-12-19 2019-06-18 汤姆逊许可公司 The method and apparatus that high dynamic range images are encoded
US10574987B2 (en) 2013-12-19 2020-02-25 Interdigital Vc Holdings, Inc. Method and device for encoding a high-dynamic range image
CN104219526A (en) * 2014-09-01 2014-12-17 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104219526B (en) * 2014-09-01 2017-05-24 国家广播电影电视总局广播科学研究院 HEVC rate distortion optimization algorithm based on just-noticeable perception quality judging criterion
CN104349167A (en) * 2014-11-17 2015-02-11 电子科技大学 Adjustment method of video code rate distortion optimization
CN104349167B (en) * 2014-11-17 2018-01-19 电子科技大学 A kind of method of adjustment of Video coding rate-distortion optimization
CN104469386B (en) * 2014-12-15 2017-07-04 西安电子科技大学 A kind of perception method for encoding stereo video of the proper appreciable error model based on DOF
CN104469386A (en) * 2014-12-15 2015-03-25 西安电子科技大学 Stereoscopic video perception and coding method for just-noticeable error model based on DOF
CN105245890B (en) * 2015-10-16 2018-01-19 北京工业大学 A kind of efficient video coding method of view-based access control model attention rate priority
CN105245890A (en) * 2015-10-16 2016-01-13 北京工业大学 Efficient video encoding method based on vision attention priority
CN105704497A (en) * 2016-01-30 2016-06-22 上海大学 Fast select algorithm for coding unit size facing 3D-HEVC
CN105704497B (en) * 2016-01-30 2018-08-17 上海大学 Coding unit size fast selection algorithm towards 3D-HEVC
CN107027031A (en) * 2016-01-31 2017-08-08 西安电子科技大学 A kind of coding method and device for video image
CN106454386B (en) * 2016-10-26 2019-07-05 广东电网有限责任公司电力科学研究院 A kind of method and apparatus of the Video coding based on JND
CN106454386A (en) * 2016-10-26 2017-02-22 广东电网有限责任公司电力科学研究院 JND (Just-noticeable difference) based video encoding method and device
CN108574841A (en) * 2017-03-07 2018-09-25 北京金山云网络技术有限公司 A kind of coding method and device based on adaptive quantizing parameter
CN107094251A (en) * 2017-03-31 2017-08-25 浙江大学 A kind of video, image coding/decoding method and device adjusted based on locus adaptive quality
CN107197266A (en) * 2017-06-26 2017-09-22 杭州当虹科技有限公司 A kind of HDR method for video coding
CN110024382A (en) * 2017-07-19 2019-07-16 联发科技股份有限公司 The method and apparatus for reducing the pseudomorphism at the noncoherent boundary in encoded virtual reality image
US11049314B2 (en) 2017-07-19 2021-06-29 Mediatek Inc Method and apparatus for reduction of artifacts at discontinuous boundaries in coded virtual-reality images
CN110024382B (en) * 2017-07-19 2022-04-12 联发科技股份有限公司 Method and device for processing 360-degree virtual reality image
CN108111852A (en) * 2018-01-12 2018-06-01 东华大学 Towards the double measurement parameter rate distortion control methods for quantifying splits' positions perceptual coding
CN108111852B (en) * 2018-01-12 2020-05-29 东华大学 Double-measurement-parameter rate-distortion control method for quantization block compressed sensing coding
CN110113606A (en) * 2019-03-12 2019-08-09 佛山市顺德区中山大学研究院 A kind of method, apparatus and equipment of removal human eye perception redundant video coding
CN114747214A (en) * 2019-10-21 2022-07-12 弗劳恩霍夫应用研究促进协会 Weighted PSNR quality metric for video encoded data
CN113489983A (en) * 2021-06-11 2021-10-08 浙江智慧视频安防创新中心有限公司 Method and device for determining block coding parameters based on correlation comparison
WO2023155445A1 (en) * 2022-02-21 2023-08-24 翱捷科技股份有限公司 Rate distortion optimization method and apparatus based on motion detection
CN115967806A (en) * 2023-03-13 2023-04-14 阿里巴巴(中国)有限公司 Data frame coding control method and system and electronic equipment

Also Published As

Publication number Publication date
CN103124347B (en) 2016-04-27

Similar Documents

Publication Publication Date Title
CN103124347B (en) Method for guiding multi-view video coding quantization process by visual perception characteristics
CN102420988B (en) Multi-view video coding system utilizing visual characteristics
CN101416511B (en) Quantization adjustments based on grain
CN101931815B (en) Quantization adjustment based on texture level
KR20190117651A (en) Image processing and video compression methods
CN106534862B (en) Video coding method
CN103079063B (en) A kind of method for video coding of vision attention region under low bit rate
CN100464585C (en) Video-frequency compression method
CN103051901B (en) Video data coding device and method for coding video data
CN101257630B (en) Video frequency coding method and device combining with three-dimensional filtering
CN101325711A (en) Method for controlling self-adaption code rate based on space-time shielding effect
CN111083477B (en) HEVC (high efficiency video coding) optimization algorithm based on visual saliency
CN101710993A (en) Block-based self-adaptive super-resolution video processing method and system
CN102186070A (en) Method for realizing rapid video coding by adopting hierarchical structure anticipation
CN103179394A (en) I frame rate control method based on stable area video quality
CN107211145A (en) The almost video recompression of virtually lossless
CN101188755A (en) A method for VBR code rate control in AVX decoding of real time video signals
CN112825557B (en) Self-adaptive sensing time-space domain quantization method aiming at video coding
CN108924554A (en) A kind of panorama video code Rate-distortion optimization method of spherical shape weighting structures similarity
CN102984541B (en) Video quality assessment method based on pixel domain distortion factor estimation
CN105657433A (en) Image complexity based signal source real-time coding method and system
CN102984540A (en) Video quality assessment method estimated on basis of macroblock domain distortion degree
CN107580217A (en) Coding method and its device
KR970073152A (en) An improved image encoding system having an adaptive quantization control function and a quantization control method thereof (IMPROVED IMAGE CODING SYSTEM USING ADAPTIVE QUANTIZATION TECHNIQUE AND ADAPTIVE QUANTIZATION CONTROL METHOD THEREOF)
CN104702959A (en) Intra-frame prediction method and system of video coding

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160427

Termination date: 20211022