CN1713729A

CN1713729A - Video compression method

Info

Publication number: CN1713729A
Application number: CN 200410060102
Authority: CN
Inventors: 熊联欢
Original assignee: Huawei Technologies Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2004-06-24
Filing date: 2004-06-24
Publication date: 2005-12-28
Anticipated expiration: 2024-06-24
Also published as: CN100366091C

Abstract

At the transmitting end, input video data undergoes motion estimation, motion compensation, DCT transformation, quantization and entropy coding, and is then sent to the receiving end, where the reverse process is carried out to recover the video data. The method further includes the following steps: the image is divided into a foreground image and a background image, which are identified as foreground macroblocks and background macroblocks; and the quantization parameter (QP) values of the current foreground macroblocks and background macroblocks are adjusted.

Description

A video compression method

Technical field

The present invention relates to the field of video communication, and in particular to a video compression method for video conferencing systems and videophone systems.

Technical background

The existing video compression standards MPEG-1, MPEG-2, MPEG-4, H.261, H.263 and H.264 are all based on the DCT transform; H.264 and MPEG-4 AVC use an integer transform whose performance approaches that of the DCT. In all of them the basic processing unit is a macroblock of 16 × 16 pixels, and the general processing framework is shown in Fig. 1.

As the figure shows, the prior art comprises the following steps:

The transmitting end inputs a video image and divides it into 16 × 16 macroblocks;

The transmitting end performs motion estimation and motion compensation on the input video image data;

The resulting data are DCT-transformed;

The transformed data are quantized;

The quantized data are entropy-coded, and the coded data are sent to the receiving end.

The receiving end applies the inverse process to the received data to decode the video image data, and the restored video image data can then be used.

In the above process, apart from MPEG-4, which includes an object-based coding method, all the other standards treat the image as a whole and do not distinguish the specific content inside the image. Clearly, image content therefore cannot be accurately differentiated.

Summary of the invention

In applications such as video conferencing and video telephony, viewers generally pay more attention to skin-color regions (particularly faces) than to other content in the image. Therefore, when a video compression standard is used to compress video images, particularly at low bandwidth, a more satisfactory video communication effect can be obtained if the loss of skin-color image data is kept as small as possible and good fidelity is preserved. Based on video compression standards such as H.263, the present invention proposes a method of compressing and coding video image data based on skin-color detection and motion detection, so as to improve picture quality.

To this end, the present invention adopts the following technical solution:

A video compression method, in which the transmitting end processes input video data through the steps of motion estimation, motion compensation, DCT transformation, quantization and entropy coding and sends the video data to the receiving end, and the receiving end carries out the processing steps opposite to those of the transmitting end to restore the video image data;

characterized in that the method further comprises, at the transmitting end:

a detection step: dividing the image into a foreground image and a background image, and marking them as foreground macroblocks and background macroblocks;

a QP adjustment step: adjusting the quantization parameter (QP) values of the foreground macroblocks and background macroblocks.

The detection step comprises a skin-color detection step and a motion detection step.

The skin-color detection step further comprises:

a preliminary skin-color detection step: judging skin-color content at the macroblock level;

a background image filling step: filling the background image after the preliminary detection;

a foreground image filling step: filling the foreground image after the preliminary detection.

The background image filling step further comprises:

identifying isolated macroblocks;

counting the number of target macroblocks falling within the background image template, judging the background image according to a set threshold, and filling the background image.

The foreground image filling step further comprises:

counting the number of target macroblocks falling within the foreground image template;

judging the foreground image according to a set threshold, and filling the foreground image.

The method further comprises a boundary filling step.

The motion detection step comprises:

judging the motion characteristic of the current macroblock from the motion vector and from the difference, computed during motion estimation, between the current macroblock and the co-located macroblock in the reconstructed previous frame, and counting, according to set thresholds, the total numbers of large-motion, medium-motion and small-motion macroblocks in the current frame.

The macroblock quantization-level adjustment step further comprises:

marking the macroblocks of the whole frame as skin-color macroblocks, motion macroblocks or static macroblocks, and marking the motion macroblocks as motion macroblock types of different levels according to set motion vector thresholds.

The motion macroblock types of different levels are large-motion, medium-motion and small-motion macroblocks, and the processing further comprises:

large foreground motion: the foreground share of the bit rate is reduced slightly relative to the calculated result, and the background share is increased correspondingly;

medium foreground motion: the foreground share of the bit rate takes the calculated value;

small foreground motion: the foreground share of the bit rate is increased slightly relative to the calculated result, and the background share is reduced correspondingly.

The method further comprises one or a combination of the following judgment steps:

if the sum of the number of skin-color MBs and the number of motion MBs exceeds a first threshold, the QP values of skin-color MBs are not adjusted;

if the number of skin-color MBs exceeds a second threshold, the QP values of the MBs before the first skin-color MB are increased by 1;

if the number of motion MBs exceeds a third threshold, the QP values of motion MBs are not adjusted;

if the number of large-motion MBs exceeds a fourth threshold, the QP values of the static MBs before the first large-motion MB are decreased by 1;

if the number of small-motion MBs exceeds a fifth threshold, the QP values of the static MBs before the first small-motion MB are increased by 1;

if the number of static MBs exceeds a sixth threshold, the QP values of all MBs before the first motion MB are decreased by 1.

When implemented in standards such as H.261, H.263, MPEG-1, MPEG-2 and MPEG-4, the present invention does not rely on object shape information. By adding preprocessing such as skin-color detection and motion detection, it treats the foreground and background images separately and adjusts their QP values. For video communication applications, the good subjective picture quality of content-based coding can thus be obtained while the overall coding efficiency and overall image quality are maintained, and the added computational complexity is small.

Description of drawings

Fig. 1 is a general processing block diagram of DCT-based video compression;

Fig. 2 is a flow chart of the present invention;

Fig. 3 is a schematic diagram of the boundary filling of the present invention.

Embodiment

The specific embodiments of the present invention are described below with reference to the accompanying drawings.

For video communication applications such as video conferencing and video telephony, the video image can be divided into foreground and background. This helps to improve subjective visual quality, matches the attention mechanism of the human eye, and allows the subjective quality of the image to be maintained even at low bandwidth. The foreground region mainly comprises targets of interest (including skin color), moving objects, and objects close to the camera. The background region mainly comprises targets of no interest, static objects, and objects far from the camera. In such applications, viewers generally pay more attention to the foreground image (particularly face images) than to other content, and in particular to the image of the speaker and to some actions of the participants. Therefore, when a video compression standard is used to compress video images, particularly at low bandwidth, a more satisfactory video communication effect can be obtained if the loss of skin-color image data is kept as small as possible with good fidelity, and the definition of the foreground image is improved as far as possible.

Through skin-color detection and motion detection, the present invention segments the image into a foreground image and a background image and marks foreground macroblocks and background macroblocks respectively. The macroblock quantization parameter QP (Quantization Parameter, the factor by which image pixel values or prediction differences are scaled down during quantization) calculated by rate control is then adjusted separately for foreground and background, so that foreground image quality is improved while good overall image quality is maintained, yielding better subjective picture quality. The processing flow is shown in Fig. 2.
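The effect of lowering or raising QP can be illustrated with a deliberately simplified uniform quantizer. The sketch below is not the quantizer of any particular standard: it assumes a step size of 2 × QP and omits dead zones, intra DC handling and clipping. The function names `quantize` and `dequantize` are hypothetical.

```python
def quantize(coef, qp):
    """Map a transform coefficient to a quantization level (truncating toward zero)."""
    return int(coef / (2 * qp))


def dequantize(level, qp):
    """Reconstruct an approximate coefficient from its quantization level."""
    return level * 2 * qp


def reconstruction_error(coef, qp):
    """Absolute error introduced by quantizing and then reconstructing coef."""
    return abs(coef - dequantize(quantize(coef, qp), qp))
```

With a coefficient of 37, QP = 4 reconstructs 32 while QP = 2 reconstructs 36, which is why a smaller QP on foreground macroblocks preserves more detail at the cost of more bits.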

In the present invention, the transmitting end processes input video data through steps such as motion estimation, motion compensation, DCT transformation, quantization and entropy coding and sends the video data to the receiving end, and the receiving end carries out the processing steps opposite to those of the transmitting end to restore the video image data.

The detailed technical solution of the present invention, which is carried out in addition to the above, is set out below. It mainly comprises the following three processes:

1. Skin-color detection, comprising:

(1) Coarse skin-color detection:

Segmenting faces by skin color is robust because it is unaffected by spatial geometric information, and the algorithm is simple and efficient. The skin color of most faces always falls within a certain range. Statistical analysis gives the skin-color distribution of faces: Y (TY1 to TY2), Cb (TCB1 to TCB2), Cr (TCR1 to TCR2). Whether a macroblock contains a face image is judged at the macroblock level by the following algorithm:

For the macroblock at position (i, j), traverse all pixels at positions (m, n) within the macroblock and accumulate the number of pixels that belong to skin color (skin_num):

if TY1 < Y(m, n) < TY2 && TCB1 < Cb(m, n) < TCB2 && TCR1 < Cr(m, n) < TCR2

then skin_num = skin_num + 1;

If skin_num > T, the macroblock is considered a skin-color macroblock and is marked as such, and the total number of skin-color macroblocks is counted.

For example, in one implementation, TY1 = 50, TY2 = 200, TCB1 = 80, TCB2 = 140, TCR1 = 130, TCR2 = 170 and T = 10 may be used.
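The macroblock-level test above can be sketched as follows. This is a minimal illustration, not a reference implementation: it uses the example thresholds from the text, assumes the Cb and Cr planes have been upsampled to the luma grid so that all three blocks have the same size, and the helper `uniform_block` is hypothetical and exists only to build test input.

```python
# Example thresholds taken from the text; a real system would tune them.
TY1, TY2 = 50, 200
TCB1, TCB2 = 80, 140
TCR1, TCR2 = 130, 170
T = 10  # a macroblock needs more than T skin pixels to count as skin-colored


def is_skin_macroblock(y, cb, cr, t=T):
    """Return True if more than t pixels of the macroblock are skin-colored.

    y, cb and cr are equally sized 2-D lists of sample values for one
    macroblock (assumption: chroma already upsampled to the luma grid).
    """
    skin_num = 0
    for row_y, row_cb, row_cr in zip(y, cb, cr):
        for py, pcb, pcr in zip(row_y, row_cb, row_cr):
            if TY1 < py < TY2 and TCB1 < pcb < TCB2 and TCR1 < pcr < TCR2:
                skin_num += 1
    return skin_num > t


def uniform_block(value, size=16):
    """Hypothetical helper: a size x size block filled with a single value."""
    return [[value] * size for _ in range(size)]
```

Running `is_skin_macroblock` over every macroblock of a frame yields the initial 0/1 foreground mask that the filling steps then refine.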

(2) Foreground and background image filling:

After the coarse skin-color detection step, many "holes" may appear inside the foreground (skin-color) image or the background image; filling is then performed to remove these holes. Foreground macroblocks are marked 1 and background macroblocks 0. Filling scans line by line, from left to right and from top to bottom. It comprises the following three kinds of filling: background-region filling, foreground-region filling and boundary filling.

Background-region filling:

Step 1: identify isolated macroblocks, for example:

0	0	0
0	1	0
0	0	0

In the case illustrated, the macroblock is judged to be an isolated macroblock and its foreground mark is set to 0.
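Isolated-macroblock removal can be sketched as below. The sketch assumes that "isolated" means all eight neighbours are background and that positions outside the frame count as background; the text only shows the 3 × 3 interior case. The function name `remove_isolated` is hypothetical.

```python
def remove_isolated(mask):
    """Clear foreground macroblocks whose eight neighbours are all background.

    mask is a 2-D list of 0/1 flags (1 = foreground). It is scanned in
    raster order, modified in place and returned.
    """
    rows, cols = len(mask), len(mask[0])
    for j in range(rows):
        for i in range(cols):
            if mask[j][i] != 1:
                continue
            neighbours = 0
            for dj in (-1, 0, 1):
                for di in (-1, 0, 1):
                    inside = 0 <= j + dj < rows and 0 <= i + di < cols
                    if (dj or di) and inside:
                        neighbours += mask[j + dj][i + di]
            if neighbours == 0:
                mask[j][i] = 0  # isolated: reset the foreground mark
    return mask
```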

Step 2: following step 1, denote the position of the currently scanned macroblock as (j, i) and form a 2 × 2 template extending to the lower right:

(j, i)	(j, i+1)
(j+1, i)	(j+1, i+1)

Count the number of target macroblocks falling within the template, denoted SUM.

1. If SUM = 0, continue scanning with the next macroblock.

2. If SUM ≠ 0, expand the template to 4 × 4, as shown in the following table:

[Table: 4 × 4 template containing macroblock (j, i)]

Count the number of target macroblocks falling within the 4 × 4 template, denoted Count.

If SUM ≥ 3 and Count = SUM, set all macroblock marks in the template to 0;

If 0 < SUM < 3 and Count − SUM ≤ 1, set all macroblock marks in the template to 0.
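The two-stage template test can be sketched as follows. Several points are assumptions of this sketch rather than statements of the text: the 4 × 4 template is taken to extend one macroblock beyond the 2 × 2 template on every side, that is from (j−1, i−1) to (j+2, i+2); templates are clipped at the frame border; and "the template" whose marks are cleared is taken to be the 4 × 4 one. All function names are hypothetical.

```python
def window_cells(mask, top, left, size):
    """Coordinates of a size x size window, clipped to the frame."""
    rows, cols = len(mask), len(mask[0])
    return [
        (r, c)
        for r in range(max(top, 0), min(top + size, rows))
        for c in range(max(left, 0), min(left + size, cols))
    ]


def fill_background(mask):
    """Remove small foreground clusters that have almost no foreground around them."""
    for j in range(len(mask)):
        for i in range(len(mask[0])):
            # SUM: foreground macroblocks in the 2 x 2 template at (j, i)
            s = sum(mask[r][c] for r, c in window_cells(mask, j, i, 2))
            if s == 0:
                continue
            # Count: foreground macroblocks in the assumed 4 x 4 template
            big = window_cells(mask, j - 1, i - 1, 4)
            count = sum(mask[r][c] for r, c in big)
            if (s >= 3 and count == s) or (0 < s < 3 and count - s <= 1):
                for r, c in big:
                    mask[r][c] = 0
    return mask
```

Under these assumptions Count − SUM is the number of foreground macroblocks in the ring around the 2 × 2 template, so the two rules clear a small cluster only when that ring is empty or nearly empty.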

Foreground-region filling:

After background-region filling is complete, foreground-region filling can be performed.

Step 1: denote the position of the currently scanned macroblock as (j, i) and form a 5 × 5 template:

[Table: 5 × 5 template containing macroblock (j, i)]

Count the number of macroblocks falling within the template, denoted Count, and the number of foreground macroblocks among them, denoted SUM.

If SUM exceeds a certain threshold, for example SUM ≥ Count × 0.4, set all macroblock marks in the template to 1;

Otherwise, proceed to step 2.

Step 2: form a 3 × 3 template, again centered on the current macroblock:

[Table: 3 × 3 template centered on macroblock (j, i)]

Count the number of macroblocks falling within the 3 × 3 template, denoted Count, and the number of target macroblocks among them, denoted SUM.

If SUM exceeds a certain threshold, for example SUM ≥ Count × 0.4, set all macroblock marks in the 3 × 3 template to 1.
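The foreground filling can be sketched as below, under assumptions that the text does not settle: both templates are centered on (j, i) and clipped at the frame border (so Count is the number of macroblocks actually inside the frame), and decisions are read from a copy of the input mask so that newly filled macroblocks do not trigger further fills during the same pass. The names `centered_cells` and `fill_foreground` are hypothetical.

```python
def centered_cells(mask, j, i, radius):
    """Coordinates of the (2*radius+1)-wide window around (j, i), clipped to the frame."""
    rows, cols = len(mask), len(mask[0])
    return [
        (r, c)
        for r in range(max(j - radius, 0), min(j + radius + 1, rows))
        for c in range(max(i - radius, 0), min(i + radius + 1, cols))
    ]


def fill_foreground(mask, ratio=0.4):
    """Return a new mask in which sufficiently dense templates are set to 1."""
    src = [row[:] for row in mask]  # snapshot used for all decisions
    out = [row[:] for row in mask]
    for j in range(len(src)):
        for i in range(len(src[0])):
            for radius in (2, 1):  # 5 x 5 template first, then 3 x 3
                cells = centered_cells(src, j, i, radius)
                count = len(cells)
                total = sum(src[r][c] for r, c in cells)
                if total >= count * ratio:
                    for r, c in cells:
                        out[r][c] = 1
                    break
    return out
```

Reading from a snapshot is a design choice of this sketch; an in-place scan would let each fill raise the density seen by later templates.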

Boundary filling:

The boundary filling algorithm is relatively simple and uses the filling method shown in Fig. 3:

In the figure, the targets are the detected foreground macroblocks, and the filled positions are macroblocks that were falsely detected as background.

2. Motion detection:

To reduce computational complexity, the motion characteristic of the current macroblock is judged directly from the difference SAD0 between the current macroblock and the co-located macroblock in the reconstructed previous frame, and from the motion vector components MVX and MVY, all of which are computed during motion estimation. The total numbers of large-motion, medium-motion and small-motion macroblocks in the current frame are counted respectively.
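One way to classify macroblocks from SAD0, MVX and MVY is sketched below. The text does not give the thresholds or say exactly how the three quantities are combined, so everything specific here is an assumption: the thresholds `sad_still`, `mv_small` and `mv_large`, the use of the larger absolute vector component as the motion magnitude, and the function names.

```python
from collections import Counter


def classify_motion(sad0, mvx, mvy, sad_still=500, mv_small=2, mv_large=8):
    """Classify one macroblock as 'static', 'small', 'medium' or 'large'.

    All thresholds are hypothetical placeholders.
    """
    magnitude = max(abs(mvx), abs(mvy))
    if sad0 < sad_still and magnitude == 0:
        return "static"
    if magnitude <= mv_small:
        return "small"
    if magnitude <= mv_large:
        return "medium"
    return "large"


def count_motion(macroblocks):
    """Count motion classes over an iterable of (sad0, mvx, mvy) tuples."""
    return Counter(classify_motion(*mb) for mb in macroblocks)
```

Because SAD0 and the motion vector are by-products of motion estimation, this classification adds only a few comparisons per macroblock.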

3. Macroblock quantization-level adjustment:

After the preceding steps, each macroblock of the whole frame can be marked as a skin-color macroblock, a motion macroblock or a static macroblock, and the motion macroblocks are divided according to the magnitude of their motion vectors into three types, "large motion", "medium motion" and "small motion", and marked accordingly.

(1) Key analysis

Handling of several special cases:

If the sum of the number of skin-color MBs and the number of motion MBs exceeds a certain threshold, for example 50%, the QP values of skin-color MBs are not adjusted (MB: macroblock, an image pixel block of 16 × 16 pixels);

If the number of skin-color MBs exceeds a certain threshold, for example 20%, the QP values of the MBs before the first skin-color MB are increased by 1;

If the number of motion MBs exceeds a certain threshold, for example 50%, the QP values of motion MBs are not adjusted;

If the number of large-motion MBs exceeds a certain threshold, for example 20%, the QP values of the static MBs before the first large-motion MB are decreased by 1;

If the number of small-motion MBs exceeds a certain threshold, for example 20%, the QP values of the static MBs before the first small-motion MB are increased by 1;

If the number of static MBs exceeds a certain threshold, for example 80%, the QP values of all MBs before the first motion MB are decreased by 1.
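The four special cases that shift QP values can be sketched as per-macroblock offsets. The sketch assumes that the percentages are shares of all macroblocks in the frame, that `kinds` lists the macroblock types in coding order using the hypothetical labels 'skin', 'large', 'medium', 'small' and 'static', and that the rules are applied independently and their offsets summed. The two cases that merely suppress adjustment are not modeled.

```python
def special_case_offsets(kinds, t_skin=0.2, t_large=0.2, t_small=0.2, t_static=0.8):
    """Return one QP offset per macroblock from the four QP-shifting special cases."""
    n = len(kinds)
    offsets = [0] * n

    def share(*names):
        return sum(k in names for k in kinds) / n

    def first_index(*names):
        return next((idx for idx, k in enumerate(kinds) if k in names), n)

    if share("skin") > t_skin:
        for idx in range(first_index("skin")):
            offsets[idx] += 1  # MBs before the first skin-color MB
    if share("large") > t_large:
        for idx in range(first_index("large")):
            if kinds[idx] == "static":
                offsets[idx] -= 1  # static MBs before the first large-motion MB
    if share("small") > t_small:
        for idx in range(first_index("small")):
            if kinds[idx] == "static":
                offsets[idx] += 1  # static MBs before the first small-motion MB
    if share("static") > t_static:
        for idx in range(first_index("large", "medium", "small")):
            offsets[idx] -= 1  # all MBs before the first motion MB
    return offsets
```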

(2) The main framework of the method can be described in the following pseudo-code:

if (this MB belongs to the skin-color part)
{
    the QP value of this MB is reduced by 1 or 2 relative to the calculated value.
}
else
{
    if (this MB belongs to the motion part)
    {
        if (this MB is a large-motion MB)
        {
            the QP value of this MB is increased by 1 or 2 relative to the calculated value, or some high-frequency DCT components are suitably pruned.
        }
        else if (this MB is a medium-motion MB)
        {
            the QP value of this MB takes the calculated value.
        }
        else if (this MB is a small-motion MB)
        {
            the QP value of this MB is reduced by 1 or 2 relative to the calculated value.
        }
    }
    else (this MB belongs to the static part)
    {
        the QP value of this MB is limited:
        (a) its difference from the QP value of the co-located MB in the static part of the previous frame is limited to -1 to 1 or -2 to 2;
        (b) the difference between the QP values of the MBs of the static part within the same frame is limited to -1 to 1 or -2 to 2.
    }
}
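The framework above can be sketched as a single function. Assumptions of this sketch: the macroblock type is passed as one of the hypothetical labels 'skin', 'large', 'medium', 'small' or 'static'; limit (b) is applied against the most recently coded static MB of the same frame; the optional pruning of high-frequency DCT components is not modeled; and the result is clamped to 1..31, the QP range of H.263.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def adjust_qp(qp_calc, kind, delta=1, limit=1, qp_colocated=None, qp_last_static=None):
    """Adjust the rate-control QP of one macroblock according to its type.

    delta (1 or 2) is the foreground/motion adjustment; limit (1 or 2) bounds
    how far a static MB may differ from its reference QPs.
    """
    if kind in ("skin", "small"):
        qp = qp_calc - delta
    elif kind == "large":
        qp = qp_calc + delta
    elif kind == "medium":
        qp = qp_calc
    elif kind == "static":
        qp = qp_calc
        for ref in (qp_colocated, qp_last_static):
            if ref is not None:
                qp = clamp(qp, ref - limit, ref + limit)
    else:
        raise ValueError(f"unknown macroblock kind: {kind!r}")
    return clamp(qp, 1, 31)
```

For example, with a calculated QP of 10, a skin-color MB is coded with QP 9 and a large-motion MB with QP 11, while a static MB whose co-located predecessor used QP 14 is pulled up to 13.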

(3) Smoothing the quantization parameters of background macroblocks:

The main purpose of this processing is to make the quantization parameters of all background macroblocks converge, thereby reducing jitter in the background part of the image. The processing is as follows:

For any background macroblock, let its calculated quantization parameter be QP_new, let the mean quantization parameter of all background macroblocks already coded in this frame be partMean, and let the quantization parameter of the macroblock immediately preceding the current macroblock be QP_prev. Then:

1. If partMean > QP_prev and QP_new < QP_prev:

      QP_new         QP_prev         partMean
--------|--------------|---------------|--------> QP increasing

then set QP_new = QP_prev.

2. If partMean < QP_prev and QP_new > QP_prev:

     partMean        QP_prev          QP_new
--------|--------------|---------------|--------> QP increasing

then set QP_new = QP_prev.

3. If partMean − QP_prev ≥ 2 and QP_new > QP_prev and QP_new − QP_prev ≤ 2:

     QP_prev         QP_new        QP_prev+2       partMean
--------|--------------|--------------|--------------|--------> QP increasing

then set QP_new = QP_prev + 2.

4. If partMean − QP_prev ≤ −2 and QP_new < QP_prev and QP_new − QP_prev ≥ −2:

     partMean       QP_prev-2        QP_new         QP_prev
--------|--------------|--------------|--------------|--------> QP increasing

then set QP_new = QP_prev − 2.
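The four smoothing rules can be sketched as one function. The rules are checked in the order given and the first match wins; when no rule applies, the calculated value is kept, which is an assumption since the text does not state the default. The function name `smooth_background_qp` is hypothetical.

```python
def smooth_background_qp(qp_new, qp_prev, part_mean):
    """Pull the QP of a background macroblock toward the running background mean."""
    # Rules 1 and 2: the calculated QP moves away from the mean, so hold QP_prev.
    if part_mean > qp_prev and qp_new < qp_prev:
        return qp_prev
    if part_mean < qp_prev and qp_new > qp_prev:
        return qp_prev
    # Rules 3 and 4: the calculated QP moves toward a distant mean by a small
    # amount, so take a full step of 2 in that direction.
    if part_mean - qp_prev >= 2 and qp_prev < qp_new <= qp_prev + 2:
        return qp_prev + 2
    if part_mean - qp_prev <= -2 and qp_prev - 2 <= qp_new < qp_prev:
        return qp_prev - 2
    return qp_new
```

Applied macroblock by macroblock, with partMean updated after each coded background macroblock, this keeps neighbouring background QP values close to one another.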

The outstanding advantages of the method of the invention are:

(1) The video compression method based on skin-color detection and motion detection segments the foreground and background of the image well. While the overall coding efficiency and overall image quality are maintained, the quality and definition of the foreground image are improved, so that better subjective picture quality is obtained.

(2) The method adds very little computation and has low computational complexity, which makes it suitable for real-time applications.

The method based on skin-color detection and motion detection was compared in tests on a number of test sequences and on images captured in real time by a camera. The results show that it segments foreground and background well and that, while the overall coding efficiency and overall image quality are maintained, the quality and definition of the foreground image are improved, so that better subjective picture quality is obtained.

Claims

1. A video compression method, in which the transmitting end processes input video data through the steps of motion estimation, motion compensation, DCT transformation, quantization and entropy coding and sends the video data to the receiving end, and the receiving end carries out the processing steps opposite to those of the transmitting end to restore the video image data;

2. The method of claim 1, characterized in that the detection step comprises a skin-color detection step and a motion detection step.

3. The method of claim 2, characterized in that the skin-color detection step further comprises:

4. The method of claim 3, characterized in that the background image filling step further comprises:

identifying isolated macroblocks;

5. The method of claim 3, characterized in that the foreground image filling step further comprises:

6. The method of claim 3, characterized in that it further comprises a boundary filling step.

7. The method of claim 2, characterized in that the motion detection step comprises:

8. The method of claim 7, characterized in that the macroblock quantization-level adjustment step further comprises:

9. The method of claim 8, characterized in that the motion macroblock types of different levels are large-motion, medium-motion and small-motion macroblocks, and the processing further comprises:

10. The method of claim 9, characterized in that it further comprises one or a combination of the following judgment steps: