KR20010069019A

KR20010069019A - Method of video coding and decoding for video codec

Info

Publication number: KR20010069019A
Application number: KR1020000001222A
Authority: KR
Inventors: 김철우
Original assignee: 구자홍; 엘지전자 주식회사
Priority date: 2000-01-11
Filing date: 2000-01-11
Publication date: 2001-07-23
Also published as: KR100393987B1

Abstract

PURPOSE: A video coding and decoding method is provided that reduces the number of bits needed for video compression coding by dividing the full frame of a moving picture into a main region and a background region, and then estimating the motion of the blocks in the main region (or the background region) on the basis of that region's global motion. CONSTITUTION: In the coding process, the motion vector MV(Bk) of the block to be coded is found in the input image (Frame) (301). The input image is divided into a main region (Fm) and a background region (Fb), and the global motion vector MV(Fm) (or MV(Fb)) of the main region (or the background region) is found (302). Next, the difference Δ(Bk) between the motion vector MV(Bk) of the block Bk and the global motion vector MV(Fm) (or MV(Fb)) is coded in place of the motion vector MV(Bk) itself (303).

Description

Image encoding and decoding method {METHOD OF VIDEO CODING AND DECODING FOR VIDEO CODEC}

The present invention relates to a method of compression-encoding and decoding moving pictures in a system that processes video block by block for compression encoding and decoding. The frame of the video to be transmitted is separated into a region that is important for conveying meaning and a region that is not; the former is taken as the main region and the latter as the background region, and global motion estimation is performed on each. Motion estimation for the blocks belonging to the main region (and/or the background region) is then performed on that basis, so that the number of bits required for video compression encoding can be reduced.

More specifically, the present invention concerns techniques that compress video through block-based motion estimation and compensation, including standards such as H.26x, MPEG-1, MPEG-2, and MPEG-4. The frame is separated into a background region and a main picture region (main region), a global motion vector is assigned to each region, and the motion vector of each block is predicted from the global motion vector, which improves the performance of video compression encoding. The method is aimed in particular at systems in which the background region and the main region (the user's face) are relatively clearly distinguished, such as a PC video conferencing system (PVS: Video Conferencing System).

In H.261, H.263, MPEG-2, MPEG-4, and other established international video coding standards, the frame is divided into small blocks such as 16×16 or 8×8, motion estimation and motion compensation are performed on each block, and the residual signal is then encoded through a discrete cosine transform (DCT) or similar.

Block-based motion estimation and compensation is the representative method for exploiting temporal correlation in video.

Accordingly, standard and other video coding techniques perform motion estimation and compensation in order to exploit the temporal correlation of the video fully, so that the information available from the previous frame is used as much as possible in encoding the current frame.

In video coding standards such as H.263, MPEG-2, and MPEG-4, the motion vector of each block is likely to be similar to the motion vectors of its neighboring blocks. The motion vector is therefore predicted from the motion vectors of the neighboring blocks, and the difference from the predicted value is encoded, for example with variable-length coding (VLC).

In variable-length coding, the code table is arranged so that the smaller the difference from the predicted value, that is, the closer the difference from the predicted vector is to the zero vector, the shorter the code word. Consequently, the closer the difference between a block's motion vector and the neighboring vector values is to the zero vector, the shorter the code word needed to represent the motion vector.
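The link between the size of a difference and the length of its code word can be illustrated with a small sketch. The standards named above each define their own VLC tables, which are not reproduced here; the sketch substitutes a signed Exp-Golomb code purely as a stand-in that has the same property (shorter code words nearer zero). The function names are hypothetical.

```python
def signed_exp_golomb_bits(value: int) -> int:
    """Bit length of the signed Exp-Golomb code word for one vector component."""
    # Map signed values to code numbers: 0, 1, -1, 2, -2, ... -> 0, 1, 2, 3, 4, ...
    code_num = 2 * value - 1 if value > 0 else -2 * value
    # Code word length is 2 * floor(log2(code_num + 1)) + 1.
    return 2 * (code_num + 1).bit_length() - 1


def mv_bits(mv):
    """Bits needed for both components of a motion vector difference."""
    return signed_exp_golomb_bits(mv[0]) + signed_exp_golomb_bits(mv[1])


for diff in [(0, 0), (1, -1), (5, 0)]:
    print(diff, mv_bits(diff))
```

With this stand-in code, the zero vector costs 2 bits, while a difference of (5, 0) costs 8 bits, so any prediction that moves differences toward zero shortens the bitstream.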

As described above, the conventional approach encodes motion vectors using the per-block difference values between the original image and the moved image, which increases the amount of data. Research has therefore continued into suppressing this increase and reducing the amount of data, so that more data can be transmitted faithfully in the same time.

In particular, for a PC-based video conferencing system, reducing the amount of data raises the number of frames that can be transmitted in the same time, making it possible to build a video conferencing system that comes closer to full-motion video.

The present invention provides a method of compression-encoding and decoding moving pictures in a system that processes video block by block for compression encoding and decoding. The frame of the video to be transmitted is separated into a region that is important for conveying meaning and a region that is not; the former is taken as the main region and the latter as the background region, and global motion estimation is performed on each. Motion estimation for the blocks belonging to the main region (and/or the background region) is then performed on that basis, so that the number of bits required for video compression encoding can be reduced.

In particular, considering that in a video conferencing system the user's face is the meaningful, primary information compared with the background, the present invention divides the image into a main region and a background region, performs global motion estimation on the main region and/or the background region, and provides a method of predicting, encoding, and decoding the motion vectors of the blocks belonging to the main region and/or the background region on the basis of the global motion vector.

FIG. 1 is a diagram showing an example of motion vector estimation, for explaining the present invention.

FIG. 2 is a diagram showing an example of the division into a main region and a background region, for explaining the present invention.

FIG. 3A is an encoder block diagram of the video encoding algorithm, for explaining the present invention.

FIG. 3B is a decoder block diagram of the video decoding algorithm, for explaining the present invention.

The present invention is, in a video encoding and decoding method that processes video block by block to compression-encode and decode it,

a video encoding method comprising: separating the image to be encoded into a main region that is important for conveying meaning and a background region that is not; performing global motion estimation on the separated regions to obtain a global motion vector; and encoding the motion vector of each block belonging to a region on the basis of the global motion vector obtained by the global motion estimation.

The present invention is also, in a video encoding and decoding method that processes video block by block to compression-encode and decode it,

a video decoding method comprising: receiving as input the global motion vector information for each region of an image that has been separated into a main region that is important for conveying meaning and a background region that is not; and decoding the motion vector of each block belonging to a region on the basis of the input global motion vector information.

The present invention is further characterized in that a global motion vector is estimated for the main region, and, for each block belonging to the main region, the difference between the global motion vector and the block's motion vector is encoded as the block's motion vector information.

The present invention is further characterized in that a global motion vector is estimated for the main region, and, for each block belonging to the main region, the sum of the global motion vector and the motion vector value coded for the block is decoded as the block's motion vector information.

The present invention is further characterized in that a global motion vector is estimated for the background region, and, for each block belonging to the background region, the difference between the global motion vector and the block's motion vector is encoded as the block's motion vector information.

The present invention is further characterized in that a global motion vector is estimated for the background region, and, for each block belonging to the background region, the sum of the global motion vector and the motion vector value coded for the block is decoded as the block's motion vector information.

The present invention is further characterized in that, exploiting the characteristics of a PC-based video conferencing system, the video is encoded and decoded with the user's face region as the main region and the remaining area as the background region.

That is, in a PC-based video conferencing system, the present invention divides the frame into a main region and a background region and performs global motion estimation; the blocks in each region then undergo motion estimation based on the global motion vectors, which reduces the number of bits needed to encode the motion vectors.

As explained above, in motion estimation and compensation, the motion vectors of blocks are more likely to be highly correlated when the blocks belong to the same object.

This follows from the assumption, made when estimating block motion vectors in the international video compression standards described above, that all pixels in a block have the same motion vector (see FIG. 1).

That is, the motion estimation and compensation used in current standards assume that objects in the image undergo translational motion.

In FIG. 1, under the assumption that the moving object (102) in the image (101) translates as a single mass, the pixels belonging to the object (102) are assumed to have the same motion vector (MVx, MVy, ...).

For this reason, blocks found to belong to the same object are very likely to have similar motion vectors as well.

Similar motion characteristics appear not only when the object translates but also when it rotates, enlarges, or shrinks.

To exploit this property fully, many techniques based on global motion estimation have been studied.

That is, many approaches have been developed that perform global motion estimation and then estimate the motion vector of each block on the basis of the motion vector obtained from it, so as to reduce the overall magnitude of the motion vectors.

As FIG. 1 shows, blocks belonging to one object (102) have motion vectors (MVx, MVy, ...) that are identical or very similar, and blocks sharing the object's motion vector perform motion compensation by fetching the pixels of the corresponding parts of the previous frame.

Therefore, blocks that belong to the same object are very likely to have the same motion vector.

Accordingly, if global motion estimation is performed over the blocks belonging to the same object or region, and the motion of each block is then estimated on the basis of the global motion vector, the magnitudes of the motion vectors are very likely to decrease.

As described above, a PC-based video conferencing system has a single user in most cases, so the user's face region and the background region are easy to distinguish in the transmitted image. Moreover, the background region hardly moves, whereas the user's face mostly moves in one direction as a whole.

Also, in a video conferencing system almost all of the importance is concentrated on the user's face; the background is not very important.

Therefore, assigning a global motion vector to the user's face (the main region) and estimating the motion of the blocks in the main region on that basis not only reduces the number of bits needed to encode the motion vectors but also shortens the time taken to search for the optimal motion vector.

If necessary, a global motion vector can also be assigned to the background region, and motion estimation performed in the same way as for the main region.

The following description takes as its example the case in which a global motion vector is assigned only to the main region.

The extension to the background region is the same as for the main region.

As shown in FIG. 2, let F be one frame to be encoded.

The frame F is divided into a main region and a background region, denoted Fm and Fb respectively.

Global motion estimation is performed on the separated main region Fm and background region Fb, and the resulting motion vectors of the regions are denoted MV(Fm) and MV(Fb) respectively.
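One simple way to obtain a single vector per region is sketched below. The text does not specify how global motion estimation is carried out, so the estimator used here (a component-wise median of the block motion vectors in each region) is an assumption chosen only for illustration, as are the function name and the "m"/"b" labels of the region map.

```python
from statistics import median_low


def global_motion_vector(block_mvs, region_map, region):
    """Component-wise median of the motion vectors of blocks labeled `region`.

    block_mvs  -- list of (x, y) block motion vectors, one per block
    region_map -- list of region labels, one per block ("m" main, "b" background)
    """
    members = [mv for mv, label in zip(block_mvs, region_map) if label == region]
    if not members:
        return (0, 0)  # assumed fallback for an empty region
    return (median_low([mv[0] for mv in members]),
            median_low([mv[1] for mv in members]))


# Made-up example: three face blocks moving together, three near-static background blocks.
block_mvs = [(0, 0), (3, 1), (4, 1), (3, 2), (0, 0), (0, 1)]
region_map = ["b", "m", "m", "m", "b", "b"]

mv_fm = global_motion_vector(block_mvs, region_map, "m")  # stands in for MV(Fm)
mv_fb = global_motion_vector(block_mvs, region_map, "b")  # stands in for MV(Fb)
print(mv_fm, mv_fb)
```

A median is used rather than a mean so that the result stays an integer vector and is less affected by a few outlier blocks; a real encoder might instead run a dedicated search over the whole region.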

Let Bk be the block currently being encoded, and MV(Bk) its motion vector.

It is then determined whether the block Bk belongs to the main region Fm. If it does (Bk ∈ Fm), then instead of encoding the block's motion vector MV(Bk) directly, the encoder encodes Δ(Bk) = MV(Bk) - MV(Fm), the difference between the block's motion vector MV(Bk) and the main region's global motion vector MV(Fm).

Overall, the values of Δ(Bk) are very likely to follow a distribution of smaller magnitude than the values of MV(Bk).

The reason was explained above.
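The effect on magnitudes can be seen in a minimal sketch. The vectors below are made-up values for three blocks of a face region that move almost together; the helper names are hypothetical, and the summed absolute component value is only a rough proxy for coding cost.

```python
def mv_difference(mv_block, mv_global):
    """Delta(Bk) = MV(Bk) - MV(Fm), computed per component."""
    return (mv_block[0] - mv_global[0], mv_block[1] - mv_global[1])


def total_magnitude(vectors):
    """Sum of absolute component values over a list of vectors."""
    return sum(abs(x) + abs(y) for x, y in vectors)


mv_fm = (3, 1)                             # assumed MV(Fm)
main_block_mvs = [(3, 1), (4, 1), (3, 2)]  # assumed MV(Bk) for Bk in Fm
deltas = [mv_difference(mv, mv_fm) for mv in main_block_mvs]

print(deltas)                           # [(0, 0), (1, 0), (0, 1)]
print(total_magnitude(main_block_mvs))  # 14
print(total_magnitude(deltas))          # 2
```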

The decoder obtains the motion vector MV(Bk) for the block Bk as follows.

That is, MV(Bk) = Δ(Bk) + MV(Fm): the motion vector MV(Bk) of the block Bk is recovered as the sum of the main region's global motion vector MV(Fm) and the difference Δ(Bk).
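Substituting the encoder's definition Δ(Bk) = MV(Bk) - MV(Fm) shows that this reconstruction is exact. The numeric values in the second line are made up for illustration.

```latex
\begin{aligned}
\Delta(B_k) + MV(F_m) &= \bigl(MV(B_k) - MV(F_m)\bigr) + MV(F_m) = MV(B_k) \\
MV(F_m) = (3, 1),\ \Delta(B_k) = (1, 0) &\;\Rightarrow\; MV(B_k) = (1 + 3,\ 0 + 1) = (4, 1)
\end{aligned}
```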

The main and background regions have thus already been separated in a stage preceding motion estimation, and the map information this requires (the information distinguishing the main region from the background region) is encoded in the segmentation stage. Motion estimation therefore needs no additional side information to indicate whether or not the global motion vector is used.

FIGS. 3A and 3B show the overall process as block diagrams.

First, the encoding process is described with reference to FIG. 3A.

The motion vector MV(Bk) of the block Bk to be encoded is obtained from the input image (Frame) (301).

In addition, the main region Fm and the background region Fb are separated in the input image, and the global motion vector MV(Fm) of the main region (and/or MV(Fb) of the background region) is obtained (302).

Then, for the block Bk, the difference Δ(Bk) between the block's motion vector MV(Bk) and the global motion vector MV(Fm) of the main region (and/or MV(Fb) of the background region) is encoded, so that Δ(Bk) is encoded in place of the motion vector MV(Bk) of a block belonging to the main region (and/or the background region) (303).

Decoding is the inverse of encoding and is shown in FIG. 3B.

The decoder takes as input the global motion vector of each region (MV(Fm) and/or MV(Fb)) of the image separated into main and background regions, together with the motion vector information Δ(Bk) of each block Bk belonging to the region, and decodes the block's motion vector MV(Bk) (304).

That is, for the main region Fm, the motion vector MV(Bk) of a block Bk in the main region is decoded as Δ(Bk) + MV(Fm); and if the background region Fb went through the same encoding process of FIG. 3A, the motion vector of a block in the background region is decoded as Δ(Bk) + MV(Fb).
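Steps 303 and 304 can be sketched together as a round trip. This is a minimal sketch rather than the patent's implementation: the function names, the "m"/"b" region labels, and the example vectors are all hypothetical. The region map alone selects which global vector applies to a block, so no per-block flag is carried; a region with no global vector is treated as having (0, 0), which leaves its motion vectors coded directly.

```python
MAIN, BACKGROUND = "m", "b"  # hypothetical labels used in the region map


def encode_mvs(block_mvs, region_map, global_mvs):
    """Step 303: replace each MV(Bk) by its difference from its region's global vector."""
    deltas = []
    for (x, y), region in zip(block_mvs, region_map):
        gx, gy = global_mvs.get(region, (0, 0))  # no global vector: MV coded as is
        deltas.append((x - gx, y - gy))
    return deltas


def decode_mvs(deltas, region_map, global_mvs):
    """Step 304: recover each MV(Bk) as delta plus the global vector of its region."""
    mvs = []
    for (dx, dy), region in zip(deltas, region_map):
        gx, gy = global_mvs.get(region, (0, 0))
        mvs.append((dx + gx, dy + gy))
    return mvs


region_map = [BACKGROUND, MAIN, MAIN, MAIN, BACKGROUND]
block_mvs = [(0, 0), (3, 1), (4, 1), (3, 2), (0, 1)]
global_mvs = {MAIN: (3, 1)}  # only the main region has a global vector here

deltas = encode_mvs(block_mvs, region_map, global_mvs)
print(deltas)  # [(0, 0), (0, 0), (1, 0), (0, 1), (0, 1)]
assert decode_mvs(deltas, region_map, global_mvs) == block_mvs
```

Adding an entry such as `BACKGROUND: (0, 1)` to `global_mvs` extends the same scheme to the background region without changing either function.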

The present invention can be used with any coding standard or algorithm that performs motion estimation and compensation.

Noting that blocks belonging to one object have similar motion characteristics, the present invention proposes an algorithm that can encode these motion vectors more efficiently.

In particular, the present invention exploits the fact that a PC-based video conferencing system has a single user in most cases, so that the main region is easily estimated to be the user's face; by assigning a global motion vector to the main region and, if necessary, to the background region, it reduces the number of bits used to encode motion vectors.

The present invention can therefore improve compression coding performance in currently used video conferencing systems by exploiting the characteristics of those systems.

Claims

1. In a video encoding and decoding method that processes video block by block to compression-encode and decode it,

a video encoding method comprising: separating an image to be encoded into a main region that is important for conveying meaning and a background region that is not; performing global motion estimation on a required region among the separated regions to obtain a global motion vector of that region; and encoding the motion vector of each block belonging to that region on the basis of the global motion vector so obtained.

2. The method of claim 1, wherein a global motion vector is estimated for the main region, the background region, or both the main region and the background region, and, for each block belonging to the region, the difference between the global motion vector and the motion vector of the block is encoded as the motion vector information of the block.

3. The method of claim 1 or 2, wherein, when the image to be encoded is an image processed by a PC-based video conferencing system, the face region of the user is extracted, and encoding is performed with the user's face region as the main region and the remainder as the background region.

4. A video decoding method comprising: receiving as input the global motion vector information for a required region among the regions of an image that has been separated into a main region that is important for conveying meaning and a background region that is not; and decoding the motion vector of each block belonging to that region on the basis of the input global motion vector information.

5. The method of claim 4, wherein a global motion vector is estimated for the main region, the background region, or both the main region and the background region, and, for each block belonging to the region, the sum of the global motion vector and the motion vector of the block is decoded as the motion vector information of the block.