KR20040032332A - Image communication system and operating method thereof

Info

Publication number: KR20040032332A
Application number: KR1020020061426A
Authority: KR
Inventors: 이진수; 김현준
Original assignee: 엘지전자 주식회사
Priority date: 2002-10-09
Filing date: 2002-10-09
Publication date: 2004-04-17
Also published as: KR100493702B1

Abstract

PURPOSE: A video communication system and a method for operating the same are provided to improve image quality and reduce the amount of data by turning the background into a still image through shake and motion correction when performing video communication or transmitting and receiving video mail. CONSTITUTION: If an input image is not the first frame, a region of interest (ROI) is extracted from the input image (201, 202). The image is corrected by matching the background region, excluding the ROI, with the background region of the previous frame (203). The frame including the ROI is repositioned on the previous frame, and the ROI is synthesized with the background region (204). The boundary between a macroblock formed of the background region only and a macroblock including the ROI is smoothed to compensate for errors (205). The matched background region is set as a 'not-coded block' when coding the corrected image (206).

Description

Image communication system and operating method

The present invention relates to a video communication system, and more particularly to a video communication system, and a method of operating the same, that can perform video communication or transmit and receive video mail using a mobile communication terminal capable of transmitting images.

As next-generation mobile communication services such as IMT2000 and CDMA2000 have become established, mobile communication is no longer used for voice alone but also as a means of transmitting and receiving multimedia. Video communication and video mail services using moving images are at the core of these multimedia functions. The biggest problem in video communication and video mail is transmitting high-quality data over a poor network. In video communication, image quality drops easily as soon as network conditions degrade even slightly; in video mail, coding at higher image quality sharply increases the amount of data, which greatly increases the charges borne by the user.

For this reason, research into delivering high image quality with a smaller amount of data has recently been active. One technique under recent study extracts the face region and codes it at higher quality while coding the remaining background region at lower quality, so that a high-quality image can be transmitted and received with a small amount of data. Because the face region is the region of interest (ROI) in video communication, the user perceives less degradation as long as the face region is of high quality. What matters here is a face region extraction engine that extracts the face region accurately, and face region extraction methods have been studied for a long time.

Lowering the image quality of the background region outside the face region in this way is quite effective at providing partially high image quality with a small amount of data. However, the wide background region is still coded, so a user of a video mail service or the like has no choice but to pay high charges, however unwelcome.

Meanwhile, the more the camera moves, the larger the total amount of coded data generally becomes. If there is no motion at all, the corresponding block of the previous frame is reused and nothing is coded, so the amount of data decreases. Therefore, if the user captures video without much movement when recording video mail or during video communication, the video can be coded with relatively little data for the same recording time. However, no matter how still the user tries to hold the camera, unless it is fixed in place, slight shaking causes motion in the background region too, generating a considerable amount of coded data. This unnecessary data degrades image quality in real-time video communication and leads to very high charges for video mail.

An object of the present invention is to provide a video communication system, and a method of operating the same, that can improve image quality and at the same time reduce the amount of data by turning the background into a still image through shake or motion correction when video communication is performed, or video mail is transmitted and received, using a mobile communication terminal that supports video communication.

FIG. 1 is a diagram conceptually showing how the operating method of the video communication system according to the present invention separates an input image into a region of interest and a background region and corrects the image by matching its background region with the background region of the previous frame.

FIG. 2 is a flowchart showing the process by which an input image is encoded under the operating method of the video communication system according to the present invention.

FIG. 3 is a diagram conceptually showing how the region of interest and the background region are separated in order to code an input image under the operating method of the video communication system according to the present invention.

FIG. 4 is a flowchart showing, as an example application of the operating method of the video communication system according to the present invention, the processing applied in a real-time coding system such as video conferencing.

FIG. 5 is a flowchart showing, as an example application of the operating method of the video communication system according to the present invention, the processing applied in a non-real-time coding system such as video mail.

FIG. 6 is a diagram showing an example of the process of extracting a face region as the region of interest under the operating method of the video communication system according to the present invention.

To achieve the above object, a video communication system according to the present invention comprises:

means for extracting a region of interest (ROI) from an input image;

means for correcting the image by matching the background region of the input image, excluding the region of interest, with the background region of the previous frame; and

means for coding the corrected image with the matched background region set as 'not-coded blocks'.

According to the present invention, the region of interest extracted by the region-of-interest extraction means is a face region.

Also according to the present invention, when coding the background region set as 'not-coded blocks', the coding means forcibly codes it as 'not-coded blocks' directly, without going through a motion estimation process.

Also according to the present invention, the system further comprises post-processing means for smoothing the boundary between macroblocks consisting only of the background region and macroblocks that include the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

To achieve the above object, a method of operating a video communication system according to the present invention comprises:

if an input image is not the first frame, extracting a region of interest (ROI) from the input image;

correcting the image by matching the background region of the input image, excluding the extracted region of interest, with the background region of the previous frame; and

coding the corrected image with the matched background region set as 'not-coded blocks'.

According to the present invention, in the coding step, the background region set as 'not-coded blocks' is forcibly coded as 'not-coded blocks' directly, without going through a motion estimation process.

Also according to the present invention, the method further comprises, before the coding step, a post-processing step of smoothing the boundary between macroblocks consisting only of the background region and macroblocks that include the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

Also according to the present invention, in the step of correcting the image, the background region surrounding the extracted region of interest is shifted little by little and mapped repeatedly to obtain the relocation information that gives the most similar mapping, and the image is corrected by matching the background region of the input image with the background region of the previous frame.

Also according to the present invention, in the coding step, the corrected image is coded in real time, frame by frame, after image correction has been performed on the input image.

Also according to the present invention, in the coding step, all corrected image frames are coded together in non-real time after image correction has been performed on all of the input images.

To achieve the above object, a shake/motion correction method in a video communication system according to the present invention comprises:

extracting a region of interest (ROI) from an input image; and

correcting the image by matching the background region, excluding the extracted region of interest, with the background region of the previous frame.

According to the present invention, the method further comprises, after the image correction step, a post-processing step of smoothing the boundary between macroblocks consisting only of the background region and macroblocks that include the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

Also according to the present invention, in the step of correcting the image, the background region surrounding the extracted region of interest is shifted little by little and mapped repeatedly to obtain the relocation information that gives the most similar mapping, and the image is corrected by matching the background region of the input image with the background region of the previous frame.

According to the present invention as described, when video communication is performed, or video mail is transmitted and received, using a mobile communication terminal that supports video communication, turning the background into a still image through shake or motion correction improves image quality and at the same time reduces the amount of data.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

The video standards generally used in video communication include H.263, MPEG1, MPEG2 and MPEG4. All of them compress video in a similar way; the basic idea is to minimize the amount of coded data by referring to the immediately preceding frame.

Every frame is coded in macroblock units of 16x16 pixels. Coded frames are broadly divided into intra frames and inter frames. An intra frame is compression-coded on its own, without reference to other frames; an inter frame is compression-coded with reference to the previous frame. In every compression standard the first frame is an intra frame, because there is no frame to refer to, and subsequent frames are inter frames. To improve image quality, however, frames are also periodically coded as intra frames (MPEG1, 2, 4).
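The frame layout described above can be illustrated with a minimal Python sketch. The function names, the `intra_period` parameter and the 176x144 frame size used in the usage note are assumptions chosen for illustration; the patent text only states the 16x16 macroblock size and the intra/inter distinction.

```python
MB_SIZE = 16  # macroblock edge length in pixels, as stated in the description


def frame_type(index, intra_period=None):
    """Return 'intra' for the first frame and, optionally, every intra_period-th frame."""
    if index == 0:
        return "intra"  # the first frame has no previous frame to refer to
    if intra_period and index % intra_period == 0:
        return "intra"  # periodic refresh (hypothetical parameter)
    return "inter"


def macroblock_grid(width, height, size=MB_SIZE):
    """Return (columns, rows) of macroblocks needed to cover a frame."""
    cols = -(-width // size)   # ceiling division
    rows = -(-height // size)
    return cols, rows
```

For example, `macroblock_grid(176, 144)` gives an 11 by 9 grid, that is, 99 macroblocks per frame.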

For an inter frame, the most similar macroblock in the previous frame is searched for, macroblock by macroblock. This is called motion estimation: the difference between the similar macroblock in the previous frame and the current macroblock is computed and then coded. The more similar the two macroblocks are, the closer the difference values are to 0, and the more values are close to 0, the smaller the amount of data produced by the compression algorithm. The amount of data is smallest when the current macroblock is identical to the macroblock at the same position in the previous frame. In that case nothing is coded, and the decoder simply takes the macroblock at the same position from the previous frame (this is called a 'not-coded block'). In other words, a macroblock that is a 'not-coded block' generates no coded data. Video captured with a fixed camera and no motion at all therefore has a very small amount of data.
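A minimal sketch of the two ideas in this paragraph, block-matching motion estimation and the 'not-coded' test, is given below in pure Python. It assumes grayscale frames stored as lists of rows, a full search over a small window, and the sum of absolute differences (SAD) as the similarity measure; none of these choices come from the patent, and real encoders decide 'not coded' on quantized residuals rather than on exact equality.

```python
MB = 16  # macroblock size in pixels


def block(frame, top, left, size=MB):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def motion_estimate(prev, cur, top, left, search=2, size=MB):
    """Full search: find the (dy, dx) into prev that best matches the current block.

    Returns (cost, dy, dx); the residual to be coded would be the block difference
    at that displacement.
    """
    target = block(cur, top, left, size)
    h, w = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if 0 <= t <= h - size and 0 <= l <= w - size:
                cost = sad(target, block(prev, t, l, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best


def is_not_coded(prev, cur, top, left, size=MB):
    """True when the block equals the co-located block of the previous frame."""
    return sad(block(cur, top, left, size), block(prev, top, left, size)) == 0
```

A block for which `is_not_coded` returns true costs no residual data, which is why a perfectly still background is so cheap to transmit.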

For this reason, a user who wants good image quality over a poor network should communicate while keeping the camera as still as possible, and a user who wants to capture video mail with a small amount of data should likewise capture it without moving the camera. However, unless the camera is fixed in place as a PC camera is, fine movements occur even when a person tries to hold the camera still by hand, so the background region is coded as well and an unnecessary amount of coded data is generated. In some cases, even with the camera fixed, blocks are coded because of imperceptible changes (or flicker) in lighting.

In the present invention, when unintended motion occurs because of shaking or slight movement, it is detected and the moved background is automatically fixed back in place, giving the same effect as a stationary background, as shown in FIG. 1. FIG. 1 is a diagram conceptually showing how the operating method of the video communication system according to the present invention separates an input image into a region of interest and a background region and corrects the image by matching its background region with the background region of the previous frame.

Most existing methods for detecting background motion correct motion over the entire image. In the present invention, however, motion correction must be applied only to the background region excluding the face region, so face region extraction means is required in addition to motion correction means.

That is, the face region must be extracted first, and then motion information is extracted from the remaining region, excluding the face region, to perform motion correction on the whole frame. This sequence is shown in FIG. 2. FIG. 2 is a flowchart showing the process by which an input image is encoded under the operating method of the video communication system according to the present invention.

First, when an image frame is input, the face region is extracted from it (steps 201 to 202). Next, for the background region excluding the face region, relocation information for matching the background with the previous frame is obtained, and the background region is matched with the previous frame according to that information (step 203). Then the frame including the face region is repositioned onto the background-matched frame, and the face region and the background region are synthesized using the previous frame so that the two consecutive frames (the previous frame and the current frame) appear natural (step 204).

Repositioning the frame assumes that the background region coincides with the background region of the previous frame, so a macroblock consisting entirely of background can use the previous frame's macroblock as-is. The blocks shown in gray in FIG. 3 are such blocks. For macroblocks that include the face region, the current frame is repositioned so that it coincides exactly with the previous background, and the corresponding block of the current frame is used. The remaining blocks in FIG. 3, not shown in gray, are such blocks. FIG. 3 is a diagram conceptually showing how the region of interest and the background region are separated in order to code an input image under the operating method of the video communication system according to the present invention.
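The block selection described here can be sketched as follows. This is an illustrative Python sketch only: it assumes the region of interest is available as a boolean pixel mask and that the current frame has already been repositioned (`cur_aligned`); the function names and the `'roi'` / `'background'` labels are hypothetical.

```python
def classify_macroblocks(roi_mask, size=16):
    """Label each macroblock 'roi' if it contains any ROI pixel, else 'background'."""
    h, w = len(roi_mask), len(roi_mask[0])
    grid = []
    for top in range(0, h, size):
        row = []
        for left in range(0, w, size):
            has_roi = any(
                roi_mask[y][x]
                for y in range(top, min(top + size, h))
                for x in range(left, min(left + size, w))
            )
            row.append("roi" if has_roi else "background")
        grid.append(row)
    return grid


def compose(prev, cur_aligned, labels, size=16):
    """Keep background-only blocks from prev; take ROI blocks from the aligned current frame."""
    h, w = len(prev), len(prev[0])
    out = [row[:] for row in prev]  # start from the previous frame (gray blocks in FIG. 3)
    for by, label_row in enumerate(labels):
        for bx, label in enumerate(label_row):
            if label == "roi":
                for y in range(by * size, min((by + 1) * size, h)):
                    for x in range(bx * size, min((bx + 1) * size, w)):
                        out[y][x] = cur_aligned[y][x]
    return out
```

Because background-only blocks are copied verbatim from the previous frame, they are exactly equal to their co-located predecessors, which is what later allows them to be treated as 'not-coded blocks'.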

One method of matching the background is to shift the macroblocks around the face region little by little and map them against the previous frame, find the point at which each maps most similarly, and average the resulting displacements to obtain the relocation information.
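One possible reading of this matching procedure is sketched below in Python. It assumes grayscale frames as lists of rows, a caller-supplied list of background block positions near the face (`background_blocks`), mean absolute difference as the similarity measure, and rounding of the averaged displacement to whole pixels. All of these are assumptions made for illustration rather than details given in the patent.

```python
def block_cost(prev, cur, top, left, dy, dx, size):
    """Mean absolute difference between a cur block and the prev block displaced by (dy, dx)."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[top + y][left + x] - prev[top + dy + y][left + dx + x])
    return total / (size * size)


def best_shift(prev, cur, top, left, search, size):
    """Try every displacement in the window and return the (dy, dx) with the lowest cost."""
    h, w = len(prev), len(prev[0])
    candidates = []
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if 0 <= top + dy and top + dy + size <= h and 0 <= left + dx and left + dx + size <= w:
                cost = block_cost(prev, cur, top, left, dy, dx, size)
                candidates.append((cost, abs(dy) + abs(dx), dy, dx))  # prefer small shifts on ties
    _, _, dy, dx = min(candidates)
    return dy, dx


def relocation(prev, cur, background_blocks, search=2, size=16):
    """Average the per-block best shifts to get one relocation vector for the frame."""
    shifts = [best_shift(prev, cur, t, l, search, size) for t, l in background_blocks]
    n = len(shifts)
    return round(sum(s[0] for s in shifts) / n), round(sum(s[1] for s in shifts) / n)


def align(cur, dy, dx, fill=0):
    """Reposition cur so that its background lines up with prev: out[y][x] = cur[y - dy][x - dx]."""
    h, w = len(cur), len(cur[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = cur[sy][sx]
    return out
```

Pixels that the shift moves in from outside the frame are filled with a constant here; in the scheme described above those positions would normally lie in background blocks that are taken from the previous frame anyway.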

In general, even when the background is matched to compensate for camera movement, the match may not be exact. If, as described above, only the macroblocks that include the face are taken from the aligned current frame and the previous frame's macroblocks are used for the rest of the background, a misalignment may appear near the boundary between blocks taken from the current frame and blocks taken from the previous frame. Post-processing may be added to compensate for this slight misalignment (step 205). One example of such post-processing is smoothing the area near the block boundaries.
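The patent does not specify a particular smoothing filter. As one hedged illustration, the Python sketch below applies a 3-tap [1, 2, 1]/4 filter across every seam where a background block meets an ROI block; the filter, the `labels` grid format and the function name are all assumptions.

```python
def smooth_seams(frame, labels, size=16):
    """Blur the two pixel lines on either side of each background/ROI block boundary."""
    src = frame
    out = [row[:] for row in frame]
    h, w = len(frame), len(frame[0])
    rows, cols = len(labels), len(labels[0])
    for by in range(rows):
        for bx in range(cols):
            # Vertical seam between block (by, bx) and its right neighbour.
            if bx + 1 < cols and labels[by][bx] != labels[by][bx + 1]:
                sx = (bx + 1) * size
                for y in range(by * size, min((by + 1) * size, h)):
                    for x in (sx - 1, sx):
                        if 1 <= x < w - 1:
                            out[y][x] = (src[y][x - 1] + 2 * src[y][x] + src[y][x + 1]) // 4
            # Horizontal seam between block (by, bx) and the block below it.
            if by + 1 < rows and labels[by][bx] != labels[by + 1][bx]:
                sy = (by + 1) * size
                for x in range(bx * size, min((bx + 1) * size, w)):
                    for y in (sy - 1, sy):
                        if 1 <= y < h - 1:
                            out[y][x] = (src[y - 1][x] + 2 * src[y][x] + src[y + 1][x]) // 4
    return out
```

Note that smoothing modifies pixels on the background side of a seam as well, so a real implementation would have to decide whether those blocks can still be signalled as 'not coded'; the sketch leaves that question open.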

When video whose background has been matched to the previous frame in this way is compressed by the encoder, the blocks corresponding to the background are naturally coded as 'not-coded blocks' (step 206). The amount of data therefore decreases naturally, but because the motion estimation process is still performed, the coding time hardly changes.

In the present invention, however, which macroblocks include the face region, that is, which blocks are 'coded blocks' and which are 'not-coded blocks', is known in advance. Blocks that will be coded as 'not-coded blocks', that is, blocks belonging to the background region, can therefore be set to 'not-coded block' directly, without motion estimation. This also greatly reduces encoding time, so the invention has the advantage of reducing coding time as well as the amount of data.
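The time saving can be made concrete with a small Python sketch of a per-macroblock encoder loop. The `estimate` callable stands in for whatever motion search the encoder uses, and the decision tuples are a hypothetical representation; the sketch only shows that background blocks bypass the search.

```python
def encode_frame(prev, cur, labels, estimate, size=16):
    """Return (decisions, searches): per-block decisions and the number of motion searches run.

    labels is a grid of 'background' / 'roi' strings; estimate(prev, cur, top, left)
    is a caller-supplied motion search (hypothetical interface).
    """
    decisions = []
    searches = 0
    for by, row in enumerate(labels):
        for bx, label in enumerate(row):
            if label == "background":
                # Forced 'not coded': no motion estimation, no residual.
                decisions.append((by, bx, "not_coded", None))
            else:
                searches += 1
                motion = estimate(prev, cur, by * size, bx * size)
                decisions.append((by, bx, "coded", motion))
    return decisions, searches
```

With a typical head-and-shoulders frame most blocks are background, so the number of motion searches, usually the most expensive part of encoding, drops roughly in proportion to the background area.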

The video conferencing / video mail system using shake correction or motion correction described so far can be applied in two ways. The first is a real-time coding system such as video conferencing, in which, as shown in FIG. 4, each input image is background-matched through the sequence described above and the current frame is then immediately coded and sent to the transmission buffer. The second, as shown in FIG. 5, background-matches every frame of the captured video through the same sequence and afterwards codes all frames in one batch to produce a compressed video for transmission; this can be applied to video mail. FIG. 4 is a flowchart showing, as an example application of the operating method of the video communication system according to the present invention, the processing applied in a real-time coding system such as video conferencing, and FIG. 5 is a flowchart showing, as an example application of the same method, the processing applied in a non-real-time coding system such as video mail.
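The difference between the two modes is only in when coding happens, which the following Python sketch tries to show. The `stabilize`, `encode`, `encode_all` and `send` callables are hypothetical placeholders for the background matching, the codec and the transmission buffer.

```python
def stabilize_stream(frames, stabilize):
    """Yield background-matched frames; the first frame passes through unchanged."""
    prev = None
    for frame in frames:
        out = frame if prev is None else stabilize(prev, frame)
        prev = out
        yield out


def run_realtime(frames, stabilize, encode, send):
    """FIG. 4 style: correct a frame, code it immediately, hand it to the transmit buffer."""
    prev = None
    for out in stabilize_stream(frames, stabilize):
        send(encode(prev, out))
        prev = out


def run_batch(frames, stabilize, encode_all, send):
    """FIG. 5 style: correct every frame first, then code the whole clip in one pass."""
    corrected = list(stabilize_stream(frames, stabilize))
    send(encode_all(corrected))
```

In both drivers each frame is matched against the previously corrected frame, so the background stays anchored to the first frame of the sequence.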

Various face extraction algorithms have already been published. Here, as one example, the case of extracting the face region with a skin-color-based face region extraction algorithm is briefly described with reference to FIG. 6. FIG. 6 is a diagram showing an example of the process of extracting a face region as the region of interest under the operating method of the video communication system according to the present invention.

Referring to FIG. 6, when an image is input (step 601), the pixels that satisfy the skin-color condition are found in the input image (step 602), and the skin-color pixels are grouped by similar color (step 603).

Then, among the skin-color groups formed in step 603, the group distributed most heavily around the center is selected as the candidate color group and designated as the skin-color region (step 604); the largest skin-color blob is found in the skin-color region using connectivity information (step 605); and the face region is extracted through an ellipse verification process that checks whether the blob found in step 605 is elliptical (step 606).
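A simplified Python sketch of steps 602, 605 and 606 follows; the color grouping of steps 603 and 604 is omitted for brevity. The RGB skin rule, the 4-connectivity, and the ellipse test (comparing the blob's fill ratio of its bounding box with π/4, the ratio for an inscribed ellipse) are illustrative assumptions, since the patent does not give concrete thresholds or a verification formula.

```python
import math
from collections import deque


def is_skin(rgb):
    """Illustrative RGB skin rule (assumed thresholds, not from the patent)."""
    r, g, b = rgb
    return (r > 95 and g > 40 and b > 20 and max(rgb) - min(rgb) > 15
            and abs(r - g) > 15 and r > g and r > b)


def largest_blob(mask):
    """Step 605: largest 4-connected component of True pixels, as a list of (y, x)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not seen[sy][sx]:
                seen[sy][sx] = True
                queue, blob = deque([(sy, sx)]), []
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(blob) > len(best):
                    best = blob
    return best


def bounding_box(blob):
    ys = [y for y, _ in blob]
    xs = [x for _, x in blob]
    return min(ys), min(xs), max(ys), max(xs)


def looks_elliptical(blob, tolerance=0.2):
    """Step 606 (assumed test): blob area / bounding-box area should be near pi/4."""
    top, left, bottom, right = bounding_box(blob)
    box_area = (bottom - top + 1) * (right - left + 1)
    return abs(len(blob) / box_area - math.pi / 4) <= tolerance


def extract_face(image):
    """Return the face bounding box (top, left, bottom, right), or None if nothing qualifies."""
    mask = [[is_skin(p) for p in row] for row in image]  # step 602
    blob = largest_blob(mask)
    if not blob or not looks_elliptical(blob):
        return None
    return bounding_box(blob)
```

The returned bounding box can then be rasterized into the ROI mask used to decide which macroblocks contain the face.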

As described above, with the video communication system and its operating method according to the present invention, when video communication is performed, or video mail is transmitted and received, using a mobile communication terminal that supports video communication, turning the background into a still image through shake or motion correction improves image quality and at the same time reduces the amount of data.

Claims

A video communication system comprising:

means for extracting a region of interest (ROI) from an input image;

means for correcting the image by matching the background region of the input image, excluding the region of interest, with the background region of the previous frame; and

means for coding the corrected image with the matched background region set as 'not-coded blocks'.

The system of claim 1,

wherein the region of interest extracted by the region-of-interest extraction means is a face region.

The system of claim 1,

wherein the coding means, when coding the background region set as 'not-coded blocks', forcibly codes it as 'not-coded blocks' directly, without performing a motion estimation process.

The system of claim 1,

further comprising post-processing means for smoothing the boundary between macroblocks consisting only of the background region and macroblocks including the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

A method of operating a video communication system, comprising:

if an input image is not the first frame, extracting a region of interest (ROI) from the input image;

correcting the image by matching the background region of the input image, excluding the extracted region of interest, with the background region of the previous frame; and

coding the corrected image with the matched background region set as 'not-coded blocks'.

The method of claim 5,

wherein, in the coding step, the background region set as 'not-coded blocks' is forcibly coded as 'not-coded blocks' directly, without going through a motion estimation process.

The method of claim 5,

further comprising, before the coding step,

a post-processing step of smoothing the boundary between macroblocks consisting only of the background region and macroblocks including the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

The method of claim 5,

wherein, in the step of correcting the image, the background region surrounding the extracted region of interest is shifted little by little and mapped repeatedly to obtain the relocation information that gives the most similar mapping, and the image is corrected by matching the background region of the input image with the background region of the previous frame.

The method of claim 5,

wherein, in the coding step, the corrected image is coded in real time, frame by frame, after image correction has been performed on the input image.

The method of claim 5,

wherein, in the coding step, all corrected image frames are coded together in non-real time after image correction has been performed on all of the input images.

A shake/motion correction method in a video communication system, comprising:

extracting a region of interest (ROI) from an input image; and

correcting the image by matching the background region, excluding the extracted region of interest, with the background region of the previous frame.

The method of claim 11,

further comprising, after the image correction step, a post-processing step of smoothing the boundary between macroblocks consisting only of the background region and macroblocks including the region of interest, in order to compensate for errors in the image correction process that may arise at the boundary between the matched background region and the region of interest.

The method of claim 11,

wherein, in the step of correcting the image, the background region surrounding the extracted region of interest is shifted little by little and mapped repeatedly to obtain the relocation information that gives the most similar mapping, and the image is corrected by matching the background region of the input image with the background region of the previous frame.