KR20040035007A

KR20040035007A - Face Detection and Background Blurring for Video Conference Application

Info

Publication number: KR20040035007A
Application number: KR1020020063710A
Authority: KR
Inventors: 홍성훈
Original assignee: (주) 임펙링크제너레이션
Priority date: 2002-10-18
Filing date: 2002-10-18
Publication date: 2004-04-29

Abstract

PURPOSE: A method of preprocessing moving pictures for a video conference by extracting a facial shape and blurring non-major parts is provided. The method highlights the boundary of the facial shape and blurs the remaining parts, such as the background, thereby minimizing the compressed bit rate and reducing image-quality deterioration and screen freezing. CONSTITUTION: An image preprocessor first performs preprocessing on the moving picture signal, namely extraction of a facial shape and blurring of non-major parts, and then a conventional video-conference encoder such as H.263 or MPEG4 (Moving Picture Experts Group 4) encodes the result. When the facial shape is extracted, the boundary of the facial shape is classified as the major part, and the remaining parts, including the background, are classified as the non-major part. Blurring is applied to the non-major part to minimize the information the encoder must code, so that the video-conference encoder operates at a minimum bit rate.

Description

Video-call video preprocessing method using face shape extraction and blurring of non-key regions

In general, a video consists of a sequence of still images, each of which is called a frame. Video encoding methods have been studied extensively, and research is still ongoing. This research is carried out by individuals and individual companies at home and abroad, and also by international research bodies formed for the purpose. The most representative of these bodies is the Moving Picture Experts Group (MPEG). Individual experts and companies in the video field have gathered around MPEG to set worldwide standards for video files; the standards published by the group include MPEG, MPEG4 and MPEG7. At present, MPEG4 and video files produced with partly modified versions of it are mainly used on the Internet, while MPEG2 is used for broadcasting and general media. In addition, various developers and individual companies have published and use encoding techniques of their own.

Although their forms vary, most video encoding schemes technically rely on 1) compression of still images at intervals, 2) compression through interpolation between still images, and 3) separation of the background from objects, or of objects within the background, followed by compression using each object itself and the motion information of the object (motion estimation). In other words, most video file encoding methods apply some or all of these three techniques in various forms. Now that the Internet is widely used, additional files or external elements are also added to make transmission over the network more efficient. It is advantageous in a network environment if the video file itself carries information, such as transmission speed, that lets it be transmitted using network information, and some encoding techniques do provide such information.

An object of the present invention is to reduce the amount of information transmitted over a network while preserving the useful information in a video call. A further object is to minimize, as far as possible, modifications to the structure (encoder and decoder) of existing MPEG-family compression schemes.

(Fig. 1) Video-call video encoding method including the designed image preprocessor

The encoding method for video-call video, including the proposed image preprocessor, is shown in Fig. 1. As shown in Fig. 1, the proposed image preprocessor first performs face shape extraction and blurring of non-key regions on the input video signal, after which encoding for the video call is performed with an existing scheme such as H.263 or MPEG4. Existing face shape extraction methods can be used without modification. Once the face shape is extracted, the boundary region of the face shape is classified as the key part, and everything else, including the background, is classified as the non-key part. The non-key part is blurred so that the existing encoder has as little information as possible to code. In other words, the proposed method preprocesses each frame by emphasizing the boundary of the face shape and blurring the remaining parts, such as the background, so that an existing video-call encoder such as H.263 or MPEG4 can operate at a minimum bit rate. With such a preprocessor, the blurring of the non-key part minimizes the amount of compressed video data. Because the compression ratio of the transmitted video call is thereby substantially improved, the image-quality degradation and screen freezing seen in existing video calls can be reduced.
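The preprocessing step described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a grayscale frame, a boolean `key_mask` already produced by some external face shape extractor (the patent does not specify one), and a simple mean filter as the blur. The names `box_blur`, `preprocess_frame`, `key_mask` and `radius` are hypothetical, and the boundary emphasis mentioned in the text is not modeled.

```python
import numpy as np


def box_blur(image: np.ndarray, radius: int = 2) -> np.ndarray:
    """Mean filter over a (2*radius+1)^2 window; edges are replicated."""
    k = 2 * radius + 1
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    h, w = image.shape
    acc = np.zeros((h, w), dtype=np.float64)
    # Sum all shifted copies of the padded image, then divide by window size.
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / (k * k)


def preprocess_frame(frame: np.ndarray, key_mask: np.ndarray,
                     radius: int = 2) -> np.ndarray:
    """Keep pixels where key_mask is True and blur all other pixels."""
    blurred = box_blur(frame, radius)
    out = np.where(key_mask, frame.astype(np.float64), blurred)
    return np.rint(out).astype(np.uint8)


# Synthetic stand-ins: a noisy frame and a rectangular "face" region.
rng = np.random.default_rng(0)
frame = rng.integers(0, 256, size=(48, 64), dtype=np.uint8)
key_mask = np.zeros(frame.shape, dtype=bool)
key_mask[12:36, 20:44] = True  # assumed output of a face shape extractor

result = preprocess_frame(frame, key_mask)
# `result` would then be handed to an unmodified encoder such as H.263 or MPEG4.
```

The key region is copied through untouched, so only the non-key region loses detail; a larger `radius` removes more high-frequency content at the cost of a softer background.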

The present invention relates to a method of encoding video for a video call. A file encoded by the method of the present invention markedly reduces the per-frame image data, which accounts for a large share of the video information. The proposed method uses face shape extraction and image preprocessing as a preprocessor at encoding time. Blurring the non-key part alone can dramatically improve the compression ratio of an existing video compressor, and because the decoder operates without any modification, the method is also easy to apply in practice.
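The claim that blurring alone reduces the coded data, with no change on the decoding side, can be illustrated with a rough proxy. The sketch below assumes `zlib` as a stand-in for the real encoder (it is a general-purpose lossless compressor, not H.263 or MPEG4), a synthetic noisy frame, and a hypothetical rectangular key region, so the sizes it produces say nothing about actual video-call bit rates.

```python
import zlib

import numpy as np


def mean_blur(image: np.ndarray, radius: int = 2) -> np.ndarray:
    """Mean filter with replicated edges, returned as uint8."""
    k = 2 * radius + 1
    padded = np.pad(image.astype(np.float64), radius, mode="edge")
    h, w = image.shape
    acc = sum(padded[dy:dy + h, dx:dx + w]
              for dy in range(k) for dx in range(k))
    return np.rint(acc / (k * k)).astype(np.uint8)


rng = np.random.default_rng(1)
frame = rng.integers(0, 256, size=(96, 128), dtype=np.uint8)  # noisy scene
key_mask = np.zeros(frame.shape, dtype=bool)
key_mask[24:72, 40:88] = True  # hypothetical face region, kept sharp

# Preprocess: keep the key region, blur everything else.
preprocessed = np.where(key_mask, frame, mean_blur(frame))

# Proxy "encoder": the same unmodified compressor for both inputs.
size_raw = len(zlib.compress(frame.tobytes(), 9))
packed = zlib.compress(preprocessed.tobytes(), 9)
size_pre = len(packed)

# Proxy "decoder": unchanged, it simply decodes what it receives.
restored = np.frombuffer(zlib.decompress(packed),
                         dtype=np.uint8).reshape(frame.shape)

print(f"raw: {size_raw} bytes, preprocessed: {size_pre} bytes")
```

Because the preprocessing happens entirely before compression, the decoding side needs no knowledge of the mask or the blur, which mirrors the compatibility argument made in the text.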

The video encoding technique proposed in the present invention is a new preprocessing scheme intended to overcome the compression-ratio limits of existing video compression schemes for video calls, and it can yield substantial compression gains depending on the type of video-call footage. In particular, the proposed scheme can be applied with no modification to the decoder and can be implemented in a structure that also minimizes modification of the existing encoder. It can therefore greatly reduce the image-quality degradation and screen freezing that existing video compression schemes for video calls suffer in low-bit-rate environments.

Claims

A video-call video preprocessing method using face shape extraction and blurring of non-key regions.