KR20030062586A - Human area detection for mobile video telecommunication system

Info

Publication number: KR20030062586A
Application number: KR1020020002842A
Authority: KR
Inventors: 이진수
Original assignee: 엘지전자 주식회사
Priority date: 2002-01-17
Filing date: 2002-01-17
Publication date: 2003-07-28
Also published as: KR100439377B1

Abstract

PURPOSE: A method for extracting a human region in mobile communications is provided that precisely extracts a person's face, head, and body regions so that the separated regions can be used as appropriate for each application. CONSTITUTION: The method includes the steps of extracting a face region from an image (S14), extracting a head region based on the extracted face region (S15), extracting a body region based on the extracted face region (S16), and designating a human region by uniting the extracted face, head, and body regions. The original image is reduced before these steps are carried out (S12), and the reduced result is restored to the original image size afterward (S18).

Description

HUMAN AREA DETECTION FOR MOBILE VIDEO TELECOMMUNICATION SYSTEM

The present invention relates to a method for detecting a face region in an image, and to a method for detecting a human region in an image by detecting head and body regions based on the detected face region.

In a mobile terminal environment that supports video communication and multimedia services, a large amount of data must be transmitted in real time over a limited network. Because the quality of the transmitted image data is proportional to the amount of data, high image quality requires a correspondingly high data rate. In video communication, the difference in quality that a user perceives is more pronounced in the user region than across the whole frame. This is because, by the nature of video communication, attention is focused on the caller (the user), and the caller's face and appearance are the main region of interest. Therefore, transmitting only the user region at higher quality yields high perceived quality with a small amount of data.

This requires a technique that automatically separates the user from the background. Such a technique is useful not only for video communication: where offline editing is possible, as with video mail, it also makes it possible to replace the background other than the face with an entertaining template or, in some cases, to edit clothing or hairstyle before transmission.

Such services require a technique for separating the user from the background, or the user's face from the background, in a video communication image. Given the limited network environment, whether the technique can run in real time matters as much as the image quality.

Most techniques for separating a specific object from an image frame are either difficult to run in real time or fail to give satisfactory results across varied data. Separating an object such as a person is even harder in a moving-camera environment, that is, against a moving background.

One conventional object separation technique uses a difference image and an edge image. Because the features it uses are simple, it is fast, but separation is difficult in complex scenes, and because it assumes that the background is stationary and that whatever moves is the object, it is hard to apply in a moving-camera environment. Another conventional technique separates a specific object based on color information in the image frame. It first segments regions of uniform color and then uses them, which takes considerable processing time, so it too is hard to apply in a mobile communication environment that requires real-time processing.

Among the color-based object separation techniques are methods for extracting a face region. These methods define in advance the range of the color space that corresponds to skin color and designate the face region using only the pixels within that range, that is, skin-colored pixels. Known skin-color-based methods include: converting RGB color coordinates to YIQ coordinates and specifying the skin color range in YIQ to extract the skin-colored region; taking the skin color range expressed in L*a*b color coordinates from a set of training data, transforming it with PCA (Principal Component Analysis) so that an effective skin color range can be represented with less information, and using the result to extract the face region; and using a fuzzy set to compute the probability that a particular color is a skin color.

All of the color-based face region extraction methods described above define the skin color range in advance. They have the advantage of short processing time, but they are not suitable for practical use for the following reasons.

When the color of the skin region is analyzed in images of the same person's face captured in various places with various capture devices, the color varies widely from image to image. Because the skin region appears at very different positions in color coordinates, specifying in advance a range that covers every possible skin color makes the range so broad that it often includes non-skin regions in a given image. The main causes are color distortion due to illumination, color distortion that depends on the image capture device, and color distortion that depends on the image playback device. The second and third arise because capture and playback technology cannot yet reproduce the colors of a natural scene faithfully, and because devices apply their own color filters to compensate.

Because of these problems, face region extraction methods that use information other than color, such as template matching, have also been proposed. In this approach a template of a human face is constructed and then scanned and matched over the entire image while its size is varied from a minimum size to a maximum size.

However, because the number of template matching operations is very large, the processing time is very long even when the template is small.

Another conventional technique detects a face by combining the skin-color method with the template method.

The conventional object separation techniques described so far use only still images, so they use less information than is available in video and are correspondingly less accurate. Their processing time and accuracy problems also make them hard to apply in a moving-camera environment.

An object of the present invention is to provide a human region detection method that separates a human region from an image frame for video communication in real time, so that it can be applied to object-based bit-rate control, video editing, and the like.

Another object of the present invention is to provide a face detection method that accurately separates a face region from an image frame for video communication in real time so that it can be used in applications.

To achieve the above object, the present invention provides a human region extraction method comprising: extracting a face region from an image; extracting a head region based on the extracted face region; extracting a body region based on the extracted face region; and designating the union of the extracted face, head, and body regions as the human region.

The present invention also provides a head region extraction method comprising: extracting a face region from an image; setting a head reference region based on the extracted face region; obtaining a head reference color from the head reference region; finding regions whose color is similar to the head reference color; excluding from those regions any region where edges occur; and designating, among the remaining regions, those adjacent to the face as the head region.

The present invention also provides a body region extraction method comprising: extracting a face region from an image; specifying a body frame based on the extracted face region; setting a body reference region based on the extracted face region; obtaining a body reference color from the body reference region; finding, inside the body frame, regions whose color is similar to the body reference color; and finding, inside the body frame, regions that have skin color.

The present invention also provides a face region extraction method comprising: constructing a color histogram (hist) from only those pixels of the image that fall in the skin color range; constructing a center color histogram (center_hist) from only those skin-colored pixels that lie in a partial region around the center of the image; forming skin color groups using the two histograms (hist, center_hist); and segmenting the image by skin color group to designate the face region.

The present invention also provides a human region extraction method comprising: reducing an image to 44*36; performing face region extraction through skin color grouping by analyzing the color-space distribution of the pixels in the reduced image that fall in the skin color range; specifying a head reference region based on the extracted face region and extracting a head region by identifying the head reference color within it; specifying a body reference region based on the extracted face region and extracting a body region by identifying the body reference color within it; and designating the union of the extracted face, head, and body regions as the human region and then restoring it to the original image size.

FIG. 1 shows example images illustrating object-based bit-rate control.

FIG. 2 shows an example of replacing the background of an image.

FIG. 3 is a flowchart of the overall human region detection method of the present invention.

FIG. 4 is a flowchart of the face region detection process of the present invention.

FIG. 5 illustrates the face model and the MBR used for face detection.

FIG. 6 illustrates color grouping in the present invention.

FIG. 7 illustrates the head reference region of the present invention.

FIG. 8 illustrates the head region detection method of the present invention.

FIG. 9 illustrates the shape frame used for head region extraction in the present invention.

FIG. 10 is a flowchart of the head region extraction process of the present invention.

FIG. 11 illustrates the body reference region of the present invention.

FIG. 12 illustrates the body frame of the present invention.

FIG. 13 is a flowchart of the body region extraction process of the present invention.

FIG. 14 shows an example of a human region extracted by the present invention.

FIG. 15 is a flowchart of an embodiment of the face detection method of the present invention.

FIG. 16 illustrates the center region applied to the face detection method of FIG. 15.

FIG. 17 is a flowchart of the color grouping process in FIG. 15.

FIG. 18 shows the HMMD color space.

FIG. 19 illustrates quantization in the HMMD color space.

FIG. 20 illustrates how a face region made up of several colors is defined as a single color group by the color grouping method.

FIG. 21 illustrates an example in which only the face region is effectively separated even when regions other than the face are skin-colored.

As mentioned above, applications of human region separation for video communication include bit-rate control, which secures a consistent image quality even on a poor network, and video editing. These are considered first.

[Object-based bit-rate control]

This technique assigns different numbers of bits to different regions of the screen according to their importance, keeping important regions at high quality and lowering the quality of the others. It requires automatic separation of the region of interest (ROI). In video communication, the human region is the ROI and the background is unimportant. Within the human region the face is relatively more important, and within the face the eyes and mouth can be regarded as more important still.

Defining the ROI at multiple levels in this way makes it possible to maintain the best image quality the network allows. In video mail, assigning fewer bits to unimportant regions from the outset also lets the video be encoded at a smaller size. This matters a great deal because it bears directly on communication service charges as well as on channel occupancy: a larger video costs the user more, so encoding to a smaller size is economical for the user.

FIG. 1(a) shows the original frame image. FIG. 1(b) shows an example in which the ROI (here, the human region) is separated from the image in (a); the human region keeps all of its original color information, while the background keeps only brightness information, without color, to reduce the data size.

FIG. 1(c) shows the human region represented with more bits than the background, and (d) shows the entire screen represented with few bits. In (d) the loss of quality is easy to notice, but in (b) and (c) the user does not perceive a serious loss even though a large amount of data has been removed. This is easy to understand given that the human region, which is the region that matters to the user, dominates perceived quality. In a real communication environment such quality loss can also appear as dropped frames or blocking artifacts.
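
The FIG. 1(b) idea, full color inside the ROI and brightness only in the background, can be sketched as follows. This is an illustrative sketch rather than the patent's encoder: the function names are hypothetical, the image is a plain list of rows of (R, G, B) tuples, and the brightness formula uses the ITU-R BT.601 luma weights, which the text does not specify.

```python
def luma(rgb):
    """Approximate brightness of an (R, G, B) pixel (BT.601 weights, an assumption)."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def keep_color_in_roi(image, roi_mask):
    """Keep full color where roi_mask is True; reduce other pixels to gray."""
    out = []
    for row, mask_row in zip(image, roi_mask):
        out_row = []
        for pixel, inside in zip(row, mask_row):
            if inside:
                out_row.append(pixel)          # human region: original color
            else:
                y = luma(pixel)
                out_row.append((y, y, y))      # background: brightness only
        out.append(out_row)
    return out
```

A real codec would instead drop or coarsely quantize the chroma of background blocks; the sketch only shows how an ROI mask selects between the two treatments.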

[Video mail editor]

The advent of camera-equipped mobile terminals also makes video mail services possible. Video mail, too, can be compressed to a smaller size for transmission using the ROI separation technique described above. Instead of having to give up either low charges or image quality, the user can keep a reasonable image quality while paying less. In addition, face and eye region separation enables various kinds of video mail editing. FIG. 2 shows an example in which the human region and the background are separated in the original image and a different background is composited behind the separated human region. FIG. 2(a) is the original image and (b) is the image with the background replaced.

The human region extraction method according to the present invention consists of extracting a face region, extracting a head region, and extracting a body region.

FIG. 3 shows the overall process of the human region detection method of the present invention. First, when an image is input in CIF or QCIF format, it is resized down to 44*36 pixels (S11, S12). An image of this size can be processed very quickly without affecting overall performance.

Next, the face region is extracted from the reduced image (S13, S14). This is done by extracting skin-color-based face candidate regions (S13) and then verifying the face region (S14). Once the face region has been extracted, the head region is extracted based on it (S15), and the body region is likewise extracted based on it (S16). The head and body extraction steps may be performed in either order.

The extracted face, head, and body regions are then combined into the human region, which goes through post-processing (S17, described in detail later) and is resized back to the original image size (S18), completing the extraction.
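
The flow of FIG. 3 can be sketched as a small driver. Everything here is an assumption made for illustration: the function names are hypothetical, images and masks are lists of rows, nearest-neighbor sampling stands in for whatever resizing the implementation uses, the three extractors are passed in as callables, and post-processing (S17) is omitted.

```python
def resize_nearest(grid, new_w, new_h):
    """Resize a 2-D grid (list of rows) by nearest-neighbor sampling."""
    old_h, old_w = len(grid), len(grid[0])
    return [
        [grid[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def union_masks(*masks):
    """Pixel-wise OR of equally sized boolean masks."""
    return [[any(px) for px in zip(*rows)] for rows in zip(*masks)]


def detect_human_region(image, extract_face, extract_head, extract_body,
                        small_w=44, small_h=36):
    """Reduce, extract face/head/body masks, unite them, restore the size."""
    orig_h, orig_w = len(image), len(image[0])
    small = resize_nearest(image, small_w, small_h)   # S12: reduce to 44*36
    face = extract_face(small)                        # S13, S14
    head = extract_head(small, face)                  # S15
    body = extract_body(small, face)                  # S16
    human = union_masks(face, head, body)             # union (S17 omitted)
    return resize_nearest(human, orig_w, orig_h)      # S18: restore size
```

Passing the extractors as arguments keeps the sketch independent of how each region is actually found, and reflects that the head and body steps depend only on the face result.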

The human region extraction process described so far is explained in more detail below, divided into the face region extraction method, the head region extraction method, and the body region extraction method.

1. Face region extraction method

Face region extraction is based on skin color region extraction, which checks whether the color value of each pixel falls within the range of human skin color. This is fast, but because illumination and capture devices shift colors in many ways, the skin color range has to be very broad. As a result, regions other than human skin are commonly mistaken for skin and extracted along with it. To solve this, the present invention treats skin-range pixels of mutually similar color as one color group, extracts a face candidate region for each color group, and from these extracts the final face region.

In this way, even if part of the background is mistaken for skin because of the broad skin color range, it can be distinguished from the actual face region. Even so, shadows and other color distortions can prevent some pixels inside the face from being extracted as skin pixels. For this reason, even when a face region is extracted from the skin color region, a step is needed to verify that the extracted skin-colored region really is a face.

A representative way to verify a face region is to prepare a face template in advance, compare candidate regions against it, and accept similar regions as faces. Verification generally takes a relatively long time when the skin color region does not point accurately at the face, so it is used to extract the initial region accurately; when extracting the face region in subsequent frames, verification is either skipped or restricted, for example by examining only the neighborhood of the previous region.

This face extraction process is shown in FIG. 4.

First, skin-colored pixels are extracted from the input image, a color histogram is computed over them, and the colors are grouped (S21, S22, S23). Color grouping means merging dominant colors that cluster together in the histogram into one group. FIG. 6 shows an example, in which similar colors that cluster together in the distribution given by the color histogram are assigned to one group. In the next step (S24), the regions formed by the pixels belonging to each color group are designated face candidate regions, and the face region is extracted through the face region verification process described above (S25, S26).
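
As a rough illustration of color grouping, the sketch below merges runs of adjacent, sufficiently populated histogram bins into groups. The grouping rule is an assumption chosen for simplicity: the histogram is taken to be one-dimensional, the function name and the min_count threshold are hypothetical, and the text at this point does not fix the actual rule.

```python
def group_histogram_bins(hist, min_count=1):
    """Group runs of adjacent bins whose count is at least min_count.

    Returns a list of groups, each a list of bin indices, ordered by the
    total pixel count of the group (largest first).
    """
    groups, current = [], []
    for index, count in enumerate(hist):
        if count >= min_count:
            current.append(index)      # extend the current run
        elif current:
            groups.append(current)     # a sparse bin closes the run
            current = []
    if current:
        groups.append(current)
    return sorted(groups, key=lambda g: sum(hist[i] for i in g), reverse=True)
```

Each returned group then defines one face candidate region: the set of pixels whose quantized color falls in any of the group's bins.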

Because the head and body regions are extracted from the result of face region extraction, extracting the face region well is very important. In particular, the head region is formed by collecting head candidate regions that neighbor the face region, so the shape of the face region must allow them to adjoin it properly. Face shape varies from person to person, and relying on the skin color region alone leaves the shape distorted by noise. A predefined face shape model is therefore used; the present invention applies the face model shown in FIG. 5, which is suited to head region extraction. That is, when the face region is found as a rectangle as in FIG. 5(b), the face region used for head and body extraction is derived from it, as shown in FIG. 5(b), using the model shown in FIG. 5(a). Accordingly, in FIG. 4 the extracted face region is first represented as the minimum bounding rectangle (MBR) that contains the face and is then converted to the predefined face model (S25, S26).
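
The MBR step can be sketched as below, assuming the face region is given as a boolean mask (a list of rows). The function name and the (top, left, bottom, right) return convention are hypothetical; fitting the face model of FIG. 5(a) inside the rectangle is not shown because its geometry is defined only in the figure.

```python
def minimum_bounding_rectangle(mask):
    """Return (top, left, bottom, right) of the True pixels, or None if empty."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [x for row in mask for x, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)
```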

FIG. 15 shows another embodiment of the face region detection method of the present invention. In this method, the color distribution is analyzed over only those pixels of the given image frame whose color lies in a predefined skin color range, the dominant skin color groups are found, and the color groups are designated face region candidates in descending order of importance. The skin color range is thereby applied adaptively to each image, which allows fast and effective face region extraction.

First, when an image is input, every pixel is tested to determine whether its color belongs to the skin color range (S51). The skin color condition is defined in the HMMD color space on the three axes Hue (h), Sum (s), and Diff (d) as follows.

Skin color condition in the HMMD color space:

if (((h > 300 && h < 360) || (h > 0 && h < 60)) && s > 100 && s < 480 && d > 5 && d < 100)
    skin color;
else
    not skin color;

A color that satisfies this condition is defined as skin color. HMMD is a color space, like RGB or YCrCb, in which distance within the space reflects well the color difference perceived by the eye. As shown in FIG. 18(a), the space has the shape of a double cone: the vertical axis is sum, representing brightness; the axis from the center to the outer edge is diff, representing color purity; and the angle around the circular cross-section of the double cone is hue, representing the color tone. FIG. 18(b) shows a cross-section of the color space. Conversion from the RGB color space to the HMMD color space is performed as follows.

max = MAX(R, G, B);
min = MIN(R, G, B);
if ((max - min) > 0)
{
    // compute hue
    if (R == max) H = (G - B) / (max - min);
    else
    {
        if (G == max) H = 2.0 + (B - R) / (max - min);
        else
        {
            if (B == max) H = 4.0 + (R - G) / (max - min);
        }
    }
    H *= 60;
    if (H < 0.0) H += 360;
}
else
    H = -1;    // hue is undefined for gray
hue = H;
sum = max + min;
diff = max - min;
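The conversion and the HMMD skin-color condition can be combined in a short Python sketch. The function names `rgb_to_hmmd` and `is_skin_hmmd` are illustrative and not taken from the patent, and the sketch assumes 8-bit RGB inputs and real-valued (not integer) division when computing hue.

```python
def rgb_to_hmmd(r, g, b):
    """Return (hue, sum, diff) for an RGB triple, following the listing above."""
    mx, mn = max(r, g, b), min(r, g, b)
    diff = mx - mn
    if diff > 0:
        if r == mx:
            h = (g - b) / diff
        elif g == mx:
            h = 2.0 + (b - r) / diff
        else:
            h = 4.0 + (r - g) / diff
        h *= 60.0
        if h < 0.0:
            h += 360.0
    else:
        h = -1.0  # hue is undefined for gray pixels
    return h, mx + mn, diff


def is_skin_hmmd(r, g, b):
    """Apply the HMMD skin-color condition to an RGB pixel."""
    h, s, d = rgb_to_hmmd(r, g, b)
    hue_ok = (300 < h < 360) or (0 < h < 60)
    return hue_ok and 100 < s < 480 and 5 < d < 100
```

For example, the pixel (200, 150, 120) maps to hue 22.5, sum 320, diff 80 and passes the test, while a pure gray pixel has hue -1 and fails it.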

Distances in the RGB color space do not reflect perceived color differences. The skin-color condition was therefore first defined in the HMMD color space, where distance does reflect actual color difference, and an equivalent decision condition was then defined in the RGB color space.

The equivalent RGB condition is needed because the input image, being a digital image, is natively represented in RGB; with an RGB condition, each pixel can be tested directly without conversion to another color space.

Defining the skin-color range directly in RGB is difficult, however, because of the property of the RGB color space noted above. The range is therefore set in the HMMD color space first and then converted into an equivalent condition in the RGB color space.

Because the HMMD color space can be mapped one-to-one in both directions to the RGB color space, the condition can be converted without changing its meaning. Expressed in the RGB color space, the HMMD skin-color condition becomes the following.

Skin-color condition defined in the RGB color space:

if (r > b && r > g)
{
    if (b > g)
    {
        if (r + g > 100 && r + g < 480 && r - g > 5 && r - g < 100)
            skin color;
        else
            not skin color;
    }
    else
    {
        if (r + b > 100 && r + b < 480 && r - b > 5 && r - b < 100)
            skin color;
        else
            not skin color;
    }
}
else
    not skin color;
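The RGB condition can be written compactly by noting that its two branches differ only in which of g and b is the minimum channel: when r is the strict maximum, sum is r plus the smaller of g and b, and diff is r minus it. The Python sketch below uses the illustrative name `is_skin_rgb`, which is not from the patent.

```python
def is_skin_rgb(r, g, b):
    """Skin-color test applied directly to RGB, with no hue computation."""
    if r > b and r > g:
        low = g if b > g else b  # the smaller of g and b is the minimum channel
        s, d = r + low, r - low  # these equal HMMD sum and diff when r is the maximum
        return 100 < s < 480 and 5 < d < 100
    return False
```

One boundary case is worth noting: when g equals b and r is the maximum, the hue is exactly 0, which the strict `h > 0` test of the HMMD condition rejects, whereas the RGB condition can accept the pixel. For all other inputs the two conditions agree, provided hue is computed with real-valued division.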

Applying this condition, a color histogram (hist) is built from the pixels judged to be skin color (S52). Building the histogram requires quantizing the color space; the HMMD color space was quantized into 72 levels as follows.

The hue axis is first divided into 6 equal parts, each of the 6 parts is divided into 4 along the diff axis, and each resulting part is divided into 3 along the sum axis, giving 6 * 4 * 3 = 72 levels. FIG. 19(a) shows the division into 6 along the hue axis, and FIG. 19(b) shows the division into 4 and 3 along the diff and sum axes, respectively. The number of divisions per axis was chosen experimentally: the distribution of hue, sum, and diff values within a single face was measured, and the granularity was set accordingly. For example, sum varies widely even within one face because of lighting, so it discriminates relatively poorly and is divided into only 3 parts, whereas hue has a narrow distribution within one face and is divided into 6 parts.
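As a rough illustration, the 72-level quantization could be implemented as below. The patent does not state the bin boundaries, so this sketch assumes equal-width bins spanning the skin-color ranges given earlier (hue 300 to 360 and 0 to 60, diff 5 to 100, sum 100 to 480); the names `quantize_hmmd` and `_level` are hypothetical.

```python
def _level(value, low, high, levels):
    """Index of the equal-width bin containing value, clamped to 0..levels-1."""
    idx = int((value - low) * levels / (high - low))
    return max(0, min(levels - 1, idx))


def quantize_hmmd(hue, s, d):
    """Map a skin-colored (hue, sum, diff) triple to a bin index in 0..71."""
    # Assumption: rotate hue by 60 degrees so the skin range becomes 0..120.
    h_idx = _level((hue + 60.0) % 360.0, 0.0, 120.0, 6)
    d_idx = _level(d, 5, 100, 4)
    s_idx = _level(s, 100, 480, 3)
    return (h_idx * 4 + d_idx) * 3 + s_idx
```

The index layout (hue outermost, then diff, then sum) mirrors the order of subdivision in the text but is otherwise arbitrary.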

Once the histogram has been extracted over all regions of the input image judged to be skin color, a second histogram (center_hist) is extracted from the central region of the image only. As shown in FIG. 16, the central region is a fixed area around the image center; for example, a sub-region centered on the image whose height is 66% of the image height and whose width is 42% of the image width. The center histogram is built separately so that, in the color grouping step described later, even the color with the largest share of the image is ranked below colors concentrated at the center if it is not itself concentrated there. Creating the color group concentrated at the center first is effective for merging the similar colors of the face region into one group, and it also makes it possible to identify which of the extracted color groups is most likely to be the face region.

For the same reason, if face region information extracted from a previous frame is available, the center histogram may be extracted from that previously extracted face region instead of the central sub-region.
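A minimal sketch of building the two histograms follows, assuming the image has already been reduced to a map of quantized color bins plus a skin mask. The names `center_region` and `histograms` are illustrative, and rounding the region size to the nearest pixel is an assumption.

```python
from collections import Counter


def center_region(width, height, w_frac=0.42, h_frac=0.66):
    """Return (left, top, right, bottom) of the centered sub-region."""
    rw, rh = round(width * w_frac), round(height * h_frac)
    left, top = (width - rw) // 2, (height - rh) // 2
    return left, top, left + rw, top + rh


def histograms(bin_map, skin_mask):
    """Count skin pixels per color bin over the whole image and over the center."""
    height, width = len(bin_map), len(bin_map[0])
    left, top, right, bottom = center_region(width, height)
    hist, center_hist = Counter(), Counter()
    for y in range(height):
        for x in range(width):
            if skin_mask[y][x]:
                hist[bin_map[y][x]] += 1
                if left <= x < right and top <= y < bottom:
                    center_hist[bin_map[y][x]] += 1
    return hist, center_hist
```

To use a face region from the previous frame instead, the rectangle returned by `center_region` would simply be replaced by that region's bounds.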

Once both the histogram (hist) and the center histogram (center_hist) have been obtained, color grouping is performed using the two histograms (S54). A specific grouping procedure is described in detail below.

Color grouping usually merges the 72 skin colors into 2 to 3 color groups. Of these, the color group with the highest importance is designated first as the face region candidate (S55), and the face region is finally extracted through a face verification step such as template matching. If no face is matched in the region belonging to the color group of highest importance, template matching is attempted in the region belonging to the color group of the next highest importance. In most cases, however, the face region is detected by template matching in the region of the first color group.
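The fallback across color groups amounts to a simple loop over the groups in order of importance. In the sketch below, `region_of` (which maps a color group to its image region) and `verify` (the face verification step, for example template matching) are hypothetical callables supplied by the caller; the patent does not define their interfaces.

```python
def find_face(groups, region_of, verify):
    """Try color groups in order of importance; return (rank, region) or None."""
    for rank, group in enumerate(groups, start=1):
        region = region_of(group)
        if verify(region):
            return rank, region
    return None
```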

FIG. 17 shows in detail the color grouping process (S54) performed with the two histograms (hist, center_hist).

First, it is determined whether color grouping is being attempted for the first time (S61). If so, the bin with the largest value in the center histogram (center_hist), denoted bin_max, is found (S62).

If the value of bin_max, V(bin_max), is greater than a threshold (Th1), the bins whose colors are similar to the color of bin_max, C(bin_max), are collected into set A (S63, S64, S65); otherwise the process ends.

In the next step (S66), the values in the histogram (hist) of all bins belonging to set A are summed, and the sum (Portion) is compared with a threshold (Th2) (S67). If Portion is greater than Th2, all bins in set A are assigned to a single color group, which is designated the color group of importance 1 (S68), and the bins in set A are then removed from the histogram (hist) (S69). If Portion is smaller than Th2, only bin_max is removed from the histogram and the process returns to the first step (S61).

Once this process has run once, subsequent passes start from step S63: the bin with the largest value, bin_max, is found in the histogram (hist), and the subsequent steps (S64-S69) are repeated.
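The grouping loop of FIG. 17 might be sketched as follows. Several details are assumptions rather than statements from the patent: the similarity test is passed in as a caller-supplied function `similar`, only the very first pass takes bin_max from center_hist, groups found on later passes are ranked in the order they are found, and bin_max is always included in its own set A so that every pass removes at least one bin.

```python
def group_colors(hist, center_hist, similar, th1, th2):
    """Return color groups (lists of bins) in descending order of importance."""
    remaining = dict(hist)
    groups = []
    first_attempt = True
    while remaining:
        # First pass: pick bin_max from the center histogram; later: from hist.
        source = center_hist if first_attempt else remaining
        first_attempt = False
        candidates = {b: v for b, v in source.items() if b in remaining}
        if not candidates:
            break
        bin_max = max(candidates, key=candidates.get)
        if candidates[bin_max] <= th1:
            break  # no sufficiently populated bin is left
        # Set A: bins whose color is similar to that of bin_max.
        set_a = [b for b in remaining if b == bin_max or similar(b, bin_max)]
        portion = sum(remaining[b] for b in set_a)
        if portion > th2:
            groups.append(set_a)
            for b in set_a:
                del remaining[b]
        else:
            del remaining[bin_max]
    return groups
```

With bins represented as (hue, diff, sum) index triples, one possible `similar` treats two bins as similar when they differ by at most one step on every axis.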

FIG. 20 shows that the proposed color grouping method defines a face region made up of several colors as a single color group. FIG. 20(a) is the original image, and (b) is an example in which the face region is extracted using the entire skin-color range, as in conventional methods. In this example no skin-colored area exists outside the face, so the face region is extracted relatively accurately. However, because the skin-color range is quantized into 72 colors, the same face region is composed of several skin colors, as shown in (c). In (d), by contrast, the color grouping process represents the face as a single skin-color group, so the face region is easily extracted.

FIG. 21 shows an example in which the proposed method effectively separates only the face region even when areas other than the face are skin colored. (a) is the original image, and (b) is an example in which the face region is extracted using the entire skin-color range, as in conventional methods. As the figure shows, a large area adjacent to the face is also classified as skin color, which makes face region extraction difficult. (c) and (d) show that the present invention separates the face region from the background region accurately even though both fall within the same skin-color range.

As FIGS. 20 and 21 show, a face region that looks relatively accurate to the eye is extracted using the skin-color group region. The small noise regions captured around the face are not a problem for a human viewer, but they must be removed before a computer can locate the exact region. In the present invention, the image representing the skin-color group region is converted into a grid image composed of N*M grid cells, the skin-color cells are linked into connected components, and only the largest component is kept, which removes the noise regions naturally.

To form the N*M grid, the image is divided into 8*8-pixel cells when the original image is CIF (352*288) and into 4*4-pixel cells when it is QCIF (176*144). A cell is defined as a skin-color cell when the density of skin-color pixels inside it exceeds a threshold. As a result, regardless of the original image size, the actual processing cost is that of a 44*36 image.
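The grid conversion and the largest-component filter can be sketched in plain Python as below. The density threshold of 0.5 and the use of 4-connectivity are assumptions, since the patent specifies neither, and the function names are illustrative.

```python
from collections import deque


def to_grid(mask, cell, density=0.5):
    """Mark a grid cell True when more than `density` of its pixels are skin."""
    rows, cols = len(mask) // cell, len(mask[0]) // cell
    grid = []
    for gy in range(rows):
        row = []
        for gx in range(cols):
            count = sum(mask[gy * cell + dy][gx * cell + dx]
                        for dy in range(cell) for dx in range(cell))
            row.append(count > density * cell * cell)
        grid.append(row)
    return grid


def largest_component(grid):
    """Return the set of (row, col) cells in the largest 4-connected component."""
    rows, cols = len(grid), len(grid[0])
    seen, best = set(), set()
    for sy in range(rows):
        for sx in range(cols):
            if not grid[sy][sx] or (sy, sx) in seen:
                continue
            comp, queue = set(), deque([(sy, sx)])
            seen.add((sy, sx))
            while queue:
                y, x = queue.popleft()
                comp.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            if len(comp) > len(best):
                best = comp
    return best
```

A CIF mask with `cell=8` and a QCIF mask with `cell=4` both produce a grid of 36 rows by 44 columns, which is why the later steps cost the same for either input size.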

The method described so far performs skin-color grouping on the original image and then divides the image into blocks to find the connected components from which the face region is extracted. As another embodiment, the original image may first be reduced to 44*36, skin-color grouping performed on the reduced image, and connected components found directly on that result. The two methods may give slightly different results, but both perform adequately in a video communication environment, where the face usually occupies a large part of the frame. In terms of processing speed, the latter method is preferable.

2. Head region extraction method

The head region of the person is extracted using the extracted face region. Head region extraction uses three kinds of information: color, edge, and shape. How each is used, and why, is described below.

2.1. Color information

Human hair color varies widely, not only by ethnicity but also by individual traits and preferences, so the head region cannot be extracted by designating one specific color as the hair color. Hair shape also varies, so a fixed area relative to the face region cannot simply be cut out and called the head region. In the present invention, the average color is taken from a sub-region that is highly likely to be hair given the face region (the hair reference region), this average is designated the hair reference color, and regions whose color matches it are designated head region candidates. This approach copes with the variety of hair colors.

Given the face region, the hair reference region is defined as shown in FIG. 7. In FIG. 7(a), the average color is taken from two sub-regions, one on each side of the top of the face region. Using both sides reduces the risk of deriving a wrong hair reference color because of noise. If the top of the face touches the edge of the frame, most of the hair reference region falls outside the frame and the hair reference color cannot be obtained as in FIG. 7(a). In that case the hair reference region is taken at both sides of the face region, as shown in FIG. 7(b). Whether the top of the face touches the edge of the frame is easily determined from the coordinates of the face region.

How well the average color of the hair reference region works depends on the color space used. As described above, the present invention uses the HMMD color space, which expresses hue, sum, and diff.

With pixel values expressed in the HMMD color space, the predefined broad hair-color range is: hair color = (sum < 150 AND diff < 30) OR (sum < 230 AND diff < 15). In the HMMD color space, sum represents brightness and diff represents color purity. When purity is low, that is, when the color is close to gray, a pixel is likely to be hair even if it is fairly bright, because hair can shine brightly under lighting. Therefore, when diff is 15 or less, any color whose sum is 230 or less is treated as hair color. When diff is higher than 15 but lower than 30, a color is treated as hair color only if sum is 150 or less, because a color of moderately high purity that is also bright looks chromatic rather than gray and is unlikely to be hair. As the definition shows, the broad hair-color range places no constraint on hue: any color that is not vivid is considered a hair-color candidate regardless of its hue. This also speeds up processing, because hue, the most time-consuming component to compute when converting an RGB input image to HMMD, does not need to be calculated.
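Because the hair-color range depends only on sum and diff, it can be evaluated directly from the maximum and minimum RGB channels. The sketch below uses the strict inequalities of the stated formula; the function name `is_hair_color` is illustrative.

```python
def is_hair_color(r, g, b):
    """Broad hair-color test using only HMMD sum and diff (no hue needed)."""
    mx, mn = max(r, g, b), min(r, g, b)
    s, d = mx + mn, mx - mn
    return (s < 150 and d < 30) or (s < 230 and d < 15)
```

A dark, slightly warm pixel such as (30, 25, 20) passes, as does a brighter near-gray pixel such as (118, 110, 105), while a moderately saturated pixel of similar brightness such as (100, 80, 72) does not.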

The method described above gives good results in most cases, but when the face region is inaccurate, or in exceptional cases such as baldness, the hair reference region may include non-hair areas and a wrong candidate region may be extracted. To address this, the hair-color range is used to improve accuracy in the common case: a broad hair-color range is specified in advance, and a pixel in the hair reference region contributes to the average color only if it falls within that range. This approach cannot handle unusual hair colors, but the range is broad enough to suit most cases, and the range check can be disabled for exceptional cases. Its use is therefore optional: it gives higher accuracy in the common case, and a less accurate but still workable result in exceptional cases.

2.2. Edge information

In mobile communication, images are often captured outdoors. When an outdoor background such as a forest or trees lies behind the head, it often cannot be distinguished from the hair by color alone. In such cases, edge information is used to separate the head region from the background region: pixels at which an edge is detected are excluded from the head candidate region obtained above. In general, no edges are produced inside the hair, whereas edges are detected at the boundary between the head and the background and within the background. Excluding edge pixels therefore removes part of the background that was mistaken for hair color and, at the same time, removes most of the boundary between the head region and the background region.

Next, connected components are found, grouping the connected pixels of the remaining regions into single regions. Because the boundary between the head region and the background region has been erased by the edge removal, the head region and the background region are labeled as different components. Keeping only the connected components that touch the extracted face as the head region therefore removes most of the background. FIG. 8 shows an example of this process.

In FIG. 8, connected components in gray (81) and in a darker color (82) have been extracted around the face. The gray regions (81) are areas mistaken for the head region because of the background or a similar hair color, and the darker region (82) is the actual head region. Because the edge pixels have been removed, the head region is separated from the background region along most of the boundary, as FIG. 8 shows. A few connecting pixels may exceptionally remain, as in the region (83) marked at the upper left of the head, but such noise can be removed with a morphological operation such as opening.

Morphology is a general technique widely used in image processing and is described briefly later.

In FIG. 8, when everything except the connected components adjacent to the face region is erased, only the actual head region remains and the wrongly extracted background is excluded.

Various filters can be used to find the edges; the present invention uses the 2*2 Roberts filter. The Roberts filter is given by Equation 1 below. It is applied to the image after conversion to grayscale, and an edge is considered present only where the response exceeds a threshold of 25.
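As an illustration of the thresholded edge test, the sketch below uses the standard Roberts cross operator with the common |gx| + |gy| approximation of the gradient magnitude. This is an assumption: the exact form of the patent's Equation 1 is not reproduced in this text and may differ. The function name `roberts_edges` is illustrative.

```python
def roberts_edges(gray, threshold=25):
    """Return a boolean edge map from a 2-D list of gray levels."""
    height, width = len(gray), len(gray[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height - 1):
        for x in range(width - 1):
            gx = gray[y][x] - gray[y + 1][x + 1]      # main diagonal difference
            gy = gray[y][x + 1] - gray[y + 1][x]      # anti-diagonal difference
            edges[y][x] = abs(gx) + abs(gy) > threshold
    return edges
```

The last row and column have no complete 2*2 neighborhood and are left as non-edge in this sketch.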

2.3. Shape information

All of the head region extraction steps described above take place inside a predefined shape frame positioned relative to the face region. FIG. 9 shows an example of the shape frame. When the face region has been extracted, the shape frame is rescaled to the corresponding size and aspect ratio, and the head region extraction is carried out inside the rescaled frame. Even if a processing error causes background to be mistaken for the head region, the frame forcibly excludes areas far from the head and so prevents the worst case. Restricting processing to a limited area also reduces unnecessary processing time. The shape frame is defined to be much larger than a typical head region, so that the actual head region is not cut off by the frame.

The methods of extracting the head region using color, edge, and shape information have each been described above.

FIG. 10 shows the overall process of extracting the head region, combining the steps described so far. The input image has already been resized in the overall process described earlier, so it is a 44*36 image (CIF/8 = 44*36), and the face region has already been extracted.

First, the head candidate region is defined by applying the head frame to the previously extracted face region (S31). All steps from S32 onward operate only on the candidate region bounded by the head frame. Among these pixels, those falling within the predefined hair-color range are collected as set H (S32). Next, the pixels at which the Roberts filter detects an edge are collected as set E (S33). From these two sets, the set A (= H ∩ E^c) of pixels that have hair color and no edge is obtained (S34). Next, the hair reference region is determined from the face region (S35), and the hair reference color is obtained by averaging the colors of the pixels in the hair reference region that fall within the hair-color range (S36). Only the pixels of set A whose color is similar to the hair reference color are kept, forming the set Hair (S37). Finally, connected components are found in the set Hair, and only the components adjacent to the face region are judged to be the head region, which completes the process (S38, S39). The order of the steps may be changed as convenient.
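Steps S32 to S39 can be sketched with Python sets of pixel coordinates. Everything about the interface is an assumption made for illustration: `colors` maps (row, col) to an RGB triple, `frame`, `edges`, `face`, and `ref_region` are coordinate sets computed elsewhere, color similarity is a summed absolute RGB difference with a hypothetical tolerance `tol`, and adjacency means 4-neighborhood.

```python
def is_hair_color(rgb):
    s, d = max(rgb) + min(rgb), max(rgb) - min(rgb)
    return (s < 150 and d < 30) or (s < 230 and d < 15)


def neighbors(p):
    y, x = p
    return ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))


def extract_head(colors, frame, edges, face, ref_region, tol=40):
    set_h = {p for p in frame if is_hair_color(colors[p])}      # S32
    set_a = set_h - edges                                       # S34: H and not E
    ref = [colors[p] for p in ref_region if is_hair_color(colors[p])]
    if not ref:
        return set()                                            # no usable reference
    ref_color = [sum(c[i] for c in ref) / len(ref) for i in range(3)]  # S36
    hair = {p for p in set_a                                    # S37
            if sum(abs(colors[p][i] - ref_color[i]) for i in range(3)) <= tol}
    head, seen = set(), set()
    for start in hair:                                          # S38: components
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            p = stack.pop()
            comp.add(p)
            for q in neighbors(p):
                if q in hair and q not in seen:
                    seen.add(q)
                    stack.append(q)
        if any(q in face for p in comp for q in neighbors(p)):  # S39: touches face
            head |= comp
    return head
```

In a small test image, removing a single edge pixel between the hair and a dark background patch is enough to split them into separate components, so that only the component touching the face is kept.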

3. Body region extraction method

Once the face and head regions have been extracted as described above, the body region is extracted relative to the extracted face region. Unlike the other regions, the body region is hard to extract accurately because the kinds and shapes of clothing people wear vary so much. Since the aim of the present invention is a practical technique for real applications, the body region is extracted automatically as far as possible when the given image allows it and is otherwise extracted forcibly using a predefined body frame. Because the body region is used for the bit-rate control described above, failing to extract the exact region is not a serious problem; moreover, in mobile communication the body rarely appears in the frame, so this is not a major issue in practice.

Body region extraction uses two kinds of information: color and shape.

3.1. Body region extraction using color information

A body reference region is designated relative to the extracted face region. The body reference region is placed where the user's clothing is most likely to be: as shown in FIG. 11, when lines are dropped from both sides of the face region to the bottom of the frame, the two regions adjoining those lines, one on the left and one on the right, are taken. As in head region extraction, the average color of the body reference region is taken as the body reference color, and the region made up of pixels with a color similar to the body reference color is designated the body region.

3.2. Body region extraction using shape information

Once the face region is determined, the approximate region occupied by the body can be predicted. The body frame is a candidate region defined to be wider than this approximate region, and it is specified as shown in FIG. 12. Given a face region, the frame is scaled to the face size and positioned below the face. The color-based processing described above is applied with the area bounded by this frame as the body candidate region. Because the color-based processing is confined to the body frame, background whose color is similar to the user's clothing is excluded when it lies outside the frame.

Besides clothing, the body region includes parts of the user's body where skin is exposed, such as the neck and arms. If the body region were extracted using the body reference color alone, the neck and arms could be left out. Within the body frame, therefore, regions with skin color must be included in the body region along with regions whose color is similar to the body reference color. In other words, the body region is extracted by designating as body every region inside the body frame whose color is either similar to the body reference color or skin color.

Because users may wear many kinds of clothes, taking only colors similar to the body reference color may fail to extract the actual body region. For example, the body region can be extracted properly when the user wears single-colored clothes, but not when hair hangs down into the body region or when the clothes consist of several colors. Therefore, if the body reference region contains varied colors rather than one consistent color, the method described above is considered inapplicable, and the region defined by the body frame is forcibly separated and designated as the body region.
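The fallback described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the uniformity measure (largest per-channel range), the `max_spread` value, and the function names are assumptions introduced here, since the text does not specify how "varied colors" is measured.

```python
def channel_spread(pixels):
    """Largest per-channel (max - min) range over a list of RGB tuples.
    Assumed uniformity measure; the source does not define one."""
    return max(
        max(p[i] for p in pixels) - min(p[i] for p in pixels)
        for i in range(3)
    )


def choose_body_region(frame_coords, color_based_region, ref_pixels, max_spread=80):
    """Return the color-based body region when the body reference region is
    roughly one color; otherwise fall back to the whole body frame.
    max_spread is a hypothetical threshold."""
    if channel_spread(ref_pixels) > max_spread:
        return set(frame_coords)       # forced separation by the body frame
    return set(color_based_region)     # normal color-based result


frame = [(0, 0), (0, 1), (1, 0), (1, 1)]
color_region = [(0, 0), (1, 0)]
plain_shirt = [(100, 100, 100), (105, 98, 102)]
patterned_shirt = [(0, 0, 0), (255, 255, 255)]
print(choose_body_region(frame, color_region, plain_shirt))
print(choose_body_region(frame, color_region, patterned_shirt))
```

With the plain reference pixels the color-based region is kept; with the patterned ones the whole frame is returned.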

FIG. 13 shows the body region extraction process described so far. First, the body reference region described with FIG. 11 is designated using the extracted face region (S41), and the body reference color is obtained by averaging the colors in that region (S42). Next, the body frame described with FIG. 12 is designated using the extracted face region (S43), and the pixels inside the body frame whose color is similar to the body reference color are collected as the set Body (S44). Then the pixels inside the body frame that have skin color are collected as the set Skin (S45). Finally, the union of the two sets Body and Skin is designated as the body region, which completes the body region extraction (S46).
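Steps S42 and S44 to S46 can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the L1 color distance, the `threshold` value, the coordinate-list inputs, and the `is_skin` callback are all hypothetical choices made here, and steps S41 and S43 (locating the reference region and the frame from the face) are taken as given inputs.

```python
def mean_color(pixels):
    """Average RGB color of a non-empty list of (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))


def color_distance(a, b):
    """L1 distance between two colors (an assumed similarity measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def extract_body(image, ref_coords, frame_coords, is_skin, threshold=60):
    """image[y][x] is an (r, g, b) tuple; coords are (y, x) pairs."""
    ref_color = mean_color([image[y][x] for y, x in ref_coords])        # S42
    body = {(y, x) for y, x in frame_coords
            if color_distance(image[y][x], ref_color) <= threshold}      # S44
    skin = {(y, x) for y, x in frame_coords if is_skin(image[y][x])}     # S45
    return body | skin                                                   # S46


image = [
    [(200, 0, 0), (200, 10, 0), (0, 0, 255)],
    [(210, 160, 140), (0, 255, 0), (205, 5, 5)],
]
ref = [(0, 0), (0, 1)]                               # clothing samples
frame = [(y, x) for y in range(2) for x in range(3)]
region = extract_body(image, ref, frame, lambda p: p == (210, 160, 140))
print(sorted(region))
```

The skin-colored pixel at (1, 0) is far from the clothing color but is still included through the Skin set, which is the point of the union in S46.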

4. Post-processing

After the face region extraction, head region extraction, and body region extraction described above, the entire body region is obtained through post-processing. Post-processing consists of a step that smooths noisy boundary areas and a hole-filling step that removes holes in the extracted body region.

Because the extracted human region is built pixel by pixel, its boundary is not smooth and may contain noise such as small protruding branches or indentations. Simple morphology is applied to remove this noise. First, dilation is performed with an element of size A, which fills and smooths indentations smaller than A. Next, erosion is performed with an element of size B, which removes protruding twigs smaller than B. In the present invention, size A is set slightly larger than size B. Dilation removes noise by expanding the region and erosion removes noise by shrinking it, so after both morphology operations the region is defined larger than the originally extracted region. The region is deliberately made larger because, in the bit-rate control application, mistaking an important region for background is a more serious error than mistaking background for an important region. After the two morphology operations, holes in the extracted region are removed.
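The post-processing sequence (dilation with size A, erosion with size B where A is larger than B, then hole filling) can be sketched on a binary mask as below. This is a sketch under assumptions: square structuring elements given by a radius, the values `a=2` and `b=1`, and flood-fill hole detection from the image border are illustrative choices, not details stated in the text.

```python
from collections import deque


def dilate(mask, r):
    """A pixel becomes 1 if any pixel within a (2r+1)-square window is 1."""
    h, w = len(mask), len(mask[0])
    return [[int(any(mask[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]


def erode(mask, r):
    """A pixel stays 1 only if every in-bounds pixel in the window is 1."""
    h, w = len(mask), len(mask[0])
    return [[int(all(mask[yy][xx]
                     for yy in range(max(0, y - r), min(h, y + r + 1))
                     for xx in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]


def fill_holes(mask):
    """Set to 1 every background pixel not reachable from the image border."""
    h, w = len(mask), len(mask[0])
    outside = [[False] * w for _ in range(h)]
    queue = deque((y, x) for y in range(h) for x in range(w)
                  if (y in (0, h - 1) or x in (0, w - 1)) and not mask[y][x])
    for y, x in queue:
        outside[y][x] = True
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and not mask[ny][nx] and not outside[ny][nx]:
                outside[ny][nx] = True
                queue.append((ny, nx))
    return [[int(mask[y][x] or not outside[y][x]) for x in range(w)]
            for y in range(h)]


def postprocess(mask, a=2, b=1):
    """Dilate by a, erode by b (a > b, so the region grows), then fill holes."""
    return fill_holes(erode(dilate(mask, a), b))


dot = [[0] * 7 for _ in range(7)]
dot[3][3] = 1
ring = [[int(1 <= y <= 3 and 1 <= x <= 3 and (y, x) != (2, 2)) for x in range(5)]
        for y in range(5)]
print(sum(map(sum, postprocess(dot))), fill_holes(ring)[2][2])
```

Because the dilation radius exceeds the erosion radius, the single-pixel region in the example grows rather than returning to its original size, matching the text's preference for over-inclusion.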

[Improvement of Processing Speed]

The human region extraction method described so far reduces each input image to 44*36 before processing, which makes fast processing possible. Video communication data generally uses CIF (352*288) or QCIF (176*144) frames. Since the present invention reduces both to 44*36, a CIF image is reduced to 1/8 in width and height, and a QCIF image to 1/4. Processing time and algorithmic accuracy generally trade off against each other. Because it is difficult to satisfy both, one of them is usually prioritized according to the nature of the problem. The present invention aims to satisfy both to a reasonable degree, and the performance loss and speed gain caused by reduction were measured for each processing step to reach an overall balance.

In the face, head, and body region extraction processes that make up the main steps of the present invention, the basic unit algorithms that consume most of the processing time are morphology operations, color-based algorithms such as skin color region extraction, and edge-based algorithms. Ranked by the processing time they require, the order is: morphology algorithm > edge-based algorithm > color-based algorithm.

To reduce their processing time, the following three approaches can be considered.

The first is to use the original image as it is. This approach processes the image pixel by pixel at its original size and is undesirable because it takes too much processing time.

The second is to use a grid. The given image is divided into n*m grid cells, edges and skin color are computed for each cell, and each cell is then judged by its average color and edge content. In this case the basic color pixels and edge pixels are still extracted from the original image, but the steps that apply and evaluate those pixels operate per grid cell, so those steps take about as long as they would on a reduced image. Since the most time-consuming algorithm, morphology, runs on the grid, a great deal of time is saved there. However, the color-based and edge-based algorithms still require their original processing time.
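The grid approach can be illustrated with a small sketch that summarizes a full-resolution binary mask (for example, skin color pixels) into one flag per grid cell. The `min_fraction` threshold and the function name are assumptions made for illustration; the text only says each cell is judged by its average content.

```python
def grid_flags(mask, rows, cols, min_fraction=0.5):
    """Divide mask (list of rows of 0/1) into rows*cols cells and flag each
    cell whose share of set pixels is at least min_fraction.
    Assumes rows <= height and cols <= width so no cell is empty."""
    h, w = len(mask), len(mask[0])
    flags = []
    for gy in range(rows):
        y0, y1 = gy * h // rows, (gy + 1) * h // rows
        row = []
        for gx in range(cols):
            x0, x1 = gx * w // cols, (gx + 1) * w // cols
            count = sum(mask[y][x] for y in range(y0, y1) for x in range(x0, x1))
            row.append(count >= min_fraction * (y1 - y0) * (x1 - x0))
        flags.append(row)
    return flags


mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
print(grid_flags(mask, 2, 2))
```

Note that the per-pixel mask still has to be computed at full resolution, which is why this approach does not speed up the color-based or edge-based steps.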

The third is to reduce the image, process it, and then enlarge the result. The given image is reduced to n*m and all processing is performed on the reduced image. Because every algorithm works on the reduced image, processing is fast. However, the smaller the image, the greater the effect a one-pixel error has on the whole image, so this approach is sensitive to noise. The key question is therefore how far the image can be reduced, and the effect of reduction may differ for each unit algorithm.
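The reduce, process, and enlarge cycle can be sketched with nearest-neighbor resampling. The choice of nearest-neighbor sampling and the name `resize_nearest` are assumptions for illustration; the text does not say which resampling method is used.

```python
def resize_nearest(grid, out_h, out_w):
    """Resample a 2-D list to out_h x out_w by nearest-neighbor lookup.
    Works for both reduction and enlargement."""
    h, w = len(grid), len(grid[0])
    return [[grid[y * h // out_h][x * w // out_w] for x in range(out_w)]
            for y in range(out_h)]


# Reduce a CIF-sized frame (352 wide, 288 high) to 44 wide, 36 high.
cif = [[(x + y) % 256 for x in range(352)] for y in range(288)]
small = resize_nearest(cif, 36, 44)

# Pretend processing marked one reduced pixel, then restore to CIF size.
mask = [[0] * 44 for _ in range(36)]
mask[3][5] = 1
restored_cif = resize_nearest(mask, 288, 352)
restored_qcif = resize_nearest(mask, 144, 176)
print(len(small), len(small[0]), sum(map(sum, restored_cif)), sum(map(sum, restored_qcif)))
```

One reduced pixel expands to an 8*8 block at CIF size and a 4*4 block at QCIF size, which is why a one-pixel error in the reduced image matters more than it would at full resolution.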

For each of these three algorithms, the performance loss and speed gain from processing a reduced image instead of using the grid method described above were as follows: the morphology algorithm showed 1% performance loss and 150% speed improvement, the color-based algorithm showed 1% performance loss and 50% speed improvement, and the edge-based algorithm showed 2% performance loss and 100% speed improvement.

These results show that the speed gain for each of the three algorithms is large while the performance loss is very small. The edge-based algorithm loses relatively more performance to noise when the image is reduced, but its absolute loss is still not large. In particular, for the bit-rate control application an error of 1 to 2% matters less than processing time, so the present invention adopts the method of reducing the image before processing.

The reduced size was chosen by experiment: 44*36 was found to be the smallest size that is a multiple of 4 and still causes no problems in processing video communication data. The size is kept a multiple of 4 to allow optimization in the actual processing, which uses registers and similar resources.

Examples of results obtained with the human region extraction method described so far are shown in FIG. 14 (a) to (e). In each example of FIG. 14, the left image is the original image, and the rectangle drawn on it is the MBR of the automatically extracted face region. The right image is the extracted human body image. The extracted region appears in block units because it was processed at 44*36 and then enlarged again.

In FIG. 17, (a) is an example with background similar in color to the face, (b) is an example with background similar in color to the hair, (c) is an example in which the body includes arms as well as clothes, (d) is an example of processing a low-quality image, and (e) is an example in which hair falls inside the body reference region and the body is separated incorrectly.

The present invention provides a method for automatically separating the user from the background in video communication in a mobile camera environment. By separating the user region quickly enough for real-time processing, it enables object-based applications of video communication data.

In particular, the present invention can serve as a core technology for mobile communication applications such as bit-rate control that adapts to network load and automatic video mail editing, and it addresses both processing time and accuracy so that practical use is possible.

The present invention combines accurate face region extraction, relatively accurate head region extraction, and approximate body region extraction, so that appropriately separated regions can be used according to the type of application. This approach is particularly suitable for video communication data, in which the face occupies most of the frame.

In the present invention, object separation is performed in units of 8*8 blocks for CIF images and 4*4 blocks for QCIF images, so it is well suited to bit-rate control or object-based coding performed in units of 16*16 macroblocks.
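The relationship between the reduced mask and 16*16 macroblocks can be sketched as follows. The rule that a macroblock counts as important when any reduced pixel inside it is set, and the function name, are assumptions made here for illustration; the text only states that the block sizes fit macroblock-based processing.

```python
def macroblock_map(mask, scale, mb_size=16):
    """mask is the 36-row by 44-column reduced mask; scale is 8 for CIF or
    4 for QCIF. Each macroblock covers (mb_size // scale) reduced pixels per
    side and is marked True if any of them is set (assumed rule)."""
    per = mb_size // scale
    h, w = len(mask), len(mask[0])
    return [[any(mask[y][x]
                 for y in range(my, min(my + per, h))
                 for x in range(mx, min(mx + per, w)))
             for mx in range(0, w, per)]
            for my in range(0, h, per)]


mask = [[0] * 44 for _ in range(36)]
mask[3][5] = 1
cif_map = macroblock_map(mask, scale=8)    # 2x2 reduced pixels per macroblock
qcif_map = macroblock_map(mask, scale=4)   # 4x4 reduced pixels per macroblock
print(len(cif_map), len(cif_map[0]), len(qcif_map), len(qcif_map[0]))
```

For CIF this yields an 18-row by 22-column macroblock map, and for QCIF a 9-row by 11-column map, matching 288/16 by 352/16 and 144/16 by 176/16.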

In addition, the present invention extracts the face region on the basis of color, and within the skin color candidate regions of a given image it separates the actual person from background that has a similar color. This addresses the wide variation of skin color from image to image and at the same time separates the skin color regions within one image by object. Furthermore, to extract the face accurately in video communication when several skin color regions appear, regions are first separated using the skin color found where a face is most likely to appear, so that even a face smaller than the skin-colored background can be extracted accurately on the basis of color.
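The skin color grouping that this separation relies on is recited in the claims as repeated grouping around the largest histogram bin. A minimal sketch of that loop is given below. The L1 distance between representative colors (with a smaller distance treated as higher similarity), the dictionary-based histogram, and the parameter names are assumptions introduced for illustration, and the variant that seeds the first group from the center histogram is not shown.

```python
def group_bins(hist, rep_colors, max_distance, min_count):
    """hist maps bin id -> pixel count; rep_colors maps bin id -> color tuple.
    Repeatedly take the largest remaining bin, gather bins whose representative
    color is within max_distance of it, and keep the group only if its total
    count reaches min_count."""
    remaining = {b for b, n in hist.items() if n > 0}
    groups = []
    while remaining:
        # sorted() makes tie-breaking deterministic (lowest bin id wins).
        b_max = max(sorted(remaining), key=lambda b: hist[b])
        group = {b for b in remaining
                 if sum(abs(p - q) for p, q in zip(rep_colors[b], rep_colors[b_max]))
                 <= max_distance}
        remaining -= group
        if sum(hist[b] for b in group) >= min_count:
            groups.append(sorted(group))
    return groups


hist = {0: 50, 1: 30, 2: 5, 3: 40}
rep_colors = {0: (10,), 1: (12,), 2: (100,), 3: (50,)}
print(group_bins(hist, rep_colors, max_distance=5, min_count=20))
```

In the example, bins 0 and 1 merge into one group, bin 3 forms its own group, and bin 2 is dropped because its count is below the minimum.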

Claims

1. A human region extraction method comprising: extracting a face region from an image; extracting a head region based on the extracted face region; extracting a body region based on the extracted face region; and designating the union of the extracted face region, head region, and body region as a human region.

2. The method of claim 1, further comprising reducing the original image before all of the steps, and restoring the result to the original image size after all of the steps have been performed on the reduced image.

3. The method of claim 1, wherein the face region extraction comprises: obtaining pixels in the image that fall within a skin color range; analyzing the distribution of the skin color pixels in color space and grouping the skin colors; determining whether the region formed by each skin color group is a face region; and representing the identified face region with a predefined face model.

4. The method of claim 1, further comprising a post-processing step comprising morphology processing and hole removal processing.

5. A head region extraction method comprising: extracting a face region from an image; setting a head reference region based on the extracted face region; obtaining a head reference color from the set head reference region; obtaining a region having a color similar to the obtained head reference color; excluding areas where edges occur from the region having a color similar to the head reference color; and designating, among the regions remaining after the edge areas are excluded, the region adjacent to the face as the head region.

6. The method of claim 5, wherein the head reference color is obtained as the average color over only those pixels in the head reference region that fall within a predefined hair color range.

7. The method of claim 5, wherein the head reference region is designated, based on the extracted face region, as partial regions at the upper left and upper right of the face region, or as a partial region above the face region spanning from its left end to its right end.

8. The method of claim 5, wherein designating the head region comprises obtaining connected components of the remaining pixels and designating only the components adjacent to the face as the head region.

9. The method of claim 8, wherein, before only the components adjacent to the face are designated as the head region, morphology processing is performed first to resolve cases in which a region other than the head is connected to the head by noise.

10. The method of claim 5, further comprising designating a head frame based on the extracted face region, wherein all of the processing is restricted to pixels inside the head frame.

11. The method of claim 5, wherein the face region extraction comprises obtaining an MBR containing the face and using a face model whose size and proportions fit the MBR.

12. A body region extraction method comprising: extracting a face region from an image; designating a body frame based on the extracted face region; setting a body reference region based on the extracted face region; obtaining a body reference color from the body reference region; obtaining, inside the designated body frame, a region having a color similar to the body reference color; and obtaining, inside the body frame, a region having skin color.

13. The method of claim 12, wherein the body frame is a body shape model that defines a body region larger than the typical body region for a given face region, and wherein, once the face region is extracted, the body frame is enlarged or reduced in proportion to the size of the face region and positioned according to its location.

14. The method of claim 12, wherein the body reference region is set, along the horizontal axis, to a predetermined extent proportional to the size of the face region starting from the left and right end coordinates of the face region, and, along the vertical axis, to the extent from a coordinate a certain distance below the bottom coordinate of the face region down to the bottom of the screen.

15. The method of claim 12, wherein the body region is forcibly separated using the body frame when the color distribution in the body reference region is not uniform.

16. A face region extraction method comprising: constructing a color histogram (hist) over only the pixels in an image that fall within a skin color range; constructing a center color histogram (center_hist) over only the pixels within a partial region that fall within the skin color range; performing skin color grouping using the color histograms (hist, center_hist); and designating a face region by separating the image into regions for each skin color group.

17. The method of claim 16, wherein the partial region is defined as a rectangle smaller than the original image and centered on the center of the image, and wherein, with width and height denoting the width and height of the original image, the horizontal and vertical lengths of the rectangle are width * a and height * b (a < b < 1), respectively.

18. The method of claim 16, wherein the partial region is defined as the face region of the previous frame.

19. The method of claim 16, wherein the skin color grouping comprises: determining the bin bin_max having the maximum value among the bin values of the color histogram; measuring the similarity between the representative color value of bin_max and the representative color value of each other bin; grouping bin_max and the bins whose measured similarity is below a predetermined threshold into one group; finalizing the group only when the sum of all bin values in the group is at or above a predetermined threshold; and, after one color group has been formed, repeating the grouping on the remaining bins, excluding the bins already in a color group.

20. The method of claim 16, wherein the skin color grouping comprises: determining bin_max as the bin having the maximum value among the bins of the center color histogram (center_hist) when forming the first color group, and as the bin having the maximum value among the bin values of the color histogram when forming the second and subsequent color groups; measuring the similarity between the representative color value of bin_max and the representative color value of each bin of the color histogram; grouping bin_max and the bins whose similarity is below a predetermined threshold into one group; finalizing the group only when the sum of all bin values in the group is at or above a predetermined threshold; and, after one group has been formed, repeating the formation of the second and subsequent color groups on the remaining bins, excluding the bins already in a color group.

21. The method of claim 16, wherein, when a face region is included in the regions of the grouped skin color groups, the face region extraction process is performed on the region composed of pixels belonging to the first skin color group.

22. The method of claim 16, wherein separating the image into regions for each skin color group comprises: dividing the image in which each skin color group is marked into blocks (a grid) of N * M size; setting as skin color blocks those blocks in which the frequency of pixels of the skin color group is at or above a predetermined threshold; obtaining connected components by linking neighboring skin color blocks; and designating the largest connected component as the face region.

23. The method of claim 16, further comprising reducing the original image before the face region extraction process, and restoring the result to the original image size after the face region extraction has been performed on the reduced image.

24. A human region extraction method comprising: reducing an image to 44 * 36; extracting a face region by analyzing the color-space distribution of the pixels in the reduced image that fall within a skin color range and grouping the skin colors; designating a head reference region based on the extracted face region and extracting a head region by examining a head reference color within the head reference region; designating a body reference region based on the extracted face region and extracting a body region by examining a body reference color within the body reference region; and designating the union of the extracted face region, head region, and body region as a human region and restoring it to the original image size.

25. The method of claim 3 or 24, wherein the skin color range is defined by the following condition in the RGB color space:

if (r > b && r > g)
{
    if (b > g)
        if (r + g > 100 && r + g < 480 && r - g > 5 && r - g < 100)
            skin color;
        else
            not skin color;
    else
        if (r + b > 100 && r + b < 480 && r - b > 5 && r - b < 100)
            skin color;
        else
            not skin color;
}
else
    not skin color;

26. The method of claim 3 or 24, wherein the skin color range is defined by the following condition in the HMMD color space:

if (((h > 300 && h < 360) || (h < 60 && h > 0)) && s > 100 && s < 480 && d > 5 && d < 100)
    skin color;
else
    not skin color;

27. The method of claim 3 or 24, wherein the color space used to extract the skin color region is the HMMD color space, and the skin color range intervals on the Hue, Diff, and Sum axes are quantized by dividing them into M, N, and K (M > N > K) equal parts, respectively.

28. The method of claim 24, wherein the hair color range used to define the head reference color in the head reference region is, in the HMMD color space, the case where Diff is less than 30 and Sum is less than 150, or the case where Diff is less than 15 and Sum is less than 230.

29. The method of claim 24, wherein the region having a color similar to the body reference color in the body reference region is obtained by calculating similarity in the HMMD color space.
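For illustration, the RGB skin color condition and the HMMD hair color range recited in the claims can be written as two small predicates. This is a sketch, not the authoritative claim language: the function names are hypothetical, the RGB test assumes the condition involving `r - g` in the first branch (the published text reads `g-g`, which could never exceed 5), and the hair test takes Diff and Sum as already-computed inputs because the claims do not state how Sum is scaled.

```python
def is_skin_rgb(r, g, b):
    """Red must be the strictly largest channel; the sum and difference of
    red and the smaller of green/blue must fall in the claimed ranges."""
    if not (r > b and r > g):
        return False
    low = g if b > g else b          # the smaller of g and b
    total, diff = r + low, r - low
    return 100 < total < 480 and 5 < diff < 100


def is_hair_hmmd(diff, total):
    """Hair color range from the claims, given HMMD Diff and Sum values."""
    return (diff < 30 and total < 150) or (diff < 15 and total < 230)


print(is_skin_rgb(200, 150, 120), is_skin_rgb(50, 40, 30), is_skin_rgb(100, 150, 120))
print(is_hair_hmmd(10, 200), is_hair_hmmd(20, 200))
```

Writing the RGB rule in terms of the largest and smallest channel makes its correspondence with the HMMD condition visible: the same 100 to 480 and 5 to 100 bounds appear there on s and d.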