KR20160130911A

KR20160130911A - Method and apparatus for masking face by using multi-level face features

Info

Publication number: KR20160130911A
Application number: KR1020150062678A
Authority: KR
Inventors: 노용만; 이승호; 문정익
Original assignee: 한국과학기술원
Priority date: 2015-05-04
Filing date: 2015-05-04
Publication date: 2016-11-15
Also published as: KR101688910B1

Abstract

According to the present invention, a method for masking a face using multi-level face features may include: extracting a region of interest from a video frame; detecting a first face region by extracting and analyzing spatial features of a face from the extracted region of interest; detecting a second face region from the extracted region of interest by using temporal information between frames; generating an integrated face region by integrating the detected first face region and second face region; and applying masking to the generated integrated face region. Accordingly, the present invention can make maximal use of face characteristics when masking faces in video captured in an unconstrained environment.

Description

Method and apparatus for masking a face using multi-level face features

The present invention relates to a face masking technique and, more particularly, to a face masking method and apparatus using multi-level face features, suitable for detecting face regions in an input video and masking the detected face regions in order to protect the privacy of the people appearing in the video.

In recent years, surveillance cameras (e.g., CCTV) have increasingly been installed throughout the country for purposes such as real-time surveillance and crime tracking. While this trend makes crime prevention and investigation more efficient, it also causes the privacy of individuals unintentionally captured by the cameras to be openly exposed.

In particular, because an exposed face can serve as a clue to a person's identity, a technique is needed that masks an individual's face so that it cannot be identified, in order to protect privacy.

Face masking requires a detection process that accurately locates the face region. However, face region detection in video acquired in an unconstrained environment such as CCTV faces challenges such as varying face angles, changes in facial illumination, blurred faces, and occlusion of faces.

As is well known, in a face masking system for privacy protection in large-scale video, manually specifying each person's face region one by one is very inefficient in terms of time and cost.

Therefore, a technique for automatically detecting people's face regions is needed. Representative examples include the Viola-Jones face detection method (Prior Art 1) and the saliency map-based face detection method (Prior Art 2).

Prior Art 1

Prior Art 1 is the first real-time face detector with stable detection performance. It has been integrated into OpenCV (Open Source Computer Vision Library), a representative image processing and recognition library, and remains widely used as the basis of many face detectors.

Prior Art 1 uses Haar-like features, which are based on structural information of the face, in a cascade-structured strong classifier composed of a combination of weak classifiers obtained through boosting.

However, Prior Art 1 shows very stable detection performance only when the characteristics of the training images and the test images are similar; otherwise, its detection rate drops sharply.

Prior Art 2

Prior Art 2 was proposed to solve the detection-rate degradation of Prior Art 1 in unconstrained environments. In addition to Prior Art 1, it uses a map corresponding to skin color and a saliency map that highlights object regions in which a face may be present.

However, the usefulness of the saliency map drops sharply for input images that contain a complex, non-monotonous textured background.

Korean Patent Laid-Open Publication No. 2011-0070735 (published June 24, 2011)

[1] Viola, Paul, and Michael J. Jones. "Robust real-time face detection." International Journal of Computer Vision 57.2 (2004): 137-154.
[2] El-Barkouky, Ahmed, et al. "Face detection at a distance using saliency maps." 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). IEEE, 2012.

The present invention proposes a face masking technique using multi-level face features that can perform face masking on video captured in an unconstrained environment, by detecting face regions using spatial-level features, detecting face regions using temporal-level features, and applying masking to the detected face regions.

The problems to be solved by the present invention are not limited to those mentioned above, and other problems not mentioned here will be clearly understood from the following description by those of ordinary skill in the art to which the present invention pertains.

According to one aspect, the present invention provides a face masking method using multi-level face features, the method including: extracting a region of interest from a video frame; detecting a first face region by extracting and analyzing spatial features of a face from the extracted region of interest; detecting a second face region from the extracted region of interest by using temporal information between frames; generating an integrated face region by integrating the detected first face region and second face region; and applying masking to the generated integrated face region.

The region of interest may be extracted in a rectangular shape.

The region of interest may be extracted through a skin color pixel detection technique.

Detecting the first face region may include: extracting, from the region of interest, an analysis score value for each of a low-level face feature, a mid-level face feature, and a high-level face feature; fusing the extracted analysis score values to determine whether a region is a face region; and, when the region is determined to be a face region, detecting the coordinate values of that region as the first face region.

The low-level face feature may use skin color information.

The mid-level face feature may use local face features of both eyes and the mouth region.

The high-level face feature may use holistic face features.

Detecting the second face region may include: obtaining, based on forward position tracking, previous coordinate values in a previous frame that correspond to the coordinate values; obtaining, based on backward position tracking, subsequent coordinate values in a subsequent frame that correspond to the coordinate values; and detecting the obtained previous and subsequent coordinate values as the second face region.

According to another aspect, the present invention provides a face masking apparatus using multi-level face features, the apparatus including: a region-of-interest extraction unit that extracts a region of interest from an input video frame; a first face region detection block that detects a first face region by extracting and analyzing spatial features of a face from the extracted region of interest; a second face region detection block that detects a second face region from the extracted region of interest by using temporal information between frames; a face region integration unit that generates an integrated face region by integrating the detected first face region and second face region; and a face region masking unit that applies masking to the generated integrated face region.

The first face region detection block may include: a first face feature analysis unit that extracts, from the region of interest, a first analysis score value for a low-level face feature; a second face feature analysis unit that extracts, from the region of interest, a second analysis score value for a mid-level face feature; a third face feature analysis unit that extracts, from the region of interest, a third analysis score value for a high-level face feature; a face region determination unit that fuses the extracted first to third analysis score values to determine whether a region is a face region; and a spatial face detection unit that, when the region is determined to be a face region, detects the coordinate values of that region as the first face region.

The first face feature analysis unit may use skin color information as the low-level face feature.

The second face feature analysis unit may use local face features of both eyes and the mouth region as the mid-level face feature.

The third face feature analysis unit may use holistic face features as the high-level face feature.

The second face region detection block may include: a forward coordinate acquisition unit that obtains, based on forward position tracking, previous coordinate values in a previous frame that correspond to the coordinate values; a backward coordinate acquisition unit that obtains, based on backward position tracking, subsequent coordinate values in a subsequent frame that correspond to the coordinate values; and a temporal face detection unit that detects the obtained previous and subsequent coordinate values as the second face region.

The face region integration unit may generate the integrated face region by integrating the coordinate values, the previous coordinate values, and the subsequent coordinate values.

By detecting face regions using spatial-level features, detecting face regions using temporal-level features, and applying masking to the detected face regions, the present invention can make maximal use of face characteristics when performing face masking on video captured in an unconstrained environment.

FIG. 1 is a block diagram of a face masking apparatus using multi-level face features according to the present invention.
FIG. 2 is a detailed block diagram of the first face region detection block shown in FIG. 1.
FIG. 3 is a detailed block diagram of the second face region detection block shown in FIG. 1.
FIG. 4 is a flowchart showing the main process of performing face masking using multi-level face features according to the present invention.
FIG. 5 is an example photograph illustrating the concept of extracting regions of interest from a video frame according to the present invention.
FIG. 6 is an example photograph showing a window scan performed within each region of interest to determine face region coordinates according to the present invention.
FIGS. 7A, 7B, and 7C are example photographs illustrating face region detection based on low-level, mid-level, and high-level face features, respectively.
FIGS. 8A, 8B, and 8C are example photographs showing, respectively, the coordinates of face regions detected using spatial-level features, an example in which the coordinates of face regions detected based on forward position tracking are added, and an example in which the coordinates of face regions detected based on backward position tracking are added.
FIG. 9 is an example photograph showing the result of a video frame in which masking by blurring has been applied to the face regions detected from the regions of interest according to the present invention.

First, the advantages and features of the present invention, and the methods of achieving them, will become clear with reference to the embodiments described in detail below together with the accompanying drawings. The present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. The embodiments are provided only by way of example so that the disclosure of the present invention is complete and so that those of ordinary skill in the art to which the present invention pertains can clearly understand the scope of the invention; the technical scope of the present invention should be defined by the claims.

In describing the present invention below, detailed descriptions of well-known functions or configurations will be omitted when they are judged to unnecessarily obscure the gist of the present invention. The terms used below are defined in consideration of their functions in the present invention and may vary according to the intentions or customs of users, operators, and the like. Their definitions should therefore be based on the technical idea described throughout this specification.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of a face masking apparatus using multi-level face features according to the present invention. The apparatus may include a region-of-interest extraction unit 102, a first face region detection block 104, a second face region detection block 106, a frame memory 108, a face region integration unit 110, a face region masking unit 112, a frame output unit 114, and the like.

Referring to FIG. 1, the region-of-interest extraction unit 102 may extract (determine) regions of interest from an externally input video frame (video image). That is, to reduce the computation time of face masking and to reduce false positives in face region detection, it may extract rectangular regions of the input video frame in which a face is likely to be present as regions of interest, and pass the coordinates of the regions of interest to the first face region detection block 104.

Various methods may be applied to extract (determine) the regions of interest, including, for example, skin color pixel detection.

FIG. 5 is an example photograph illustrating the concept of extracting regions of interest from a video frame according to the present invention. The left photograph shows the input video frame, and the right photograph shows the result of extracting regions of interest using skin color pixel detection: the areas shown in white correspond to skin color pixels, and the green boxes indicate the rectangular regions of interest set based on skin color.
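The region-of-interest extraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the RGB thresholds in `is_skin`, the use of 4-connected blobs, and the function names are all hypothetical, since the specification only says that skin color pixel detection may be used and that the regions are rectangular.

```python
from collections import deque

def is_skin(r, g, b):
    """Hypothetical RGB skin rule; the thresholds are an assumption, not from the source."""
    return r > 95 and g > 40 and b > 20 and r > g and r > b and (r - min(g, b)) > 15

def skin_mask(frame):
    """frame: list of rows of (r, g, b) tuples -> 2D list of booleans."""
    return [[is_skin(*px) for px in row] for row in frame]

def extract_rois(mask, min_pixels=1):
    """Return inclusive bounding rectangles (x0, y0, x1, y1) of 4-connected skin blobs."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    rois = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill over one blob, tracking its bounding box.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            count = 0
            while queue:
                cx, cy = queue.popleft()
                count += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if count >= min_pixels:
                rois.append((x0, y0, x1, y1))
    return rois
```

Raising `min_pixels` discards small isolated skin-colored blobs, which is one simple way to limit the number of candidate regions passed on to the detection block.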

Next, the first face region detection block 104 detects (determines) the first face region by extracting and analyzing spatial features of a face from the regions of interest passed from the region-of-interest extraction unit 102. That is, it receives the coordinates of the regions of interest as input, analyzes face features at multiple levels (e.g., low level, mid level, and high level) inside the regions of interest of the input video frame, fuses and classifies the analysis results, and outputs the face region coordinates in the video frame to the second face region detection block 106.

Specifically, the first face region detection block 104 performs a window scan within each region of interest to determine the face region coordinates. As shown by way of example in FIG. 6, it sets windows of various sizes at each pixel position in the region of interest and detects the first face region by determining whether each window is a face region. To this end, the first face region detection block 104 may have the configuration shown in FIG. 2.
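The window scan can be illustrated with a short generator. The square window shape, the `stride` parameter, and the inclusive `(x0, y0, x1, y1)` region format are assumptions made for this sketch; the specification only states that windows of various sizes are set at each pixel position in the region of interest.

```python
def scan_windows(roi, sizes, stride=1):
    """Yield square windows (x, y, size) lying entirely inside roi.

    roi is an inclusive rectangle (x0, y0, x1, y1); sizes is an iterable of
    window side lengths in pixels. Both conventions are assumptions.
    """
    x0, y0, x1, y1 = roi
    for size in sizes:
        # The last valid top-left corner is (x1 - size + 1, y1 - size + 1).
        for y in range(y0, y1 - size + 2, stride):
            for x in range(x0, x1 - size + 2, stride):
                yield (x, y, size)
```

Each yielded window would then be scored by the three feature analysis units and classified as face or non-face.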

FIG. 2 is a detailed block diagram of the first face region detection block shown in FIG. 1. The first face region detection block 104 may include a first face feature analysis unit 202, a second face feature analysis unit 204, a third face feature analysis unit 206, a face region determination unit 208, a spatial face detection unit 210, and the like.

Referring to FIG. 2, the first face feature analysis unit 202 may extract a first analysis score value for a low-level face feature from the region of interest and pass it to the face region determination unit 208. Skin color information, for example, may be used as the low-level face feature; as shown by way of example in FIG. 7A, the analysis score value may be computed as the proportion of the window occupied by skin pixels.
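The low-level score described above reduces to a ratio. The following sketch assumes a boolean skin mask (a 2D list) and a square window given as `(x, y, size)`; both representations are illustrative choices, not taken from the specification.

```python
def skin_ratio(mask, window):
    """Fraction of pixels inside the window that are marked as skin in the mask."""
    x, y, size = window
    skin = sum(1 for row in mask[y:y + size] for value in row[x:x + size] if value)
    return skin / (size * size)
```

For a 2x2 window containing three skin pixels, the score is 0.75.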

Face region detection based on this low-level face feature has the advantage of achieving relatively high recall on images in which the face angle changes due to rotation, low-resolution images, images with motion blur, and the like.

Next, the second face feature analysis unit 204 may extract a second analysis score value for a mid-level face feature from the region of interest and pass it to the face region determination unit 208. Local face features of both eyes and the mouth region, for example, may be used as the mid-level face feature; as shown by way of example in FIG. 7B, eye and mouth region detection is performed, and the number of face components detected is converted into the analysis score value.
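As a small illustration of the mid-level score, the sketch below counts how many of the three components were found. The dictionary format and the component names are hypothetical; the eye and mouth detectors themselves are outside the scope of this sketch and are assumed to return a box or `None`.

```python
COMPONENTS = ("left_eye", "right_eye", "mouth")

def component_score(detections):
    """Count detected face components.

    detections maps a component name to a bounding box, or to None when the
    (hypothetical) component detector found nothing.
    """
    return sum(1 for name in COMPONENTS if detections.get(name) is not None)
```

A window in which one eye is occluded but the other eye and the mouth are found would score 2 out of a possible 3.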

Face region detection based on this mid-level face feature is applicable even under moderate angle changes within 45 degrees to the left or right, and, when part of the face is occluded, the components in the unoccluded area can still be detected, so it has the advantage of achieving relatively high accuracy.

The third face feature analysis unit 206 may extract a third analysis score value for a high-level face feature from the region of interest and pass it to the face region determination unit 208. Holistic face features, for example, may be used as the high-level face feature; as shown by way of example in FIG. 7C, frontal face detection and profile face detection are each performed on the windows, and the number of windows classified as face regions among the windows near the window under consideration is converted into the analysis score value.
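One way to count nearby face-classified windows is sketched below. Measuring proximity by intersection-over-union (IoU) with a threshold of 0.5 is an assumption of this sketch; the specification does not define how "near" is measured. `face_hits` stands for the boxes that hypothetical frontal and profile detectors classified as faces.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def holistic_score(window, face_hits, min_iou=0.5):
    """Count detector hits that overlap the window by at least min_iou."""
    return sum(1 for hit in face_hits if iou(window, hit) >= min_iou)
```

A true face tends to trigger many overlapping detections, so a higher count suggests a more reliable window.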

Face region detection based on this high-level face feature has the advantage of a relatively high detection rate when the face angle in the test image is similar to that in the face images used to train the detection classifier.

In other words, the present invention can determine face regions by analyzing face features at three levels (low, mid, and high) in each window set within the region of interest.

Next, the face region determination unit 208 may fuse the first to third analysis score values extracted by the first to third face feature analysis units 202, 204, and 206 to determine (classify) whether the window is a face region, and pass the result to the spatial face detection unit 210.

Because the three score values from the three levels of face feature analysis have different ranges, they are normalized (e.g., by min-max normalization) to the same scale. The three normalized score values computed for a window are then input to a binary classifier as a three-dimensional vector to determine whether the window is a face region.
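The normalization step can be sketched as follows. The `ranges` argument (the minimum and maximum of each score, for example as observed on training windows) and the clipping of out-of-range values to [0, 1] are assumptions of this sketch.

```python
def min_max(value, lo, hi):
    """Min-max normalize value to [0, 1]; out-of-range values are clipped (an assumption)."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))

def score_vector(skin, components, holistic, ranges):
    """Build the normalized 3D score vector for one window.

    ranges is a sequence of three (lo, hi) pairs, one per score.
    """
    raw = (skin, components, holistic)
    return tuple(min_max(v, lo, hi) for v, (lo, hi) in zip(raw, ranges))
```

For example, with ranges `[(0.0, 1.0), (0, 3), (0, 10)]`, the raw scores `(0.5, 3, 5)` map to `(0.5, 1.0, 0.5)`.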

Various kinds of binary classifiers (e.g., a support vector machine (SVM)) may be used here. To train the classifier, the three score values computed from positive windows, which correspond to face regions, and from negative windows, which contain no face region, are used.
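To illustrate training a binary classifier on the three-dimensional score vectors, the sketch below uses a plain perceptron. Note that this is a deliberately simple stand-in chosen to keep the example dependency-free; it is not an SVM and is not the classifier named in the specification. Labels are +1 for positive (face) windows and -1 for negative windows.

```python
def train_perceptron(samples, labels, epochs=100):
    """Train a linear classifier w.x + b on 3D score vectors (perceptron rule)."""
    w = [0.0, 0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:
                # Misclassified (or on the boundary): move the boundary toward x.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break  # every training window is classified correctly
    return w, b

def is_face(x, w, b):
    """Classify one normalized score vector as face (True) or non-face (False)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0
```

In practice a margin-based classifier such as an SVM would likely generalize better than this stand-in, which only guarantees a separating boundary on linearly separable training data.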

Next, when the face region determination unit 208 reports that a region has been determined to be a face region, the spatial face detection unit 210 may detect the coordinate values of that region as the first face region (the face region detected using spatial-level features) and pass them to the second face region detection block 106 of FIG. 1.

Referring again to FIG. 1, the second face region detection block 106 may detect (determine) the second face region from the regions of interest extracted by the region-of-interest extraction unit 102 by using temporal information between video frames, that is, by using forward and backward face tracking performed between temporally adjacent frames. To this end, it may have the configuration shown in FIG. 3.

FIG. 3 is a detailed block diagram of the second face region detection block shown in FIG. 1, which may include a forward coordinate acquisition unit 302, a backward coordinate acquisition unit 304, a temporal face detection unit 306, and the like.

Referring to FIG. 3, the forward coordinate acquisition unit 302 may, based on forward position tracking, obtain from the frame memory 108 the previous coordinate values in the previous frame (frame t-1, temporally adjacent to and preceding frame t) that correspond to the coordinate values of the face region detected (tracked) in the input frame (frame t), and pass them to the temporal face detection unit 306.

Likewise, the backward coordinate acquisition unit 304 may, based on backward position tracking, obtain from the frame memory 108 the subsequent coordinate values in the subsequent frame (frame t+1, temporally adjacent to and following frame t) that correspond to the coordinate values of the face region detected (tracked) in the input frame (frame t), and pass them to the temporal face detection unit 306.

The temporal face detection unit 306 may detect the previous and subsequent coordinate values obtained by the forward coordinate acquisition unit 302 and the backward coordinate acquisition unit 304, respectively, as the second face region (the face region detected using temporal-level features) and pass them to the face region integration unit 110 of FIG. 1.

Referring again to FIG. 1, the face region integration unit 110 may integrate the face region detected using spatial-level features (the first face region) with the face region detected using temporal-level features (the second face region) to generate an integrated face region, and pass the integrated face region to the face region masking unit 112.

That is, the following are all integrated: for the current frame at time t, the face region coordinates obtained by face detection (FD) using spatial-level features; for the frame at time t-1, the face region coordinates obtained by forward face tracking (FFT) into the current frame t; and for the frame at time t+1, the face region coordinates obtained by backward face tracking (BFT) into the current frame t.
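The integration of the three coordinate sources for the current frame can be sketched as a simple union. Representing each face region as a tuple and dropping only exact duplicates is an assumption of this sketch; the specification says the coordinates are all integrated but does not state how overlapping boxes are reconciled.

```python
def integrate_regions(detected, forward_tracked, backward_tracked):
    """Union of FD, FFT and BFT boxes for the current frame, preserving order."""
    merged = []
    for box in [*detected, *forward_tracked, *backward_tracked]:
        if box not in merged:  # drop exact duplicates only
            merged.append(box)
    return merged
```

A face missed by the spatial detector in frame t still ends up in the integrated list as long as it was tracked into frame t from a neighboring frame.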

Here, with the current frame (time t) as the reference, backward face tracking is performed over a total of T frames, that is, up to the frame at time (t-T+1).
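
The integration of the FD, FFT, and BFT coordinates described above can be illustrated with a minimal sketch. The box format `(x, y, width, height)`, the function name `integrate_face_regions`, and the removal of exact duplicates are assumptions made for illustration; the description does not specify how overlapping coordinates are handled.

```python
from typing import List, Tuple

# Assumed box format: (x, y, width, height) in pixel coordinates.
Box = Tuple[int, int, int, int]


def integrate_face_regions(fd: List[Box], fft: List[Box], bft: List[Box]) -> List[Box]:
    """Combine spatially detected (FD), forward-tracked (FFT), and
    backward-tracked (BFT) boxes into one list for the current frame.

    Hypothetical helper: exact duplicates are dropped and input order is kept.
    """
    integrated: List[Box] = []
    for box in [*fd, *fft, *bft]:
        if box not in integrated:
            integrated.append(box)
    return integrated
```

In this sketch a face missed by the spatial detector in frame t still appears in the integrated list as long as either tracking direction supplies its coordinates.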

FIG. 8A is a photograph showing the coordinates of the face region detected using spatial-level features, FIG. 8B is a photograph showing an example in which the coordinates of face regions detected based on forward position tracking are added, and FIG. 8C is a photograph showing an example in which the coordinates of face regions detected based on backward position tracking are added.

Referring to FIG. 8A, the coordinates of the face region detected using spatial-level features, as output from the first face region detection block 104, are displayed (green boxes). Referring to FIG. 8B, the coordinates of the face regions acquired by the forward coordinate acquisition unit 302 in the second face region detection block 106 are additionally displayed (red boxes).

This shows that face region detection based on forward tracking is useful for detecting face regions that were detected in earlier video frames but not in later video frames.

Referring to FIG. 8C, the coordinates of the face regions acquired by the backward coordinate acquisition unit 304 in the second face region detection block 106 are additionally displayed (blue boxes). This shows that face region detection based on backward tracking is useful for detecting face regions that were detected in later video frames but not in earlier video frames.

That is, by using temporal-level features, the present invention can overcome the difficulty of detecting face regions and increase recall even when spatial-level features are insufficient (for example, when the face rotation angle is so large that the eyes and mouth are not visible, or when occlusion is very severe).

Next, the face region masking unit 112 may apply masking to the integrated face region (the coordinate values of the integrated face region) delivered from the face region integration unit 110, for example masking by blurring, so that visual identification is impossible.
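
Masking by blurring can be sketched as follows. This is an illustration only: it assumes a grayscale frame stored as a list of rows of integers, an `(x, y, width, height)` box format, and a simple k x k mean filter. The description names blurring as one example of masking but does not specify the filter, so `mask_faces` and its parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x, y, width, height) in pixel coordinates.
Box = Tuple[int, int, int, int]


def mask_faces(frame: List[List[int]], boxes: Sequence[Box], k: int = 3) -> List[List[int]]:
    """Return a copy of a grayscale frame in which every box is mean-blurred.

    k is the (odd) side length of the averaging neighbourhood; a larger k
    gives a stronger blur. The input frame is left unmodified.
    """
    height, width = len(frame), len(frame[0])
    out = [row[:] for row in frame]
    r = k // 2
    for x, y, w, h in boxes:
        for yy in range(max(y, 0), min(y + h, height)):
            for xx in range(max(x, 0), min(x + w, width)):
                # Average the k x k neighbourhood, clipped to the frame border.
                ys = range(max(yy - r, 0), min(yy + r + 1, height))
                xs = range(max(xx - r, 0), min(xx + r + 1, width))
                total = sum(frame[j][i] for j in ys for i in xs)
                out[yy][xx] = total // (len(ys) * len(xs))
    return out
```

Pixels outside the listed boxes are copied unchanged, so only the integrated face regions lose detail.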

The frame output unit 114 may output a video frame in which masking has been applied to the face regions detected from the input video frame, for example a video frame as shown in FIG. 9, to a display (not shown).

Next, the series of processes by which the face masking apparatus of this embodiment, configured as described above, applies face masking based on multi-level face features will be described in detail.

FIG. 4 is a flowchart showing the main process of performing face masking using multi-level face features according to the present invention.

Referring to FIG. 4, when a video frame is input from outside, the region-of-interest extraction unit 102 extracts (determines) a region of interest (for example, a rectangular region of interest) from the input video frame (video image) (step 402). Here, a technique such as skin color pixel detection may be applied to extract (determine) the region of interest.

In response, the first face feature analysis unit 202 in the first face region detection block 104 extracts a first analysis score value for a low-level face feature from the region of interest, the second face feature analysis unit 204 extracts a second analysis score value for an intermediate-level face feature from the region of interest, and the third face feature analysis unit 206 extracts a third analysis score value for a high-level face feature from the region of interest (step 404).

Here, the low-level face feature may use, for example, skin color information, which can be converted into an analysis score value by calculating the proportion of skin pixels within the window.
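
The skin-pixel proportion can be sketched as below. The RGB threshold rule in `is_skin` is a common heuristic chosen purely for illustration; the description does not state which skin color model is used, and both function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # assumed (R, G, B) with values in 0..255


def is_skin(pixel: Pixel) -> bool:
    """Illustrative RGB threshold rule; not the skin model of the source."""
    r, g, b = pixel
    return (
        r > 95 and g > 40 and b > 20
        and max(pixel) - min(pixel) > 15
        and abs(r - g) > 15
        and r > g and r > b
    )


def skin_ratio_score(window: List[List[Pixel]]) -> float:
    """Low-level score: fraction of pixels in the window classified as skin."""
    pixels = [p for row in window for p in row]
    if not pixels:
        return 0.0
    return sum(1 for p in pixels if is_skin(p)) / len(pixels)
```

The score lies in the range 0 to 1, which makes it straightforward to combine with the other two feature levels.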

The intermediate-level face feature may use, for example, local face features of both eyes and the mouth region: eye and mouth region detection is performed, and the number of face components detected in this process can be converted into an analysis score value.

The high-level face feature may use, for example, holistic face features: frontal face region detection and profile face region detection are each performed on the windows, and the number of windows classified as face regions among those adjacent to the window under consideration can be converted into an analysis score value.
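
Counting the adjacent windows that were classified as faces can be sketched as follows. The description does not define adjacency, so this sketch assumes that two windows are adjacent when their centers are within `max_dist` pixels; that criterion, the `(x, y, width, height)` box format, and the function names are all assumptions.

```python
import math
from typing import Sequence, Tuple

# Assumed box format: (x, y, width, height) in pixel coordinates.
Box = Tuple[int, int, int, int]


def center(box: Box) -> Tuple[float, float]:
    """Center point of a box."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def neighbor_face_count(window: Box, face_windows: Sequence[Box], max_dist: float = 8.0) -> int:
    """High-level score: number of face-classified windows near `window`.

    `face_windows` is assumed to hold the windows accepted by the frontal
    or profile face detector.
    """
    cx, cy = center(window)
    count = 0
    for other in face_windows:
        ox, oy = center(other)
        if math.hypot(ox - cx, oy - cy) <= max_dist:
            count += 1
    return count
```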

Next, the face region determination unit 208 fuses the first to third analysis score values extracted by the first to third face feature analysis units 202, 204, and 206, respectively, and determines (classifies) whether or not the region is a face region (step 406).
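
One way to fuse the three analysis score values is a normalized weighted sum compared against a threshold. The description does not specify the fusion rule, so the weights, the threshold, the normalization limits, and the name `is_face_region` below are assumptions for illustration only.

```python
from typing import Tuple


def is_face_region(
    skin_ratio: float,
    component_count: int,
    neighbor_count: int,
    weights: Tuple[float, float, float] = (0.3, 0.3, 0.4),  # assumed weights
    threshold: float = 0.5,                                  # assumed threshold
    max_components: int = 3,   # two eyes and one mouth
    max_neighbors: int = 5,    # assumed saturation point
) -> bool:
    """Fuse low-, intermediate-, and high-level scores into a face decision."""
    low = min(max(skin_ratio, 0.0), 1.0)
    mid = min(max(component_count, 0), max_components) / max_components
    high = min(max(neighbor_count, 0), max_neighbors) / max_neighbors
    fused = weights[0] * low + weights[1] * mid + weights[2] * high
    return fused >= threshold
```

With these assumed settings, a window with mostly skin pixels, all three face components, and several face-classified neighbors is accepted, while a window that scores low on all three levels is rejected.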

If the classification in step 406 determines that the region is a face region, the spatial face detection unit 210 detects the coordinate values of that face region as the first face region, that is, the face region detected using spatial-level features (step 408).

Thereafter, the forward coordinate acquisition unit 302 in the second face region detection block 106, based on forward position tracking, acquires from the frame memory 108 the previous coordinate values in the previous frame (the t-1 frame, which is temporally adjacent to and precedes frame t) that correspond to the coordinate values of the face region detected (tracked) in the input frame (frame t) (step 410). The backward coordinate acquisition unit 304, based on backward position tracking, acquires from the frame memory 108 the subsequent coordinate values in the subsequent frame (the t+1 frame, which is temporally adjacent to and follows frame t) that correspond to the coordinate values of the face region detected (tracked) in the input frame (frame t) (step 412).
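
Steps 410 and 412 can be sketched as a lookup into a frame memory. The data layout is an assumption: the frame memory is modeled as a dictionary that maps a frame index to the tracked boxes of that frame, keyed by a hypothetical track identifier. The description does not state how the frame memory 108 is organized.

```python
from typing import Dict, List, Tuple

# Assumed box format: (x, y, width, height) in pixel coordinates.
Box = Tuple[int, int, int, int]
# Assumed layout: frame index -> {track id -> box tracked in that frame}.
FrameMemory = Dict[int, Dict[int, Box]]


def temporal_coordinates(memory: FrameMemory, t: int, track_id: int) -> List[Box]:
    """Collect the boxes of one tracked face from frames t-1 and t+1.

    Returns the previous coordinates (forward tracking) followed by the
    subsequent coordinates (backward tracking), skipping any that are missing.
    """
    previous = memory.get(t - 1, {}).get(track_id)
    subsequent = memory.get(t + 1, {}).get(track_id)
    return [box for box in (previous, subsequent) if box is not None]
```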

In response, the temporal face detection unit 306 detects the acquired previous coordinate values and subsequent coordinate values as the second face region, that is, the face region detected using temporal-level features (step 414).

The face region integration unit 110 then integrates the face region detected using spatial-level features (the first face region) with the face region detected using temporal-level features (the second face region) to generate an integrated face region (step 416).

The face region masking unit 112 then applies masking, for example masking by blurring, to the integrated face region (the coordinate values of the integrated face region) delivered from the face region integration unit 110, so that faces in the video image cannot be visually identified (step 418).

Thereafter, the frame output unit 114 outputs the video frame in which masking has been applied to the face regions detected from the input video frame, for example to a display (step 420).

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will readily appreciate that various substitutions, modifications, and changes are possible without departing from the essential characteristics of the present invention. That is, the embodiments disclosed herein are intended to describe, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments.

Accordingly, the scope of protection of the present invention should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present invention.

Claims

A face masking method using multi-level face features, the method comprising:
extracting a region of interest from a video frame;
extracting and analyzing spatial features of a face from the extracted region of interest to detect a first face region;
detecting a second face region from the extracted region of interest using temporal information between frames;
generating an integrated face region by integrating the detected first face region and the detected second face region; and
applying masking to the generated integrated face region.

The method according to claim 1,
wherein the region of interest is extracted by a skin color pixel detection technique.

The method according to claim 1,
wherein detecting the first face region comprises:
extracting, from the region of interest, an analysis score value for each of a low-level face feature, an intermediate-level face feature, and a high-level face feature;
fusing the extracted analysis score values to determine whether the region is a face region; and
detecting the coordinate values of the region as the first face region when the region is determined to be a face region.

The method of claim 3,
wherein the low-level face feature uses skin color information.

The method of claim 3,
wherein the intermediate-level face feature uses local face features of both eyes and the mouth region.

The method of claim 3,
wherein the high-level face feature uses holistic face features.

The method of claim 3,
wherein detecting the second face region comprises:
acquiring, based on forward position tracking, a previous coordinate value in a previous frame corresponding to the coordinate value;
acquiring, based on backward position tracking, a subsequent coordinate value in a subsequent frame corresponding to the coordinate value; and
detecting the acquired previous coordinate value and subsequent coordinate value as the second face region.

A face masking apparatus using multi-level face features, the apparatus comprising:
a region-of-interest extraction unit for extracting a region of interest from an input video frame;
a first face region detection block for extracting and analyzing spatial features of a face from the extracted region of interest to detect a first face region;
a second face region detection block for detecting a second face region from the extracted region of interest using temporal information between frames;
a face region integration unit for integrating the detected first face region and the detected second face region to generate an integrated face region; and
a face region masking unit for applying masking to the generated integrated face region.

The apparatus of claim 8,
wherein the first face region detection block comprises:
a first face feature analysis unit for extracting a first analysis score value for a low-level face feature from the region of interest;
a second face feature analysis unit for extracting a second analysis score value for an intermediate-level face feature from the region of interest;
a third face feature analysis unit for extracting a third analysis score value for a high-level face feature from the region of interest;
a face region determination unit for fusing the extracted first to third analysis score values to determine whether the region is a face region; and
a spatial face detection unit for detecting the coordinate values of the region as the first face region when the region is determined to be a face region.

The apparatus of claim 9,
wherein the first face feature analysis unit uses skin color information as the low-level face feature.

The apparatus of claim 9,
wherein the second face feature analysis unit uses local face features of both eyes and the mouth region as the intermediate-level face feature.

The apparatus of claim 9,
wherein the third face feature analysis unit uses holistic face features as the high-level face feature.

The apparatus of claim 9,
wherein the second face region detection block comprises:
a forward coordinate acquisition unit for acquiring, based on forward position tracking, a previous coordinate value in a previous frame corresponding to the coordinate value;
a backward coordinate acquisition unit for acquiring, based on backward position tracking, a subsequent coordinate value in a subsequent frame corresponding to the coordinate value; and
a temporal face detection unit for detecting the acquired previous coordinate value and subsequent coordinate value as the second face region.