KR100886489B1

KR100886489B1 - Method and system for inserting special effects during conversation by visual telephone

Info

Publication number: KR100886489B1
Application number: KR1020070117964A
Authority: KR
Inventors: 조현근; 류중희
Original assignee: (주)올라웍스
Priority date: 2007-11-19
Filing date: 2007-11-19
Publication date: 2009-03-05

Abstract

Provided are a method and a system that, with reference to expression decoration templates, synthesize in real time during a video call a decorating effect image designated in advance for the facial expression of the person on the call. A screen decoration unit (110) comprises a face detection unit (110a), a face tracking unit (110b), and an expression recognition unit (110c). The face detection unit detects the face region of a person displayed on the screen of the digital device during the video call. Once the face detection unit has detected a person's face region, the face tracking unit tracks that region at periodic or non-periodic intervals. The expression recognition unit analyzes the face region detected by the face detection unit or tracked by the face tracking unit, and applies face recognition technology to the face of the person included in that region.

Description

METHOD AND SYSTEM FOR INSERTING SPECIAL EFFECTS DURING CONVERSATION BY VISUAL TELEPHONE

The present invention relates to a method and system for automatically synthesizing and displaying a predesignated decorating effect image according to the facial expression of a person on a video call made with a digital device. More specifically, during such a video call, techniques such as face detection, face tracking, and face recognition are applied to the person displayed on the screen of the digital device to recognize that person's expression. The expression is then checked against the expressions of previously recorded expression decoration templates, and if it matches, the decorating effect image corresponding to that expression is synthesized in real time onto the video of the person, making the video call more entertaining.

In recent years, advances in communication technology have brought mobile video telephony services into full swing, markets such as Internet video phones and video conferencing systems are drawing attention, and a variety of video telephony models, including wired/wireless interworking video telephony and video telephony combined with high-speed Internet, are appearing one after another.

However, continuously looking at the other party's face during a video call can make the situation awkward, and users unfamiliar with video calls may express their emotions unnaturally because of that awkwardness. There is therefore a growing need to move beyond monotonous video calls by adding functions that help create a more natural and enjoyable atmosphere during a call.

Accordingly, an object of the present invention is to solve the problems of the prior art as follows: when a video call is made using a digital device, the face of the person displayed on the screen of the digital device is detected, tracked, and recognized, and if the person's expression is the same as or similar to the expression of the model in an expression decoration template photo stored in advance by the user, the decorating effect image recorded for that template photo is automatically synthesized onto the person's face. This lets users express their emotions richly in real time during a video call, making the call more intimate and entertaining.

The characteristic configuration of the present invention for achieving the above object and performing the characteristic functions described below is as follows.

According to one aspect of the present invention, there is provided a method of automatically synthesizing a decorating effect image with the face of a user of a digital device according to the user's expression, with reference to an expression decoration template DB in which a template and a decorating effect image are stored in correspondence for each facial expression of a model, the method comprising: (a) detecting, using a face detection technique, the face of the user captured through a lens of the digital device; (b) searching the templates for a specific template containing a facial expression that is the same as or similar to the expression of the detected face; and (c) synthesizing the decorating effect image corresponding to the specific template with the image of the detected face.
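Steps (a) to (c) can be read as a small per-frame pipeline. The following is a minimal Python sketch under stated assumptions: the name `decorate_frame`, the three callables passed into it, and the dictionary layout of a template are hypothetical and chosen only for illustration; the patent does not prescribe an implementation.

```python
from typing import Any, Callable, Optional, Sequence


def decorate_frame(
    frame: Any,
    templates: Sequence[dict],
    detect_face: Callable[[Any], Optional[Any]],
    find_template: Callable[[Any, Sequence[dict]], Optional[dict]],
    composite: Callable[[Any, Any, Any], Any],
) -> Any:
    """Run steps (a)-(c) on one frame; all helpers are hypothetical."""
    # Step (a): detect the user's face in the captured frame.
    face = detect_face(frame)
    if face is None:
        return frame  # no face found, leave the frame unchanged

    # Step (b): look for a template whose expression is the same or similar.
    template = find_template(face, templates)
    if template is None:
        return frame  # no matching expression, nothing to synthesize

    # Step (c): synthesize the effect image stored for that template.
    return composite(frame, face, template["effect"])
```

Passing the detector, matcher, and compositor in as callables keeps the sketch independent of any particular vision library.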

According to the present invention, a predesignated decorating effect image is synthesized in real time according to the expression of the person appearing on the screen during a video call, so the user's emotions can be expressed more humorously and richly. This makes video calls more intimate and entertaining and can thereby help broaden the video call market.

The following detailed description of the present invention refers to the accompanying drawings, which show by way of illustration specific embodiments in which the invention may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the invention. It should be understood that the various embodiments of the invention, although different from one another, are not necessarily mutually exclusive. For example, a particular shape, structure, or characteristic described herein in connection with one embodiment may be implemented in other embodiments without departing from the spirit and scope of the invention. It should also be understood that the location or arrangement of individual components within each disclosed embodiment may be changed without departing from the spirit and scope of the invention. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present invention, if properly described, is limited only by the appended claims together with the full range of equivalents to which those claims are entitled. In the drawings, like reference numerals refer to the same or similar functions throughout the several aspects.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of an overall system (100) that, according to an embodiment of the present invention, automatically synthesizes a decorating effect image matching the expression of the person in conversation and displays it on the screen during a video call made with a digital device such as a mobile phone or PC camera.

The following mainly describes examples in which the present invention is applied to a video call, but it can of course be applied similarly when generating video with a camcorder, mobile phone, or the like, or when generating still images such as photographs.

Referring to FIG. 1, the overall system (100) may include a screen decoration unit (110), an expression decoration template database (120), an interface unit (130), a communication unit (140), a control unit (150), and the like.

According to the present invention, at least some of the screen decoration unit (110), the expression decoration template database (120), the interface unit (130), and the communication unit (140) may be program modules that are included in a user terminal device such as a mobile phone or that communicate with the user terminal device. (FIG. 1, however, illustrates the screen decoration unit (110), the template database (120), the interface unit (130), and the communication unit (140) as all being included in the user terminal device.) Such program modules may be included in the user terminal device in the form of an operating system, application program modules, and other program modules, and may be physically stored on various known storage devices. They may also be stored in a remote storage device capable of communicating with the user terminal device. Such program modules encompass, but are not limited to, routines, subroutines, programs, objects, components, and data structures that perform the particular tasks described below or implement particular abstract data types according to the present invention.

The screen decoration unit (110) may include a face detection unit (110a), a face tracking unit (110b), an expression recognition unit (110c), and the like. The division into the face detection unit (110a), the face tracking unit (110b), and the expression recognition unit (110c) is made for convenience in describing the function of detecting the face of a person displayed on the screen of the digital device and determining that person's expression by reference to the angle of the detected face and the shape and size of facial parts such as the eyes, nose, and mouth; the invention is not necessarily limited to this division.

The face detection unit (110a) detects the face regions of persons displayed on the screen of the digital device during a video call.

Once the face detection unit (110a) has detected a person's face region, the face tracking unit (110b) can follow that region by tracking it at periodic or non-periodic time intervals.

The expression recognition unit (110c) analyzes the face region detected by the face detection unit (110a) and/or the face region tracked by the face tracking unit (110b), and applies face recognition technology to the face of the person included in that region to determine the person's expression.

Before the function of the expression recognition unit (110c) is described in more detail, the expression decoration template DB (120), which the expression recognition unit (110c) refers to in performing its function, is described.

In the expression decoration template DB (120), photographs of a model's facial expressions are recorded in correspondence with the decorating effect image designated for each expression. When, during a video call, the user makes the same expression as that of the model in a specific template stored in the expression decoration template DB (120), the decorating effect image recorded for that template can be synthesized in real time with the user's face image and displayed on the screen of the other party's terminal. The decorating effect image may be synthesized by item overlay, as shown in FIGS. 3A and 3B, and of course also by other methods, such as applying a filter that performs special processing (changing brightness, changing saturation, or soft-focus ("뽀사시") processing) or enlarging or cropping the face portion.
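As an illustration of the item overlay method, the sketch below alpha-blends a small effect image onto a frame. It is only one possible reading: the function name `overlay`, the use of nested lists of grayscale values, and the per-pixel alpha mask are assumptions made for this example, not details from the patent.

```python
def overlay(base, item, alpha, top, left):
    """Alpha-blend `item` onto a copy of `base` at (top, left).

    base, item: 2D lists of grayscale values in 0..255.
    alpha: 2D list of opacities in 0.0..1.0, same shape as `item`.
    All names and the data layout are illustrative assumptions.
    """
    out = [row[:] for row in base]  # copy so the input frame is not modified
    for r, item_row in enumerate(item):
        for c, value in enumerate(item_row):
            y, x = top + r, left + c
            # Skip effect pixels that fall outside the frame.
            if 0 <= y < len(out) and 0 <= x < len(out[y]):
                a = alpha[r][c]
                out[y][x] = round(a * value + (1 - a) * out[y][x])
    return out
```

In practice `top` and `left` would come from the detected or tracked face region, so the effect follows the face from frame to frame.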

The various DBs mentioned in the present invention, such as the expression decoration template DB (120), include not only databases in the narrow sense but also databases in a broad sense, including data records based on a file system. They may be included in the system (100) or may reside in a remote storage device capable of communicating with the system (100).

The expression recognition unit (110c) checks which of the expressions corresponding to the plurality of expression decoration templates stored in the expression decoration template DB (120) the user's facial expression matches. If it determines that the user's expression is the same as a specific expression among them, the decorating effect image designated for that expression is synthesized in real time with the user's face image and displayed on the screen of the other party's terminal through the interface unit (130).

The interface unit (130) can show the data of the expression decoration template DB (120), in which a decorating effect image is recorded for each expression of a person, on the screen of the digital device so that the user can view or edit it, and it displays various images on the screens of both parties to the video call.

The communication unit (140) is responsible for transmitting and receiving signals between the component modules inside the system (100) and for exchanging data with various external devices.

The control unit (150) according to the present invention controls the flow of data among the screen decoration unit (110), the expression decoration template DB (120), the interface unit (130), and the communication unit (140). That is, by controlling the signals transmitted and received between the component modules through the communication unit (140), the control unit (150) causes the screen decoration unit (110), the template DB (120), and the interface unit (130) each to perform its own function.

The overall process is described in more detail below with reference to FIGS. 2 and 3.

FIG. 2 illustrates an example of a face detection and face tracking method.

Referring to FIG. 2, when a person's face is detected during a video call made with a digital device, the detected face is tracked while the face is re-detected periodically or non-periodically, and face recognition technology can be applied to the face of the person included in the detected and/or tracked face region to determine the person's expression. The expression so determined is combined with the decorating effect image corresponding to it and displayed on the other party's terminal screen, as shown in FIGS. 3A and 3B described below.

Specifically, in the example of FIG. 2, the face region is processed every second during the video call: the face region detected at the 1-second mark is tracked at the 2-, 3-, and 4-second marks. At the 5-second mark the face region is detected again, and at the 6-second mark tracking proceeds on the basis of the region detected at the 5-second mark. Face detection and tracking are not, however, limited to this example.
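The timing in this example (detect at second 1, track at seconds 2 to 4, detect again at second 5, and so on) can be written as a simple periodic schedule. The sketch below is illustrative only: the function name `schedule` and the `redetect_every` parameter are assumptions, and the patent also allows non-periodic re-detection, which this sketch does not cover.

```python
def schedule(num_seconds, redetect_every=4):
    """Return one action ('detect' or 'track') per second, starting at second 1.

    `redetect_every` is a hypothetical parameter: a full detection runs at
    seconds 1, 1 + redetect_every, 1 + 2 * redetect_every, and so on.
    """
    actions = []
    for t in range(1, num_seconds + 1):
        # Full detection on the first second of each cycle; otherwise the
        # most recently detected region is tracked.
        actions.append("detect" if (t - 1) % redetect_every == 0 else "track")
    return actions


print(schedule(6))
# ['detect', 'track', 'track', 'track', 'detect', 'track']
```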

Various face recognition techniques may be applied to determine which template the expression of the person in the detected or tracked face region corresponds to. Examples include principal component analysis and linear discriminant analysis, and the related techniques disclosed in "Face Recognition: A Literature Survey" by W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips, published in ACM Computing Surveys in 2003, or in "Analysis of PCA-based and Fisher Discriminant-Based Image Recognition Algorithms" by W. S. Yambor, published as a Technical Report of the Computer Science Department of Colorado State University in 2000, may be considered.

Specifically, in performing expression recognition, the expression recognition rate can be increased by carrying out an expression matching process in which feature data for the eyes, nose, mouth, facial contour, and so on are compared with the feature data of each expression already stored in the expression decoration template DB (120).

An example of a technique related to face matching based on such facial features is "Lucas-Kanade 20 Years On: A Unifying Framework" by S. Baker and one other author, published in the International Journal of Computer Vision (IJCV) in 2004. That paper describes a method for efficiently detecting the position of the eyes in an image containing a person's face using template matching. The invention is not, however, limited to this example.

The face detection unit (110a) can estimate the positions of the nose and mouth from the eye positions detected by such a technique, the face tracking unit (110b) can track each estimated facial part periodically or non-periodically, and the expression recognition unit (110c) can, as needed, compare and match the detected or tracked eyes, nose, mouth, and other parts against the images of eyes, noses, mouths, and so on contained in the expression decoration template DB (120).

Here, as with searching for a face, each part such as the eyes, nose, and mouth can be searched for using a technique such as the "linear discriminant method" disclosed in "Eigenfaces vs. Fisherfaces: Recognition Using Class Specific Linear Projection" by P. N. Belhumeur and two others, published in IEEE Transactions on Pattern Analysis and Machine Intelligence in 1997.

The similarity of the overall expression can then be obtained from the similarities of the individual facial parts computed in this way, for example as a weighted sum. The weight of each part may be determined on the basis of its importance in human perception.
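The weighted-sum combination can be sketched as follows. This is a minimal illustration: the function name, the part names, and the example weights are hypothetical, and dividing by the total weight (so the result stays on the same scale as the per-part scores) is an added assumption rather than something the patent specifies.

```python
def overall_similarity(part_scores, weights):
    """Combine per-part similarities into one score by a weighted sum.

    part_scores: dict mapping a facial part name to its similarity.
    weights: dict mapping the same part names to non-negative weights.
    Normalizing by the total weight is an assumption of this sketch.
    """
    total_weight = sum(weights[part] for part in part_scores)
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    weighted = sum(weights[part] * score for part, score in part_scores.items())
    return weighted / total_weight


# Hypothetical values: eyes and mouth are weighted more heavily than the nose.
scores = {"eyes": 0.9, "nose": 0.5, "mouth": 0.8}
weights = {"eyes": 0.4, "nose": 0.2, "mouth": 0.4}
similarity = overall_similarity(scores, weights)  # about 0.78
```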

Methods other than combining per-part similarities are of course also possible. For example, rather than computing a similarity for each facial part and combining them, feature data may be extracted for the whole face, and the similarity may be determined by comparing that whole-face feature data.
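For the whole-face alternative, one common way to compare two feature vectors is cosine similarity. The patent does not name a specific measure, so the choice of cosine similarity, the function name, and the vector representation below are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Used here as a stand-in for 'comparing whole-face feature data';
    the patent does not specify the comparison measure.
    """
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)
```

The feature vectors themselves could come from a projection such as the PCA or linear discriminant methods cited in the description.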

FIG. 3 illustrates an example in which, according to an embodiment of the present invention, when the user's facial expression during a video call is the same as the facial expression of the model in a specific template among the plurality of templates recorded in the expression decoration template DB (120), the decorating effect image recorded for that template is synthesized in real time and provided on the other party's terminal screen.

Referring to the left-hand screens of FIGS. 3A and 3B, a user interface is provided that lets the user select and store, in advance, a decorating effect image to correspond to a template containing a specific facial expression. Specifically, FIG. 3A shows a case in which the decorating effect image "땀삐질" (a nervous sweat drop) has been selected and stored for a template containing an expression with the lips awkwardly pushed out, and FIG. 3B shows a case in which the decorating effect image "샤방" (sparkle) has been selected and stored for a template containing a brightly smiling expression.

Thereafter, when the user, during a video call with the other party, makes an expression corresponding to a specific template among the templates stored as described above, for example the expression with the lips awkwardly pushed out, the face detection unit (110a), the face tracking unit (110b), and/or the expression recognition unit (110c) determine which of the templates contains a facial expression within a similar range of the user's expression. As shown in the right-hand screen of FIG. 3A, the "땀삐질" decorating effect image stored for that template is then synthesized and provided on the other party's terminal screen. Likewise, when the user makes an expression corresponding to another template, for example a bright smile, the same units determine which template's facial expression it falls within a similar range of, and, as shown in the right-hand screen of FIG. 3B, the "샤방" decorating effect image is synthesized and provided on the other party's terminal screen.
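Deciding whether an expression falls "within a similar range" of a template can be sketched as picking the best-scoring template and accepting it only above a threshold. The function name, the threshold value of 0.8, and the template names are hypothetical; the patent does not state how the similar range is defined.

```python
def best_template(scores, threshold=0.8):
    """Return the name of the best-matching template, or None.

    scores: dict mapping a template name to its similarity with the
    user's current expression. `threshold` is an assumed cutoff that
    stands in for the patent's 'similar range'.
    """
    if not scores:
        return None
    name = max(scores, key=scores.get)
    return name if scores[name] >= threshold else None


# Hypothetical similarity scores for the two templates of FIGS. 3A and 3B.
match = best_template({"pout": 0.85, "smile": 0.40})  # "pout"
```

Returning `None` when no template clears the threshold corresponds to leaving the video undecorated.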

As mentioned above, these processes can be applied not only to video calls but also to the capture of video and of still images such as photographs. In particular, when capturing a still image such as a photograph, the device may be configured so that, once the facial expression displayed on the screen falls within the same or a similar range as the facial expression of the person in a specific template, the decorating effect image corresponding to that template is synthesized and the photograph is then taken automatically.

Embodiments of the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Although the present invention has been described above with reference to specific matters such as particular components, and to limited embodiments and drawings, these are provided only to aid a more general understanding of the invention. The invention is not limited to the above embodiments, and those of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description.

Accordingly, the spirit of the present invention should not be confined to the embodiments described, and not only the claims set out below but also everything equal or equivalent to those claims falls within the scope of the spirit of the invention.

FIG. 1 is a block diagram of an overall system (100) that, according to an embodiment of the present invention, automatically synthesizes a decorating effect image matching the expression of the person in conversation and displays it on the screen during a video call.

FIG. 3 illustrates an example, according to an embodiment of the present invention, in which, when the facial expression of a model included in a specific template among the plurality of templates recorded in the facial expression decoration template DB 120 is the same as the user's facial expression during a video call, the decorating effect image recorded in correspondence with that specific template is synthesized in real time and provided on the screen of the counterpart's terminal.

<Explanation of reference numerals for the main parts of the drawings>

110: screen decoration unit

110a: face detection unit

110b: face tracking unit

110c: facial expression recognition unit

120: facial expression decoration template DB

130: interface unit

140: communication unit

150: control unit

Claims

delete

A method of automatically synthesizing a decorating effect image with the face of a user of a digital device according to the user's facial expression, by referring to a facial expression decoration template DB in which a template for each facial expression of a model is stored in correspondence with a decorating effect image, the method comprising:

(a) detecting the face of the user photographed through a lens of the digital device by using a face detection technique, and tracking the detected face by using a face tracking technique;

(b) searching the templates for a specific template that includes a facial expression the same as or similar to the facial expression of the detected face or the tracked face; and

(c) synthesizing the decorating effect image corresponding to the specific template with the image of the detected face,

wherein step (c) comprises synthesizing the decorating effect image corresponding to the specific template with the image of the tracked face, and

wherein the correspondence between the templates and the decorating effect images in the facial expression decoration template DB can be edited by the user.
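The following Python sketch illustrates steps (b) and (c) together with the user-editable correspondence, under stated assumptions. It is not the patented implementation: the face detection and tracking of step (a) are assumed to have already produced a numeric expression descriptor, and the names `Template`, `TemplateDB`, `find`, `set_effect`, `select_effect`, the Euclidean distance measure, and the `max_distance` threshold are all hypothetical.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Template:
    """A model's facial expression, reduced to a hypothetical feature vector."""
    name: str
    features: Tuple[float, ...]


@dataclass
class TemplateDB:
    """Stores templates and the decorating effect image assigned to each one."""
    templates: Dict[str, Template] = field(default_factory=dict)
    effects: Dict[str, str] = field(default_factory=dict)  # template name -> effect image id

    def add(self, template: Template, effect: str) -> None:
        self.templates[template.name] = template
        self.effects[template.name] = effect

    def set_effect(self, template_name: str, effect: str) -> None:
        """User edit: change which effect image corresponds to a template."""
        if template_name not in self.templates:
            raise KeyError(template_name)
        self.effects[template_name] = effect

    def find(self, features: Tuple[float, ...],
             max_distance: float = 0.5) -> Optional[Template]:
        """Step (b): return the closest template, or None if none is similar enough."""
        best = min(self.templates.values(),
                   key=lambda t: dist(t.features, features),
                   default=None)
        if best is None or dist(best.features, features) > max_distance:
            return None
        return best


def select_effect(db: TemplateDB, features: Tuple[float, ...]) -> Optional[str]:
    """Step (c), reduced to choosing which effect image to synthesize."""
    template = db.find(features)
    return db.effects[template.name] if template else None
```

In a real pipeline the descriptor would come from the detected or tracked face in each frame, and the returned effect identifier would drive the actual image synthesis.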

The method of claim 5,

wherein the decorating effect image is synthesized by at least one of item overlay, brightness change, saturation change, application of a filter for special treatment such as posh processing, and enlargement or cropping of the face portion.
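Three of the listed operations (brightness change, saturation change, and item overlay) can be sketched in plain Python on a toy image representation. This is an illustrative sketch only: an image is assumed to be a list of rows of `(r, g, b)` tuples with values 0 to 255, transparent item pixels are assumed to be `None`, and all function names are hypothetical.

```python
import colorsys


def change_brightness(img, factor):
    """Scale every channel by `factor`, clamping the result to 0..255."""
    return [[tuple(min(255, max(0, round(c * factor))) for c in px) for px in row]
            for row in img]


def change_saturation(img, factor):
    """Scale the HLS saturation of every pixel by `factor`, clamped to 0..1."""
    out = []
    for row in img:
        new_row = []
        for r, g, b in row:
            h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
            s = min(1.0, max(0.0, s * factor))
            r2, g2, b2 = colorsys.hls_to_rgb(h, l, s)
            new_row.append((round(r2 * 255), round(g2 * 255), round(b2 * 255)))
        out.append(new_row)
    return out


def overlay_item(img, item, top, left):
    """Paste `item` onto a copy of `img` at (top, left).

    Item pixels equal to None are treated as transparent, and any part of
    the item that falls outside the image is ignored.
    """
    out = [list(row) for row in img]
    for dy, item_row in enumerate(item):
        for dx, px in enumerate(item_row):
            y, x = top + dy, left + dx
            if px is not None and 0 <= y < len(out) and 0 <= x < len(out[y]):
                out[y][x] = px
    return out
```

Each function returns a new image rather than modifying its input, so the effects can be chained in any order on a frame.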

The method of claim 6,

wherein step (b) comprises matching each part of the face of the model included in the template with each part of the face of the user.

The method of claim 7,

wherein step (b) comprises performing the search by comparing the position, shape, and size of each part of the face of the model included in the template with the position, shape, and size of each part of the face of the user.
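One way to picture this comparison is the sketch below. It is a hedged illustration, not the claimed method: each face part is assumed to be a bounding box normalized to the face region, shape is approximated by the box's aspect ratio and size by its area, and the names `FacePart`, `part_difference`, `expression_distance`, `best_template`, and the `threshold` value are hypothetical.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class FacePart:
    """Bounding box of one face part, normalized to the face region (0..1)."""
    cx: float      # center x
    cy: float      # center y
    width: float
    height: float


def part_difference(a: FacePart, b: FacePart) -> float:
    """Sum of the position, shape, and size differences between two parts."""
    position = hypot(a.cx - b.cx, a.cy - b.cy)
    shape = abs(a.width / a.height - b.width / b.height)  # aspect ratio as a shape proxy
    size = abs(a.width * a.height - b.width * b.height)   # area as a size proxy
    return position + shape + size


def expression_distance(user: dict, model: dict) -> float:
    """Average part difference over the parts present in both faces."""
    shared = user.keys() & model.keys()
    if not shared:
        raise ValueError("no face parts in common")
    return sum(part_difference(user[k], model[k]) for k in shared) / len(shared)


def best_template(user: dict, templates: dict, threshold: float = 0.5):
    """Return the name of the closest template, or None if none is within threshold."""
    best_name, best_score = None, threshold
    for name, model in templates.items():
        score = expression_distance(user, model)
        if score <= best_score:
            best_name, best_score = name, score
    return best_name
```

For example, with templates keyed by `"mouth"` alone, a user whose mouth box is wide and flat would land closer to a smiling template than to a neutral one. Equal weighting of the three terms is an arbitrary choice made here for brevity.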

The method of claim 8,

wherein each part of the face comprises at least one of the eyes, the nose, and the mouth.

The method of claim 5,

wherein the digital device is a device capable of video calling.

The method of claim 10,

further comprising displaying the synthesized image on a screen of a terminal of a counterpart who makes a video call with the user.

The method of claim 11,

wherein the tracking is performed using the digital device until the video call is terminated.

The method of claim 12,

wherein, in step (c), the decorating effect image is synthesized in real time during the video call.

The method of claim 5,

wherein the digital device is a device for photographing still images.

The method of claim 14,

wherein step (c) comprises synthesizing the decorating effect image and automatically generating the still image when the facial expression of the detected face or the tracked face falls within the same or a similar range as the facial expression of the model included in the template.

The method of claim 5,

wherein the digital device is a device for shooting video.

The method of claim 16,

wherein, in step (c), the decorating effect image is synthesized in real time while the digital data of the moving picture is being generated.

A computer-readable medium for recording a computer program for executing the method according to any one of claims 5 to 17.