KR101385373B1 - Method for face detection-based hand gesture recognition

Info

Publication number: KR101385373B1
Application number: KR1020120130309A
Authority: KR
Inventors: 고재필; 강전학
Original assignee: 강전학; 금오공과대학교 산학협력단
Priority date: 2012-11-16
Filing date: 2012-11-16
Publication date: 2014-04-29

Abstract

The present invention relates to a method for recognizing hand gestures based on face detection. The method comprises: detecting a face; modeling skin color from the detected face; calculating distance information from a sensor; and detecting a hand region using the skin color model and the distance information. According to the present invention, hand gesture recognition achieves a high accuracy rate even when the background color is similar to the skin color. [Reference numerals] (S10) Detecting a face; (S20) Modeling skin colors; (S30) Calculating the distance from a sensor; (S40) Detecting a hand; (S50) Detecting a gesture section; (S60) Recognizing a gesture

Description

{Method for face detection-based hand gesture recognition}

The present invention relates to a face detection-based hand gesture recognition method for color and distance images, and in particular to a rule-based recognition method that detects the hand region by combining real-time skin color modeling through face detection with distance information, and that recognizes direction gestures and circle gestures from the movement of the hand.

In general, hand gesture recognition is an actively studied topic in the field of computer vision. Recently, research has been conducted using the real-time distance information provided by Microsoft's Kinect.

Although many input interfaces using hand gestures have been studied, most are posture-based studies that detect the hand region and recognize the shape of the hand. Many coordinate-based methods that recognize gestures by tracking hand movement have also been studied, but most of them track only the palm or a specific hand shape in a limited way and perform recognition with a Hidden Markov Model (HMM). There are also methods that recognize gestures by detecting the hand region based on a skin color model using a 2D camera, but these are sensitive to lighting and have difficulty detecting the hand region when the background is similar in color to the skin.

Accordingly, the present invention has been devised to solve the problems of the prior art described above, and an object of the present invention is to provide a method of recognizing hand gestures by combining distance information with the skin color model used in 2D camera-based methods.

Another object of the present invention is to provide a method of setting the gesture section using the difference between the hand coordinates in the previous frame and those in the current frame, instead of using the hand coordinates themselves.

According to a feature of the present invention for achieving the above objects, the present invention includes a face detection step, a skin color modeling step through the face detection, a step of calculating distance information from a sensor, and a hand region detection step using the skin color model and the distance information.

As described above, the following effects can be expected from the configuration of the present invention.

First, unlike conventional general-purpose skin color models, the skin color model is generated in real time through face detection at the time of the gesture, which prevents recognition errors caused by individual differences in skin color or by the level of illumination.

Second, because the hand is detected by combining distance information with the skin color model, hand gestures are recognized with high accuracy.

Third, because gestures are recognized through coordinate differences rather than coordinate changes, the rules for recognizing gestures are greatly simplified.

FIG. 1 is a conceptual diagram illustrating a non-contact smart device control system according to the present invention.
FIG. 2 is a flowchart illustrating a face detection-based hand gesture recognition method according to the present invention.
FIG. 3A is an image of a face detection result using Haar-like features according to the present invention.
FIG. 3B is a skin color detection image using the two color components Cb and Cr and a GMM according to the present invention.
FIGS. 4A to 4D are graphs showing the difference values of the x and y coordinates for each gesture according to the present invention.
FIG. 5 is a graph comparing the recognition rates of 2D and 3D cameras for each gesture according to the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent with reference to the embodiments described in detail below together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and so that the scope of the invention is fully conveyed to those of ordinary skill in the art to which the invention pertains; the present invention is defined only by the scope of the claims. In the drawings, the sizes and relative sizes of layers and regions may be exaggerated for clarity of description. Like reference numerals refer to like elements throughout the specification.

The embodiments described herein will be described with reference to plan views and cross-sectional views, which are idealized schematic views of the present invention. The forms of the illustrations may therefore be modified by manufacturing techniques and/or tolerances. Accordingly, the embodiments of the present invention are not limited to the specific forms shown, but also include changes in form produced by the manufacturing process. The regions illustrated in the drawings thus have schematic attributes, and their shapes are intended to illustrate specific forms of regions of a device, not to limit the scope of the invention.

Hereinafter, a preferred embodiment of the face detection-based hand gesture recognition method according to the present invention, having the configuration described above, will be described in detail with reference to the accompanying drawings.

Referring to FIG. 1, the non-contact smart device control system 100 of the present invention includes a sensing unit 110 that senses hand gestures in a non-contact manner, a recognition unit 120 that analyzes the sensing signal transmitted from the sensing unit 110 and generates recognition information, and a control unit 130 that uses the recognition information to control video 142, audio 144, the on/off state 146 of the smart device, and the like.

The sensing unit 110 includes at least one sensing device capable of sensing hand gestures in order to provide a non-contact user interface. The sensing device may include a scanner or a camera (an ordinary camera as well as a depth camera); hereinafter the sensing device is referred to as a "camera" for convenience. The sensing unit 110 acquires the non-contact sensing signal generated by the camera and transmits it to the recognition unit 120. Here, the non-contact sensing signal is a signal corresponding to the response intensity of the sensor to up, down, left, right, and circle gestures of the hand. The sensing signal of the present invention therefore includes a signal corresponding to a hand gesture. The sensing unit 110 senses signals for operation control, such as hand gesture signals, and transmits them to the recognition unit 120.

The recognition unit 120 analyzes the non-contact sensing signal transmitted from the sensing unit 110 and generates recognition information. For example, it may generate non-contact recognition information by estimating the movement direction of the hand gesture (up, down, left, right, near, far, and so on) and the speed of the hand gesture using pre-stored pattern information and image analysis information. The non-contact recognition information generated by the recognition unit 120 is transmitted to the control unit 130. For example, when the hand moves up or to the right, the recognition unit may regard the movement as satisfying the pre-stored near/far information, generate recognition information, and transmit it to the control unit 130.

Upon receiving the non-contact recognition information, the control unit 130 starts operation control. For example, depending on the content of the received non-contact recognition information, it may operate the video 142, change the audio 144, or turn the smart device on or off 146.

Hereinafter, the face detection-based hand gesture recognition method in the non-contact smart device control system will be described in detail with reference to the drawings.

Referring to FIG. 2, the face detection-based hand gesture recognition method of the present invention includes a face detection step (S10), a skin color modeling step through the face detection (S20), a step of calculating distance information from a sensor (S30), a hand region detection step using the skin color model and the distance information (S40), a gesture section detection step (S50), and a gesture recognition step (S60).

Step 1: Face detection step (S10)

An embodiment of the present invention is characterized in that face detection is used to generate the skin color model. Skin color model-based methods have the drawback of being sensitive to lighting. To compensate for this, the face is detected first, so that the model can then be generated in real time from the skin color at the time the gesture is made.

Face detection may use Haar-like features. For example, features may be extracted using Haar-like filters, a face model may be generated using the AdaBoost learning algorithm, and the face may then be detected. The face may also be detected by various other methods that use pixel brightness values. FIG. 3A shows an image of a face detection result using Haar-like features.
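To make the notion of a Haar-like feature concrete, the following is a minimal sketch of how one such feature can be evaluated with an integral image (summed-area table). It is illustrative only: the function names are hypothetical, only a single two-rectangle feature is shown, and the AdaBoost training and cascade that a real detector relies on are not included.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of numbers."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_vertical_feature(ii, x, y, w, h):
    """Haar-like edge feature: sum of the top half minus sum of the bottom half.

    h is assumed to be even so that both halves have the same size.
    """
    half = h // 2
    top = rect_sum(ii, x, y, w, half)
    bottom = rect_sum(ii, x, y + half, w, half)
    return top - bottom


if __name__ == "__main__":
    # Dark band above a bright band, e.g. an eye region above a cheek region.
    patch = [[1, 1], [1, 1], [5, 5], [5, 5]]
    ii = integral_image(patch)
    print(two_rect_vertical_feature(ii, 0, 0, 2, 4))
```

Because every rectangle sum costs four table lookups regardless of its size, many such features can be evaluated per window, which is what makes this family of detectors practical in real time.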

Step 2: Skin color modeling step (S20)

The detected face region does not contain skin color alone; it also contains regions, such as the eyebrows and lips, that are much brighter or darker than the skin. After these non-skin colors are excluded from the face, the region is converted to the Luma, Blue Chroma, Red Chroma (YCbCr) color model and the skin color model is generated. Skin color modeling thus uses the two color components blue chroma (Cb) and red chroma (Cr), and the skin color model is generated using a Gaussian Mixture Model (GMM). FIG. 3B shows a skin color detection image using the two components Cb and Cr and the GMM.
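The modeling step can be sketched as follows. This is a simplified illustration, not the method of the specification: it fits a single 2D Gaussian over (Cb, Cr) instead of a full Gaussian Mixture Model, uses the full-range BT.601 (JPEG) conversion formulas, and the function names, the regularization term `reg`, and the distance threshold `max_dist_sq` are all assumptions made for the example.

```python
def rgb_to_cbcr(r, g, b):
    """Convert 8-bit RGB to (Cb, Cr) with the full-range BT.601 (JPEG) formulas."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def fit_skin_model(face_pixels, reg=1.0):
    """Fit one 2D Gaussian to the (Cb, Cr) values of face pixels.

    A single Gaussian stands in for the GMM here; reg is added to the
    variances so the covariance matrix is always invertible.
    """
    samples = [rgb_to_cbcr(*p) for p in face_pixels]
    n = len(samples)
    m_cb = sum(s[0] for s in samples) / n
    m_cr = sum(s[1] for s in samples) / n
    v_cb = sum((s[0] - m_cb) ** 2 for s in samples) / n + reg
    v_cr = sum((s[1] - m_cr) ** 2 for s in samples) / n + reg
    cov = sum((s[0] - m_cb) * (s[1] - m_cr) for s in samples) / n
    return (m_cb, m_cr), (v_cb, cov, v_cr)


def is_skin(pixel, model, max_dist_sq=9.0):
    """Classify a pixel as skin if its squared Mahalanobis distance is small."""
    (m_cb, m_cr), (a, b, d) = model
    cb, cr = rgb_to_cbcr(*pixel)
    dx, dy = cb - m_cb, cr - m_cr
    det = a * d - b * b
    dist_sq = (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return dist_sq <= max_dist_sq


if __name__ == "__main__":
    face = [(200, 150, 130), (210, 150, 120), (190, 145, 135), (205, 160, 140)]
    model = fit_skin_model(face)
    print(is_skin((200, 150, 130), model), is_skin((0, 0, 255), model))
```

Dropping the luma component Y and modeling only Cb and Cr is what gives the model some tolerance to brightness changes; refitting it from the detected face in each session addresses the remaining dependence on the user and the lighting.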

Step 3: Step of measuring the distance from the sensor (S30)

Using the skin color model generated through face detection, skin-colored regions can be detected effectively. However, as FIG. 3B shows, the skin color model alone cannot easily distinguish skin-colored regions in the background (for example, a skin-colored paper box) from the hand region. A sensor for distance estimation can therefore be used to distinguish them accurately. The sensor may include a depth camera (depth sensing camera); for example, Microsoft's Kinect sensor may be used as the depth camera. The depth camera projects laser or infrared light onto the face, the background, and the hand, captures the returning light, and thereby calculates distance information for each of them. The Kinect sensor provides joint positions in addition to distance, but the computation time for joint position detection lowers the frame processing rate and makes it unsuitable for real-time processing. The present invention may therefore exclude the joint position information and use only the distance information.

Step 4: Hand region detection step (S40)

Under the assumption that the hand moves in front of the face when a person makes a gesture, the hand region can be defined using distance information as in the following equation.

(1)

Here, Dface denotes the distance between the sensor and the face, and the hand region can be extracted using a t value of 0.1, which was determined experimentally.
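The exact form of equation (1) does not appear in this text, so the following is only a sketch under a stated assumption: a pixel is treated as part of the hand when it is skin-colored and its distance is smaller than Dface by at least the margin t. The subtractive form, the units (the same as the depth values), and the function name `hand_mask` are assumptions for illustration.

```python
def hand_mask(skin_mask, depth, d_face, t=0.1):
    """Return a boolean mask of pixels that are skin-colored and nearer than the face.

    skin_mask: 2D list of truthy/falsy values from the skin color model.
    depth:     2D list of distances from the sensor (None where unknown).
    d_face:    distance between the sensor and the face (Dface).
    t:         margin; the condition depth < d_face - t is an assumed form.
    """
    result = []
    for skin_row, depth_row in zip(skin_mask, depth):
        row = []
        for is_skin_pixel, d in zip(skin_row, depth_row):
            row.append(bool(is_skin_pixel) and d is not None and d < d_face - t)
        result.append(row)
    return result


if __name__ == "__main__":
    skin = [[1, 1, 0]]
    dist = [[0.8, 1.2, 0.7]]
    # First pixel: skin and in front of the face. Second: skin at face depth.
    # Third: in front of the face but not skin.
    print(hand_mask(skin, dist, d_face=1.2))
```

Requiring both conditions is what removes skin-colored background objects, which lie behind the face, and the face itself, which lies at Dface.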

In addition, to extract the center point of the hand, the detected hand region is binarized and projected onto each of the two image axes to obtain Px and Py, after which the center point (x', y') of the hand can be obtained as in the following equations.

(2)

(3)

Here, w and h are the width and height of the image, and N is the sum of the values projected onto each of the x and y axes.
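Equations (2) and (3) are not reproduced in this text. One common reading of the description, assumed here, is that Px and Py are the column and row counts of hand pixels and that the center is the projection-weighted mean coordinate. The sketch below follows that reading; the function name `hand_center` is hypothetical.

```python
def hand_center(mask):
    """Return the (x', y') center of a binary hand mask, or None if it is empty.

    Px[x] counts hand pixels in column x and Py[y] counts them in row y;
    N is the total number of hand pixels (the sum of either projection).
    """
    h, w = len(mask), len(mask[0])
    px = [sum(1 for y in range(h) if mask[y][x]) for x in range(w)]
    py = [sum(1 for x in range(w) if mask[y][x]) for y in range(h)]
    n = sum(px)
    if n == 0:
        return None
    cx = sum(x * px[x] for x in range(w)) / n
    cy = sum(y * py[y] for y in range(h)) / n
    return cx, cy


if __name__ == "__main__":
    mask = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
    ]
    print(hand_center(mask))
```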

Step 5: Gesture section detection step (S50)

Setting the movement section is a very important part of gesture recognition, and the recognition rate varies greatly depending on how this section is set. A natural gesture is defined as one in which the frame-to-frame difference in hand movement increases and then decreases. The difference in hand movement is calculated from the center coordinates of the hand region in the previous frame and the current frame. The obtained coordinate difference values are used to determine the start and end of the gesture and can also serve as information for recognizing the gesture.

For example, when hand movement occurs, the difference between the hand center coordinates of the previous frame and those of the current frame is inserted into a queue of size N. When the difference values collected in the queue increase so that their variance rises above a certain level, this is regarded as the start of the gesture; when the movement decreases and the variance falls below a certain level, this is regarded as the end of the gesture. The lists of coordinate differences between the start and the end of the gesture are defined as βx and βy as follows, and gesture recognition is performed on them.

(4)

(5)

Here, n is the number of frames in the section in which the hand moved.
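The queue-based segmentation described above can be sketched as follows. The text does not give the queue size or the variance thresholds, nor does it say how the x and y variances are combined, so `queue_size`, `start_var`, `end_var`, and the use of the summed variance are all assumptions; the class name `GestureSegmenter` is hypothetical.

```python
from collections import deque
from statistics import pvariance


class GestureSegmenter:
    """Detect gesture start and end from the variance of frame-to-frame differences."""

    def __init__(self, queue_size=5, start_var=4.0, end_var=1.0):
        self.window = deque(maxlen=queue_size)  # most recent (dx, dy) differences
        self.start_var = start_var              # assumed threshold for gesture start
        self.end_var = end_var                  # assumed threshold for gesture end
        self.prev = None
        self.active = False
        self.beta_x, self.beta_y = [], []

    def update(self, center):
        """Feed one hand center; return (beta_x, beta_y) when a gesture ends, else None."""
        if self.prev is None:
            self.prev = center
            return None
        dx, dy = center[0] - self.prev[0], center[1] - self.prev[1]
        self.prev = center
        self.window.append((dx, dy))
        if len(self.window) < 2:
            return None
        # Assumed combination: sum of the variances along both axes.
        spread = pvariance([d[0] for d in self.window]) + pvariance([d[1] for d in self.window])
        if not self.active and spread >= self.start_var:
            self.active = True
            self.beta_x, self.beta_y = [], []
        if self.active:
            self.beta_x.append(dx)
            self.beta_y.append(dy)
            if spread <= self.end_var:
                self.active = False
                return self.beta_x, self.beta_y
        return None


if __name__ == "__main__":
    xs = [0, 0, 0, 0, 0, 2, 6, 12, 18, 22, 24, 24, 24, 24, 24, 24, 24]
    seg = GestureSegmenter()
    for x in xs:
        out = seg.update((x, 0))
        if out is not None:
            print(out)
```

With these example thresholds the first slow frames of the motion fall before the detected start, which illustrates why the text stresses that the choice of section strongly affects the recognition rate.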

Step 6: Gesture recognition step (S60)

Recognition of up, down, left, and right uses the difference values of the hand center coordinates. The direction is determined by calculating the ratio of the variances of βx and βy, the horizontal and vertical components of the hand movement. Equation (6) below is used to identify up and down, and equation (7) to identify left and right.

(6)

(7)

Here, σ²(βx) and σ²(βy) denote the variance along each axis, and θ₁ and θ₂ denote the thresholds on the variance ratio. When equation (6) or (7) is satisfied, the sign of the sum of all values in the corresponding set, βx or βy, is examined: if it is positive, the gesture is recognized as down (6) or left (7); if it is negative, it is recognized as up (6) or right (7).
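A sketch of this rule is given below. Equations (6) and (7) are not reproduced in this text, so the sketch assumes that (6) tests σ²(βy)/σ²(βx) against θ₁ and (7) tests σ²(βx)/σ²(βy) against θ₂; the threshold values, the small `eps` guard against division by zero, and the function name are assumptions. The sign convention follows the description: a positive sum means down or left, a negative sum means up or right.

```python
from statistics import pvariance


def classify_direction(beta_x, beta_y, theta1=3.0, theta2=3.0, eps=1e-9):
    """Return 'up', 'down', 'left', 'right', or None from coordinate differences."""
    var_x = pvariance(beta_x)
    var_y = pvariance(beta_y)
    # Assumed form of (6): vertical motion dominates.
    if var_y / (var_x + eps) > theta1:
        return "down" if sum(beta_y) > 0 else "up"
    # Assumed form of (7): horizontal motion dominates.
    if var_x / (var_y + eps) > theta2:
        return "left" if sum(beta_x) > 0 else "right"
    return None


if __name__ == "__main__":
    print(classify_direction([6, 6, 4, 2, 0, 0, 0, 0], [0] * 8))
    print(classify_direction([0, 0, 0, 0], [-2, -5, -6, -3]))
```

The rule needs only two variances and one sum per gesture, which is the simplification over trajectory models such as HMMs that the description refers to.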

FIGS. 4A to 4D are graphs of the difference values of the x and y coordinates for each gesture. Blue represents x and red represents y. In FIGS. 4A and 4B the difference values of the y coordinate vary strongly, while in FIGS. 4C and 4D the difference values of the x coordinate vary strongly.

In addition, a circle is recognized only when n, the number of frames in the gesture section, is at least a certain value. Because coordinate differences cannot be used for circle recognition, the hand center coordinate values are used as they are. Radius information is obtained as follows, and the gesture is recognized as a circle if equation (10) is satisfied.

(8)

(9)

(10)

Here, x and y denote the respective coordinate values.
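Equations (8) to (10) are not reproduced in this text, so the criterion below is a stand-in rather than the test defined in the specification. It assumes that the radius information is the distance of each hand center from the mean of the trajectory, and accepts the trajectory as a circle when those radii are nearly constant. The names `is_circle`, `min_frames`, and `max_rel_spread` and their values are hypothetical.

```python
import math
from statistics import mean, pstdev


def is_circle(points, min_frames=12, max_rel_spread=0.2):
    """Heuristic circle test on raw hand center coordinates.

    points: list of (x, y) hand centers within the gesture section.
    Returns True when there are enough frames and the radii around the
    trajectory's mean point vary little relative to their mean.
    """
    if len(points) < min_frames:
        return False
    cx = mean(p[0] for p in points)
    cy = mean(p[1] for p in points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_mean = mean(radii)
    if r_mean == 0:
        return False
    return pstdev(radii) / r_mean <= max_rel_spread


if __name__ == "__main__":
    circle = [(50 + 10 * math.cos(2 * math.pi * k / 16),
               50 + 10 * math.sin(2 * math.pi * k / 16)) for k in range(16)]
    line = [(k, 0) for k in range(16)]
    print(is_circle(circle), is_circle(line))
```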

Hereinafter, experimental results on the recognition rate of the face detection-based hand gesture recognition are described.

The experiment was conducted on Windows 7 with an Intel Core2 Duo 3.0 GHz processor, 3 GB of memory, a Kinect, and Visual C++ 2010 with OpenCV 2.2 installed. A total of five subjects each performed the five gestures 10 times, and the resulting recognition rate was 97%. <Table 1> below shows the number of recognized gestures per subject and the recognition rates.

Subject            Left   Right   Up     Down   Circle
Subject 1          10     10      9      10     10
Subject 2          10     9       10     10     9
Subject 3          9      10      10     10     10
Subject 4          10     10      10     9      9
Subject 5          10     9       10     10     10
Recognition rate   98%    96%     98%    98%    96%
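The per-gesture rates and the overall rate follow directly from the counts in Table 1, with 10 trials per subject and gesture. The short check below recomputes them; the variable and function names are chosen for the example.

```python
GESTURES = ["left", "right", "up", "down", "circle"]
TRIALS_PER_CELL = 10

# Recognized gestures per subject, in the column order of GESTURES.
COUNTS = {
    "Subject 1": [10, 10, 9, 10, 10],
    "Subject 2": [10, 9, 10, 10, 9],
    "Subject 3": [9, 10, 10, 10, 10],
    "Subject 4": [10, 10, 10, 9, 9],
    "Subject 5": [10, 9, 10, 10, 10],
}


def per_gesture_rates(counts, trials):
    """Recognition rate in percent for each gesture column."""
    subjects = len(counts)
    return [100 * sum(col) / (subjects * trials) for col in zip(*counts.values())]


def overall_rate(counts, trials):
    """Recognition rate in percent over all subjects and gestures."""
    total = sum(sum(row) for row in counts.values())
    cells = sum(len(row) for row in counts.values())
    return 100 * total / (cells * trials)


if __name__ == "__main__":
    for name, rate in zip(GESTURES, per_gesture_rates(COUNTS, TRIALS_PER_CELL)):
        print(f"{name}: {rate:.0f}%")
    print(f"overall: {overall_rate(COUNTS, TRIALS_PER_CELL):.1f}%")
```

The overall figure is 243 recognized gestures out of 250, which the text rounds to 97%.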

There appears to be no significant difference between subjects. For comparison, the experiment was repeated under the same conditions without distance information, using a Microsoft LifeCam Cinema as the 2D camera. FIG. 5 is a graph comparing the recognition rates of the 2D and 3D cameras for each gesture. With the 2D camera, every gesture except the up gesture showed a lower recognition rate than with the 3D camera, and the overall recognition rate was 94%, about 3 percentage points lower than with the 3D camera.

As described above, the technical idea of the present invention is a configuration that models skin color in real time through face detection and combines it with distance information to detect the hand region. Within the scope of this basic technical idea, many other modifications will be possible for those of ordinary skill in the art.

110: sensing unit 120: recognition unit
130: control unit

Claims

A face detection-based hand gesture recognition method comprising:
detecting a face at the time of a hand gesture;
detecting a hand based on the face detection; and
recognizing the hand gesture,
wherein the detecting of the hand generates a skin color model in real time through the face detection at the time of the hand gesture and extracts a hand region using the skin color model,
wherein the method further comprises, before the detecting of the hand, obtaining distance information using a depth camera, and
wherein the detecting of the hand extracts the hand region by combining the skin color model and the distance information.

delete

The method of claim 1,
wherein the recognizing of the hand gesture uses a difference value between the hand center coordinates of a previous frame and the hand center coordinates of a current frame.

A face detection-based hand gesture recognition method comprising:
a face detection step;
a skin color modeling step through the face detection;
a step of calculating distance information from a sensor; and
a hand region detection step using the skin color model and the distance information,
wherein the hand region detection step extracts the hand region by distinguishing the hand from the face or the background through the distance information from the sensor.

The method of claim 5,
wherein the face detection step uses Haar-like features.

The method of claim 5,
wherein the skin color modeling step generates the skin color model by converting colors to the YCbCr color model.

The method of claim 5,
wherein, in the step of calculating distance information from the sensor,
the distance information is calculated by Equation 2, wherein Dface is the distance between the sensor and the face, and the t value is 0.1.

delete

The method of claim 5,
further comprising a gesture section detection step,
wherein the gesture section detection step calculates the difference between the center coordinates of the hand region in the previous frame and in the current frame, and determines the start and end of the gesture using the obtained coordinate difference value.