KR101559502B1

KR101559502B1 - Method and recording medium for contactless input interface with real-time hand pose recognition

Info

Publication number: KR101559502B1
Application number: KR1020130163827A
Authority: KR
Inventors: 김태영; 나민영
Original assignee: 서경대학교 산학협력단
Priority date: 2013-12-26
Filing date: 2013-12-26
Publication date: 2015-10-14
Also published as: KR20150075648A

Abstract

The present invention relates to a contactless input interface method, and a recording medium therefor, that can recognize a hand pose in real time using only a camera and can further register a virtual keyboard onto the user's hand so that the desired information can be entered by clicking the virtual keyboard with a finger.
The contactless input interface method through real-time hand pose recognition according to the present invention comprises: receiving an image including a hand from a camera; detecting outline information of the hand in the current frame by differentiating depth values; calculating rotation angle information of the hand in the current frame by taking two arbitrary points; extracting center point information of the hand in the current frame using a distance map; extracting, in the current frame, information on a plurality of point components that correspond to the fingers of the hand and are spaced apart from one another, using circles expanding from the center point; and recognizing the hand pose of the current frame by matching hand information that includes at least the plurality of point component information obtained in the current frame against hand information obtained in a previous frame.

Description

Method and recording medium for contactless input interface with real-time hand pose recognition

The present invention relates to a hand pose recognition method and, more particularly, to a contactless input interface method through real-time hand pose recognition, and a recording medium, that can recognize a user's hand pose in real time more accurately and quickly while also detecting clicks on a virtual keyboard.

Advances in sensors, computing power, and interface technology are making smart interaction between people and digital devices possible. In particular, the development of glasses-type wearable computing has driven demand for, and interest in, a next-generation interface that can replace the touch screen and suit the wearable devices that follow the smartphone, and research is gradually shifting from device-based input toward intuitive motion recognition. In line with this trend, there is a need to develop user-friendly interfaces that can convey information without constraining the user's actions, sight, or hearing.

Prior art for intuitive, user-friendly interfaces that require no learning includes multi-touch interfaces, sensor-based motion recognition interfaces, and vision-based contactless interfaces. However, multi-touch interfaces and sensor-based approaches require separate equipment, which makes them hard to apply to the wearable device environment described above. Vision-based contactless interfaces, by contrast, dispense with hardware such as sensors and recognize hand poses or motions without contact, so they can be applied to diverse fields such as education, games, and presentations; accordingly, a variety of studies have been attempted.

Prior art using vision-based hand recognition is as follows. Liu Yun et al. proposed a method that recognizes hand poses using multi-feature information and template matching. As preprocessing, the method stores four features (Hu moments, angle count, skin color angle, non-skin color angle) for each hand pose image; it then detects the skin region in the input image, extracts the features of that region, and recognizes the hand pose by template matching against the stored features. The method is robust to illumination changes and hand rotation, but as the number of poses to be recognized grows, the amount of training increases and matching takes longer, which makes real-time recognition difficult.

As another prior art, Chris Harrison proposed a technique that obtains a segmented hand image from the derivative image of a depth camera, obtains an approximate hand position through template matching, and then finds the fingers by converting pixel distances to real-world distances. Because the method corrects the fingertip positions with a Kalman filter, it can be regarded as an input interface suited to wearable devices; however, it supports only fingertip-based input, so recognizing diverse hand poses is difficult.

The present invention has been devised to solve the above problems. An object of the present invention is to provide a contactless input interface method and a recording medium that, in the coming glasses environment, can recognize hand poses in real time using only the glasses and a camera, without separate equipment such as a keyboard or mouse, and that can further register a virtual keyboard onto the user's hand so that the desired information can be entered by clicking the virtual keyboard with a finger.

Another object of the present invention is to provide a contactless input interface method and a recording medium that simplify the complex algorithms of conventional hand pose recognition techniques, greatly improving processing speed, and can therefore run in real time even on low-specification hardware.

To achieve the above objects, the contactless input interface method through real-time hand pose recognition according to the present invention comprises: receiving an image including a hand from a camera; detecting outline information of the hand in the current frame by differentiating depth values; calculating rotation angle information of the hand in the current frame by taking two arbitrary points; extracting center point information of the hand in the current frame using a distance map; extracting, in the current frame, information on a plurality of point components that correspond to the fingers of the hand and are spaced apart from one another, using circles expanding from the center point; and recognizing the hand pose of the current frame by matching hand information that includes at least the plurality of point component information obtained in the current frame against hand information obtained in a previous frame.

According to an extended embodiment of the present invention, the method further comprises distinguishing the right hand and the left hand in the input image using the outline information and the Otsu algorithm. In this case, the plurality of point components include at least one joint point corresponding to each finger joint of the hand and an end point corresponding to the tip of each finger, and the method further comprises: mapping a virtual keyboard onto the right or left hand using the hand information including the plurality of point components; and, when the other hand (left or right) clicks the virtual keyboard mapped onto the right or left hand, detecting the click using the end point information.

According to the contactless input interface method through real-time hand pose recognition and the recording medium of the present invention, hand recognition is performed by a hand-model-based matching method that uses finger joints and end points. Unlike the prior art, diverse hand poses can therefore be recognized even when the hand moves, rotates, or tilts; no initial data training is required; the algorithm is comparatively simple, so processing is faster; and accuracy (the hand pose recognition rate) is also improved.

Specifically, the method showed an average hand pose recognition rate of 95% or more and an accuracy of 80% or more for virtual keyboard input. It also ran at about 32 fps, showing that real-time processing is possible.

Accordingly, in addition to strong real-time hand pose recognition performance, input through virtual keyboard registration is possible in a glasses environment without a separate device, so the method can be used in a variety of applications.

FIG. 1 is a block flowchart showing the processing flow of the contactless input interface method through real-time hand pose recognition according to the present invention.
FIG. 2 is an example showing the hand outline detected by the image segmentation unit of the present invention.
FIG. 3 is an example showing a hand segmented into right and left hands according to the present invention.
FIG. 4 is an example showing line fitting according to the present invention.
FIG. 5 is an example showing the hand rotation angle calculated according to the present invention.
FIG. 6 is an example showing the hand center point extracted according to the present invention.
FIG. 7 is a block flowchart showing the detailed flow of the point component extraction method according to the present invention.
FIGS. 8(a) to 8(e) are examples showing the processing result of each step of FIG. 7.
FIG. 9 is a block flowchart showing the detailed flow of the hand pose recognition method according to the present invention.
FIG. 10 is an example showing a hand pose being recognized through matching with the hand model of the present invention.
FIG. 11 is an example showing the virtual keyboard of the present invention mapped onto the user's hand.
FIG. 12 is a diagram for explaining user click detection through the FCD technique of the present invention.
FIG. 13 is an example of user interface definitions according to various hand poses.
FIG. 14 is an example showing a screen being operated using the contactless input interface method through real-time hand pose recognition of the present invention.
FIG. 15 shows comparison data for the hand pose recognition rates of a conventional method and the present invention.
FIG. 16 shows experimental data on the virtual keyboard input success rate according to the present invention.

The terms used in this specification are used only to describe particular embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "comprise" or "have" are intended to specify the presence of the features, numbers, steps, operations, elements, parts, or combinations thereof described in the specification, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

In addition, terms such as first and second may be used in this specification to describe various elements, but the elements should not be limited by these terms. The terms are used only to distinguish one element from another.

Hereinafter, preferred embodiments, advantages, and features of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block flowchart showing the processing flow of the contactless input interface method through real-time hand pose recognition according to the present invention.

Referring to FIG. 1, the contactless input interface method through real-time hand pose recognition of the present invention comprises a camera image input step (step 1), a step of detecting outline information of the hand by differentiating depth values (step 2), a step of calculating rotation angle information of the hand (step 3), a step of extracting center point information of the hand (step 4), a step of extracting point component information of the fingers (step 5), and a step of recognizing the hand pose through matching with a hand model (step 6).

The image of step 1 includes a hand, and the hand image may show both the right hand and the left hand; in that case, a step of distinguishing the right hand from the left hand (step 2-1) may further be included between step 2 and step 3. When step 2-1 is included, steps 3 through 6-1 are performed on at least one of the right and left hands (in particular, the right hand) or on both.

According to an extended embodiment of the present invention, the method further comprises a step of mapping a virtual keyboard based on the hand information obtained in steps 2 through 5 described above (step 7) and a step of detecting a click when a finger clicks the virtual keyboard (step 8).

Hereinafter, each step is described in more detail with reference to preferred embodiments.

<Step 1 (S10)>

Step 1 of the present invention is a step in which a camera (preferably a depth camera) images a subject including the user's hand and the resulting image is received as input. The input image therefore shows the background as well as the hand to be recognized. The hand appearing in the input image may be either hand (left or right) or both hands.

<Step 2 (S20)>

As described above, the image input through the camera in step 1 shows the hand together with the background, so the hand region must be separated from the background in order to recognize the hand pose accurately.

Step 2 of the present invention separates the hand region from the background mixed in the input image; it detects the outline information of the hand shown in the current frame by differentiating the depth values of the current frame.

More specifically, depth values are obtained using a Kinect, and the boundary of the hand, that is, the outline (10), is detected by taking the first derivative of the per-pixel depth values. FIG. 2 is an example showing the hand outline (10) detected by the image segmentation unit of the present invention. As FIG. 2 shows, by following step 2 described above the image segmentation unit can extract only the hand region (10), excluding the background.
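
The depth-derivative idea in step 2 can be illustrated with a minimal sketch; it is not the patent's implementation. A pixel is marked as outline when the forward difference of depth to its right or lower neighbor exceeds a threshold. The function name `depth_outline`, the list-of-lists depth format, and the threshold value are assumptions made for illustration.

```python
def depth_outline(depth, threshold):
    """Mark pixels where the first derivative of depth is large.

    depth: 2-D list of depth values (e.g. millimetres), rows of equal length.
    threshold: minimum absolute depth jump treated as a boundary (assumed value).
    Returns a 2-D list of 0/1 flags with the same shape as depth.
    """
    rows, cols = len(depth), len(depth[0])
    edges = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            # Forward differences approximate the first derivative in x and y.
            dx = depth[y][x + 1] - depth[y][x] if x + 1 < cols else 0
            dy = depth[y + 1][x] - depth[y][x] if y + 1 < rows else 0
            if max(abs(dx), abs(dy)) > threshold:
                edges[y][x] = 1
    return edges


# Toy frame: a "hand" at 600 mm in front of a background at 2000 mm.
frame = [
    [2000, 2000, 2000, 2000],
    [2000, 600, 600, 2000],
    [2000, 600, 600, 2000],
    [2000, 2000, 2000, 2000],
]
outline = depth_outline(frame, threshold=500)
```

Because forward differences are used, the flagged pixels sit on the upper and left side of each depth jump; a symmetric difference would center them at the cost of a thicker outline.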

<Step 2-1 (S25)>

Step 2-1 of the present invention distinguishes the right hand from the left hand using the outline information obtained in step 2 and the Otsu algorithm.

FIG. 3 is an example showing a hand segmented into right and left hands according to the present invention. Referring to FIG. 3, a histogram is built from the hand outline information obtained in step 2, and an inflection point is found with the Otsu algorithm; the hand can then be divided about the inflection point into a left hand (10a) and a right hand (10b), as in FIG. 3.

Here, the Otsu algorithm is a thresholding algorithm that, when the frequency distribution of the outline information histogram is bimodal, finds the threshold that separates the two peaks. Because it is commonly used in image processing, a detailed description is omitted.
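
For readers unfamiliar with it, Otsu's method can be sketched as follows. The sketch picks the histogram bin that maximizes the between-class variance. What the histogram is built over is not specified in the text, so the example histogram (outline pixels counted per image column) and the name `otsu_threshold` are assumptions.

```python
def otsu_threshold(hist):
    """Return the bin index t that best splits a bimodal histogram.

    Bins 0..t form one class and bins t+1.. the other; t maximises the
    between-class variance, which is the criterion Otsu's method uses.
    """
    total = sum(hist)
    weighted_total = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of samples in the lower class
    sum0 = 0    # sum of (bin index * count) in the lower class
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so no valid split here
        mean0 = sum0 / w0
        mean1 = (weighted_total - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical histogram of outline pixels per image column: two peaks,
# one for each hand, separated by an empty gap.
columns = [0, 3, 9, 4, 0, 0, 0, 5, 10, 4, 0]
split = otsu_threshold(columns)
```

In this toy case the returned index falls in the gap between the two peaks, so columns up to `split` would be assigned to one hand and the remaining columns to the other.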

<Step 3 (S30)>

Step 3 of the present invention calculates the rotation angle information of the hand, as part of the hand information, by taking two arbitrary points.

FIG. 4 is an example showing line fitting according to the present invention, and FIG. 5 is an example showing the hand rotation angle calculated according to the present invention.

Referring to FIGS. 4 and 5, to obtain the hand rotation angle, step 3 performs line fitting (20) by running the RANSAC (RANdom SAmple Consensus) algorithm starting from two arbitrary points in the current frame of the input image.

The RANSAC algorithm iteratively estimates the parameters of a mathematical model from a set of data, and is used here to estimate straight lines in the image. It repeats, N times, a hypothesis step, which draws random sample data from the input image and estimates the model parameters, and a verification step, which checks how well the original data fit the estimated model and, if the fit is good, computes a new model from the set of valid data.
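
The hypothesis and verification loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `ransac_line`, the tolerance, and the fixed iteration count are hypothetical, and the final refit of the model from the inliers mentioned in the text is omitted to keep the sketch short.

```python
import math
import random


def ransac_line(points, iterations=100, tolerance=1.0, seed=0):
    """Fit a line to 2-D points with a basic RANSAC loop.

    Returns ((a, b, c), inliers) for the line a*x + b*y + c = 0 with
    a*a + b*b == 1, so |a*x + b*y + c| is the point-to-line distance.
    """
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(iterations):
        # Hypothesis: a line through two randomly chosen points.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = math.hypot(a, b)
        if norm == 0:
            continue  # the two samples coincide; no line is defined
        a, b = a / norm, b / norm
        c = -(a * x1 + b * y1)
        # Verification: collect the points lying close to the hypothesis.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) <= tolerance]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = (a, b, c), inliers
    return best_line, best_inliers


# Ten points on y = 2x plus three outliers.
pts = [(x, 2 * x) for x in range(10)] + [(0, 9), (5, -4), (9, 2)]
line, inliers = ransac_line(pts, iterations=200, tolerance=0.5)
```

The fixed seed only makes the sketch reproducible; a real-time implementation would more likely bound the iteration count from the expected inlier ratio.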

Here, when n random samples are drawn, let 'Pg' be the probability that a data point fits the line and 'Pfail' be the probability that all L model estimation attempts fail; the maximum number of iterations, which affects performance, is then given by Equation 1 below.
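
Equation 1 itself is not reproduced in this text. With the symbols defined above, the standard RANSAC relation between these quantities is shown below; this is an assumption about what Equation 1 contained, not a quotation from the patent.

```latex
P_{fail} = \left(1 - P_g^{\,n}\right)^{L}
\quad\Longrightarrow\quad
L = \frac{\log P_{fail}}{\log\left(1 - P_g^{\,n}\right)}
```

For example, with n = 2, P_g = 0.5 and P_fail = 0.01, L = log(0.01) / log(0.75) ≈ 16.01, so 17 iterations would be enough under these assumed values.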

By going through step 3 described above, line fitting can be performed on various hand pose images as in FIG. 4, and the rotation angle (θ1) of the hand can be calculated as in FIG. 5 from the plurality of straight lines (20) fitted by the RANSAC algorithm (that is, from the crossing angle between the lines).
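
A crossing angle between two fitted lines can be computed as sketched below. The text does not say which lines are compared, so the example pairing (a fitted hand-axis line against the vertical image axis) and the name `crossing_angle_deg` are assumptions.

```python
import math


def crossing_angle_deg(dir1, dir2):
    """Angle in degrees (0..90) at which two lines cross.

    dir1, dir2: (dx, dy) direction vectors of the two lines. A line has no
    orientation, so the result is folded into the range 0..90.
    """
    a1 = math.atan2(dir1[1], dir1[0])
    a2 = math.atan2(dir2[1], dir2[0])
    diff = abs(a1 - a2) % math.pi        # lines repeat every 180 degrees
    if diff > math.pi / 2:
        diff = math.pi - diff
    return math.degrees(diff)


# Hypothetical example: a fitted hand-axis line against the vertical image axis.
theta1 = crossing_angle_deg((1.0, 1.0), (0.0, 1.0))
```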

<Step 4 (S40)>

Step 4 of the present invention extracts the center point information of the hand, as part of the hand information, using a distance map (distance transform).

FIG. 6 is an example showing the hand center point extracted according to the present invention. Referring to FIG. 6, step 4 first performs an eight-direction search from each pixel inside the hand region and stores the shortest distance value at that pixel position, producing a distance map. The pixel with the largest value stored in the distance map is then extracted as the center point (30) of the hand.
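
The eight-direction distance map can be sketched as follows. The sketch assumes that the stored value is the distance from a hand pixel to the nearest non-hand pixel (or image border) along the eight directions; the binary mask format and the name `hand_center` are also assumptions.

```python
import math

# The eight search directions: horizontal, vertical and diagonal.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def hand_center(mask):
    """Return ((row, col), distance_map) for a binary hand mask.

    mask: 2-D list with 1 for hand pixels and 0 for background. For each hand
    pixel the sketch walks outward in eight directions until it leaves the
    hand and stores the shortest walk length. The pixel holding the largest
    stored value is taken as the centre.
    """
    rows, cols = len(mask), len(mask[0])
    dist = [[0.0] * cols for _ in range(rows)]
    best, center = -1.0, None
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x]:
                continue
            shortest = math.inf
            for dy, dx in DIRECTIONS:
                steps, yy, xx = 0, y, x
                # Walk until the ray hits background or the image border.
                while 0 <= yy < rows and 0 <= xx < cols and mask[yy][xx]:
                    yy, xx, steps = yy + dy, xx + dx, steps + 1
                shortest = min(shortest, steps * math.hypot(dy, dx))
            dist[y][x] = shortest
            if shortest > best:
                best, center = shortest, (y, x)
    return center, dist


# Toy mask: a 5x5 "palm" with a one-pixel-wide "finger" on top.
mask = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 0],
]
center, dist_map = hand_center(mask)
```

Thin structures such as fingers receive small values, so the maximum lands inside the palm, which is why this measure is a reasonable choice of hand center.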

<Step 5 (S50)>

Step 5 of the present invention extracts a plurality of point components corresponding to specific points on the fingers of the hand using expanding circles, and is performed by the hand information extraction unit.

FIG. 7 is a block flowchart showing the detailed flow of the point component extraction method according to the present invention, and FIGS. 8(a) to 8(e) are examples showing the processing result of each step of FIG. 7.

Referring to FIGS. 7 and 8, the point component extraction method of the present invention comprises a step of expanding circles from the hand center point (30) (step 5a; S51), a step of searching for hand boundary intersections and detecting midpoints (step 5b; S52), a step of removing midpoints corresponding to the wrist (step 5c; S53), a step of extracting midpoints corresponding to the fingers (step 5d; S54), and a step of extracting finger joint points and end points (step 5e; S55).

Step 5a (S51) is a preparatory step toward the eventual extraction of finger joint point and end point information; specifically, circles (40) of progressively larger diameter are expanded about the hand center point (30) extracted in step 4. The expanded circles (40) form a set of concentric circles about the hand center point (30), as in FIG. 8(a), and the circles (40) are preferably spaced at a constant interval from one another.

Step 5b (S52) draws the concentric circles (40) as in step 5a while finding the intersections between white pixels, which correspond to the hand, and black pixels, which correspond to the region outside the hand. The coordinates of these intersections are treated as finger boundary candidates and connected to one another, and the midpoints (50) of the connecting lines are then obtained.

Here, intersections are connected in pairs; the two intersections of a pair lie on the same circle among the concentric circles, and the two points nearest to each other may be connected.
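
Steps 5a and 5b can be sketched for one circle as follows. The sketch samples the circle at fixed angular steps, treats black/white transitions as intersections, pairs each entry into the hand with the next exit, and returns the midpoint of each pair. The sampling density, the (row, col) coordinate convention, and the name `circle_midpoints` are assumptions.

```python
import math


def circle_midpoints(mask, center, radius, samples=360):
    """Midpoints of the hand segments cut by one circle around `center`.

    mask: 2-D list, 1 = hand (white), 0 = background (black).
    center: (row, col); radius in pixels. Returns a list of (row, col) floats.
    """
    rows, cols = len(mask), len(mask[0])
    cy, cx = center

    def sample(k):
        # Position and mask value of the k-th sample on the circle.
        ang = 2 * math.pi * k / samples
        y, x = cy + radius * math.sin(ang), cx + radius * math.cos(ang)
        r, c = int(round(y)), int(round(x))
        inside = 0 <= r < rows and 0 <= c < cols and mask[r][c] == 1
        return inside, (y, x)

    values = [sample(k) for k in range(samples)]
    # Intersections: where consecutive samples switch between black and white.
    enters, leaves = [], []
    for k in range(samples):
        prev_in, prev_pt = values[k - 1]
        cur_in, cur_pt = values[k]
        if cur_in and not prev_in:
            enters.append((k, cur_pt))
        elif prev_in and not cur_in:
            leaves.append((k, prev_pt))
    midpoints = []
    for k_in, p_in in enters:
        # Pair each entry with the next exit along the circle (wrapping around).
        later = [pt for k, pt in leaves if k > k_in]
        p_out = later[0] if later else leaves[0][1]
        midpoints.append(((p_in[0] + p_out[0]) / 2, (p_in[1] + p_out[1]) / 2))
    return midpoints


# Toy mask: a vertical 3-pixel-wide "finger" above the centre (15, 10).
finger = [[1 if 9 <= c <= 11 and r <= 15 else 0 for c in range(21)]
          for r in range(21)]
mids = [m for radius in (6, 8, 10)
        for m in circle_midpoints(finger, (15, 10), radius)]
```

With constant radius steps, as the text recommends, the midpoints of one finger form a roughly evenly spaced chain running from the palm to the fingertip.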

FIG. 8(b) shows an example of the midpoints (50) of the intersection connecting lines detected in step 5b. As FIG. 8(b) shows, the midpoints (50) obtained in step 5b can occur in the wrist or palm region as well as on the fingers. Therefore, before the point components (60) corresponding to the fingers can be extracted, the midpoints in the wrist or palm region that may arise in step 5b must be removed.

Step 5c (S53) removes the midpoints (53, 52) in the wrist or palm region described above. According to a preferred embodiment, step 5c calculates the angle between each midpoint (50) and the hand center point (30) relative to the hand rotation angle (θ1) calculated in step 3, and removes the midpoints (53) belonging to the wrist. FIG. 8(c) shows the result after the wrist-region midpoints (53) have been removed in this way.
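
The angle test of step 5c might look like the sketch below. The text does not give the angular criterion, so the cut-off `max_offset_deg`, the convention that the hand angle points from the center toward the fingers, and the name `remove_wrist_points` are all assumptions.

```python
import math


def remove_wrist_points(midpoints, center, hand_angle_deg, max_offset_deg=100.0):
    """Keep midpoints that lie on the finger side of the hand centre.

    midpoints, center: (x, y) tuples with y pointing up (assumed convention).
    hand_angle_deg: direction from the centre toward the fingers, derived
        from the hand rotation angle.
    max_offset_deg: assumed cut-off; points further than this from the
        finger direction are treated as wrist points and dropped.
    """
    kept = []
    for x, y in midpoints:
        ang = math.degrees(math.atan2(y - center[1], x - center[0]))
        # Signed difference folded into -180..180 degrees.
        offset = (ang - hand_angle_deg + 180.0) % 360.0 - 180.0
        if abs(offset) <= max_offset_deg:
            kept.append((x, y))
    return kept


# Fingers point up (90 degrees); the last point sits below the centre (wrist).
points = [(0, 5), (-3, 4), (4, 1), (0, -5)]
fingers_only = remove_wrist_points(points, center=(0, 0), hand_angle_deg=90.0)
```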

Step 5d (S54) removes the non-finger regions (for example, the palm region) that were not removed in step 5c, by clustering to classify the midpoints (51) corresponding to the finger regions. According to a preferred embodiment, with the wrist-region midpoints already removed in step 5c, step 5d clusters together those of the remaining midpoints (51, 52) whose angle to the hand center point falls within each finger's threshold angle range. Clusters of relatively short length are judged not to be fingers and are removed.

After steps 5c and 5d described above, the midpoints (53, 52) belonging to the wrist or palm region have been removed and only the midpoints (51) belonging to the finger regions remain, as in FIG. 8(d).
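
The clustering of step 5d can be sketched as grouping by angle range and discarding short groups. The per-finger angle ranges, the length criterion (radial extent of a group), and the name `cluster_by_angle` are assumptions; the patent gives no concrete values.

```python
import math


def cluster_by_angle(midpoints, center, finger_ranges, min_length):
    """Group midpoints by finger angle range and drop short groups.

    midpoints, center: (x, y) tuples with y pointing up (assumed convention).
    finger_ranges: {name: (low_deg, high_deg)}, assumed per-finger angle ranges.
    min_length: groups spanning less than this radial distance are discarded.
    """
    def radius(p):
        return math.hypot(p[0] - center[0], p[1] - center[1])

    groups = {name: [] for name in finger_ranges}
    for x, y in midpoints:
        ang = math.degrees(math.atan2(y - center[1], x - center[0]))
        for name, (low, high) in finger_ranges.items():
            if low <= ang < high:
                groups[name].append((x, y))
                break
    fingers = {}
    for name, pts in groups.items():
        if not pts:
            continue
        radii = [radius(p) for p in pts]
        # A real finger is crossed by many circles, so its group is long.
        if max(radii) - min(radii) >= min_length:
            fingers[name] = sorted(pts, key=radius)
    return fingers


# Hypothetical ranges for two fingers; the values are illustrative only.
ranges = {"index": (80.0, 100.0), "middle": (100.0, 120.0)}
pts = [(0, 4), (0, 6), (0, 8), (-2, 4.2)]
fingers = cluster_by_angle(pts, (0, 0), ranges, min_length=3.0)
```

In the example, the three aligned points survive as one finger, while the single stray point, which stands in for a palm midpoint, forms a group of zero length and is dropped.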

In this state, the midpoint at the tip of each finger is classified as an end point (61), and the remaining midpoints are used to obtain the joint points (62) corresponding to the joints (step 5e; S55). A typical user has three joints per finger, and the end point and the joints are spaced at roughly equal ratios; this fact is used to extract each joint point (62). FIG. 8(e) shows one end point (61) and three joint points (62) finally extracted by the method described above; the end points and joint points extracted in this way are included in the plurality of point components (60) described above.
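
One way to read "roughly equal spacing" is sketched below: the farthest midpoint of a finger is the end point, and three joint points are placed at fractions 0, 1/3, and 2/3 of the path length from the base of the finger. That placement and the name `finger_points` are assumptions, not the patent's stated rule.

```python
import math


def finger_points(midpoints, joints=3):
    """Split one finger into an end point and equally spaced joint points.

    midpoints: (x, y) midpoints of one finger ordered from base to tip.
    Returns (end_point, joint_points); joint k sits at fraction k / joints of
    the path length from the base, an assumed reading of "equal spacing".
    """
    # Cumulative path length along the midpoints.
    lengths = [0.0]
    for p, q in zip(midpoints, midpoints[1:]):
        lengths.append(lengths[-1] + math.dist(p, q))
    total = lengths[-1]

    def point_at(target):
        # Linear interpolation on the segment that contains `target`.
        for i in range(1, len(midpoints)):
            if target <= lengths[i]:
                seg = lengths[i] - lengths[i - 1]
                t = 0.0 if seg == 0 else (target - lengths[i - 1]) / seg
                (x0, y0), (x1, y1) = midpoints[i - 1], midpoints[i]
                return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        return midpoints[-1]

    joint_points = [point_at(total * k / joints) for k in range(joints)]
    return midpoints[-1], joint_points


# A straight finger sampled by four circles, from base (y=3) to tip (y=12).
end_point, joint_pts = finger_points([(0, 3), (0, 5), (0, 8), (0, 12)])
```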

<단계 6(S60)>&Lt; Step 6 (S60)

본 발명의 단계 6은 현재 프레임에서 획득된 손 정보를 이용하여, 이전 프레임에서 획득된 손 정보와 매칭을 수행함으로써 상기 현재 프레임의 손 포즈를 인식하는 단계이다.Step 6 of the present invention recognizes the hand pose of the current frame by performing matching with the hand information obtained in the previous frame using the hand information obtained in the current frame.

여기서, 상기 현재 프레임에서 획득된 손 정보란, 단계 2 내지 단계 5에서 획득된 손의 외곽선(10) 정보, 손의 회전각(θ1) 정보 및 복수의 점 성분(60) 정보를 포함하고, 상기 복수의 점 성분(60) 정보는 단계 5에서 추출된 관절점(62) 및 끝점(61) 정보를 포함한다.Here, the hand information obtained in the current frame includes information on the outline 10 of the hand, the rotation angle? 1 of the hand, and information on a plurality of point components 60 obtained in steps 2 to 5, The plurality of point component (60) information includes joint point (62) and end point (61) information extracted in step (5).

FIG. 9 is a block flow diagram showing the detailed flow of the hand pose recognition method according to the present invention. Referring to FIG. 9, step 6 comprises a hand model normalization step (S15 in FIG. 1), a hand model rotation step (step 6a; S61), a hand model size conversion step (step 6b; S62), a step of matching the hand model with the current-frame hand information (step 6c; S63), a hand pose recognition step (step 6d; S64), and a hand model update step (step 6e; S65).

In the contactless input interface method with real-time hand pose recognition of the present invention, the hand shown in the first frame of the input image is used as the first hand model information (hereinafter, the 'initial hand model') for recognizing the hand poses shown in subsequent frames.

The term 'hand model' as used in the present invention means the matching target (i.e., the matching frame) used to recognize the hand pose of the current frame. Preferably, the frame immediately preceding the current frame is used.

Accordingly, preprocessing for the subsequent frames is performed on the first frame. Specifically, with all five fingers held straight, the user follows steps 2 to 5 described above, which generates the hand outline (10) information, the hand rotation angle (θ1) information, and the plurality of point component (60) information. The hand information for this first frame is used as the initial hand model (hand model normalization step, S15).

In other words, the hand model normalization step runs steps 2 to 5 described above before the steps for hand pose recognition (or virtual keyboard input) on the frames after the first frame, namely steps 2 to 6 (or steps 2 to 8), are performed.

Once the initial hand model information has been generated through this process, from the second frame onward the hand pose is recognized by matching the hand model created in the previous frame against the hand information obtained in the current frame (i.e., joint point information, end point information, hand center point information, the distance between the hand center points, and hand rotation angle information).

Thus the previous frame (i.e., the hand model) for the second frame is the first frame, the previous frame for the third frame is the second frame, and the previous frame for the N-th frame is the (N-1)-th frame.

Meanwhile, the size and rotation angle of the hand segmented from the input image can vary from frame to frame, which must be accounted for before matching against the hand model. To address this, the present invention normalizes the hand model, rotates it to match the rotation angle of the hand in the current frame, and then rescales it to the hand size in the current frame (hand model rotation step S61 and hand model size conversion step S62).
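
Steps S61 and S62 amount to a similarity transform of the hand model points. The following is a minimal sketch under stated assumptions: the hand 'size' is some scalar measured the same way in both frames (for example, a palm radius), angles are in radians, and the rotation sign convention is chosen arbitrarily. The function `align_hand_model` and all of its parameters are hypothetical names.

```python
import math

def align_hand_model(model_points, model_center, model_angle, model_size,
                     cur_center, cur_angle, cur_size):
    """Rotate, scale, and re-center model points onto the current hand.

    Assumptions (not from the source): sizes are positive scalars
    measured consistently in both frames; angles are in radians.
    """
    d_theta = cur_angle - model_angle      # S61: rotation to apply
    scale = cur_size / model_size          # S62: size conversion factor
    c, s = math.cos(d_theta), math.sin(d_theta)
    mx, my = model_center
    cx, cy = cur_center

    aligned = []
    for x, y in model_points:
        dx, dy = x - mx, y - my            # offset from the model center
        rx = c * dx - s * dy               # rotate about the center
        ry = s * dx + c * dy
        aligned.append((cx + scale * rx, cy + scale * ry))
    return aligned
```

After this transform, the model points and the current-frame points share a center, orientation, and scale, so they can be compared directly.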

Matching is then performed by comparing, between the transformed hand model of the previous frame and the current frame, the angles that each finger's joint points and end point make with the hand center point (step of matching the hand model with the current-frame hand information, S63).
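
The angle-based matching of S63 might be sketched as follows. The text does not give the matching rule, so this sketch assumes a greedy assignment by smallest angular difference of the fingertip end points about the hand center, with a tolerance `max_diff` that is purely a hypothetical parameter. The model is assumed to have already been rotated and rescaled to the current frame.

```python
import math

def angle_about(center, point):
    """Angle of `point` as seen from `center`, in radians."""
    return math.atan2(point[1] - center[1], point[0] - center[0])

def angle_diff(a, b):
    """Absolute difference between two angles, wrapped to [0, pi]."""
    return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

def match_fingers(model_tips, cur_tips, center, max_diff=math.radians(15)):
    """Label current fingertips with the finger names of the model.

    model_tips: dict finger name -> (x, y) in the aligned hand model.
    cur_tips:   list of (x, y) end points found in the current frame.
    Returns a dict finger name -> index into cur_tips.
    Greedy nearest-angle assignment is an assumption, not the source's rule.
    """
    candidates = []
    for name, m_pt in model_tips.items():
        for idx, c_pt in enumerate(cur_tips):
            d = angle_diff(angle_about(center, m_pt), angle_about(center, c_pt))
            if d <= max_diff:
                candidates.append((d, name, idx))
    candidates.sort()                      # smallest angular difference first

    labels, used = {}, set()
    for _, name, idx in candidates:
        if name not in labels and idx not in used:
            labels[name] = idx
            used.add(idx)
    return labels
```

Fingers of the model that receive no label can be treated as hidden in the current frame, which is how a reappearing finger could be told apart from its neighbors.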

When the hand model matching and the resulting hand pose recognition are complete (hand pose recognition step S64), the hand model that was used is updated. That is, the hand information obtained in the current frame becomes the new hand model information for hand pose recognition (i.e., matching against the hand model) in the next frame (hand model update step S65).
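
The frame-to-frame flow, in which the first frame supplies the initial hand model and each later frame is matched against its predecessor and then replaces it, can be summarized in a short sketch. The callables `extract_hand_info` (standing in for steps 2 to 5) and `match_pose` (standing in for steps 6a to 6d) are hypothetical placeholders, not functions defined by the source.

```python
def recognize_sequence(frames, extract_hand_info, match_pose):
    """Recognize a pose for every frame after the first.

    frames:            iterable of input frames.
    extract_hand_info: callable(frame) -> hand information (steps 2 to 5).
    match_pose:        callable(hand_model, hand_info) -> pose (steps 6a to 6d).
    """
    frames = iter(frames)
    try:
        first = next(frames)
    except StopIteration:
        return []

    # The first frame only builds the initial hand model (S15).
    hand_model = extract_hand_info(first)

    poses = []
    for frame in frames:
        hand_info = extract_hand_info(frame)
        poses.append(match_pose(hand_model, hand_info))
        hand_model = hand_info             # S65: update the hand model
    return poses
```

Because the model is always the immediately preceding frame, no separate training data is kept between frames.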

FIG. 10 shows an example of recognizing a hand pose through matching with the hand model of the present invention. As FIG. 10 shows, when a finger is hidden and then reappears, it is difficult to tell which finger it is unless it is the thumb or the little finger. With the method of the present invention, the finger can be clearly identified, enabling recognition of a wide variety of hand poses.

<Step 7 (S70)>

Step 7 of the present invention maps a virtual keyboard based on the hand information obtained in steps 2 to 5 described above.

FIG. 11 shows an example in which the virtual keyboard of the present invention is mapped onto the user's hand. Referring to FIG. 11, to improve touch accuracy and immersion, the present invention locates the user's palm from the finger point component (60) information (in particular, the joint points, 62) obtained with the expanding circle (40) described above, and computes the rotation angle of the palm in real time from the hand rotation angle (θ1) obtained by the method described above, so that the virtual keyboard (70) is mapped accurately onto the hand (1) as shown in FIG. 11.
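
As an illustration of the mapping, the sketch below places a 3 x 3 numeric key grid around a palm position and rotates it by the hand rotation angle. The grid layout, the key pitch, and the function names `layout_keys` and `key_at` are assumptions made for this example; the text does not describe the keyboard geometry.

```python
import math

def layout_keys(palm_center, angle, pitch, labels=("123", "456", "789")):
    """Return a dict key label -> (x, y) key center in image coordinates.

    palm_center: (x, y) of the palm, assumed already located.
    angle:       hand rotation angle in radians (theta1 in the text).
    pitch:       distance between neighboring key centers (hypothetical).
    """
    cx, cy = palm_center
    c, s = math.cos(angle), math.sin(angle)
    rows = len(labels)
    keys = {}
    for r, row in enumerate(labels):
        for col, label in enumerate(row):
            # Offsets of this key from the grid center, before rotation.
            dx = (col - (len(row) - 1) / 2) * pitch
            dy = (r - (rows - 1) / 2) * pitch
            keys[label] = (cx + c * dx - s * dy, cy + s * dx + c * dy)
    return keys

def key_at(keys, point, pitch):
    """Return the label of the key under `point`, or None if none is close."""
    def dist(item):
        kx, ky = item[1]
        return math.hypot(kx - point[0], ky - point[1])
    label, center = min(keys.items(), key=dist)
    return label if dist((label, center)) <= pitch / 2 else None
```

Recomputing the layout every frame from the current palm position and angle keeps the keys attached to the hand as it moves and rotates.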

<Step 8 (S80)>

Step 8 of the present invention detects a click when a finger touches or approaches (hereinafter collectively, a 'click') the virtual keyboard (70) mapped onto the user's hand (1) in step 7.

According to a preferred embodiment, step 8 uses Fast Click Detection (FCD) to detect when a fingertip of one of the user's hands (e.g., the right hand) presses the palm of the other hand (e.g., the left hand).

FIG. 12 illustrates user click detection using the FCD technique of the present invention. Referring to FIG. 12, the FCD method uses the fingertip end point 'p' (61) and the depth value 'Dp' of 'p' (61).

More specifically, for each of the 16 pixels (i1 to i16) around the fingertip end point, the depth value (Di) plus or minus a threshold 't' is compared with 'Dp', and the number of comparisons that evaluate to 'True' is denoted n. If 'n >= 12' holds, a click is detected; otherwise no click is detected. Expressed as a formula, this is Equation 2.
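
Equation 2 is not reproduced in this text, so the exact form of the comparison is not known. The sketch below is one plausible reading, offered only as an assumption: a ring pixel counts as 'True' when the fingertip depth Dp lies within Di - t and Di + t, meaning the surrounding surface is at about the same depth as the fingertip. The 16 ring offsets (a radius-3 circle, as used by the FAST corner detector) and the function name `is_click` are likewise assumptions.

```python
# 16 pixel offsets (dx, dy) on a radius-3 circle around the fingertip.
# Assumption: the source only says "16 pixels around the end point".
RING_OFFSETS = [
    (0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
    (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3),
]

def is_click(depth, p, t, min_count=12):
    """Return True if the fingertip at p = (x, y) is judged to be clicking.

    depth: 2D sequence of depth values indexed as depth[y][x].
    t:     depth threshold, in the same units as the depth values.
    Assumed test per ring pixel: Di - t <= Dp <= Di + t.
    """
    x, y = p
    height, width = len(depth), len(depth[0])
    dp = depth[y][x]
    n = 0
    for dx, dy in RING_OFFSETS:
        xx, yy = x + dx, y + dy
        if not (0 <= xx < width and 0 <= yy < height):
            continue                       # pixels outside the image count as False
        di = depth[yy][xx]
        if di - t <= dp <= di + t:
            n += 1
    return n >= min_count                  # click when n >= 12
```

Under this reading, a fingertip hovering well above the palm fails the test because most ring pixels show the much deeper palm surface, while a fingertip resting on the palm passes it.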

Meanwhile, when the present invention further includes steps 7 and 8 as described above, clicks on the virtual keyboard (70) can be detected on the basis of recognizing the user's hand pose in real time. In this case, the method may further include a step of calling up a virtual keyboard (70) of a given form for virtual keyboard input.

This virtual keyboard calling step may be configured to be performed before any one of the first to sixth steps, and preferably may be set to be performed after step 3.

The virtual keyboard (70) may also be configured to be called automatically after a designated step completes or before it starts, or to be called by a user command (e.g., a button operation).

Hereinafter, a specific embodiment of the contactless input interface method with real-time hand pose recognition according to the present invention is described.

Example 1 was carried out on an Intel Core i5 2500 processor with 4 GB of RAM in a VS 2005 environment, using OpenCV and OpenNI, with an MS Kinect as the camera. The input image size of the camera was 640 x 480.

To test the proposed method, two experiments were performed: controlling the commands of an image viewer program through hand poses, and key input through the virtual keyboard input interface. The input image was processed according to the steps described above.

First, to evaluate the accuracy and usability of hand pose recognition, a hand-based user interface for controlling the image viewer program was defined using hand gestures, each defined as a sequence of hand poses, as shown in FIG. 13.

FIG. 14 shows the result of implementing a multimedia viewer program and applying the hand-based user interface defined above. Using both hands, the user was able to freely control functions such as left and right mouse clicks, zooming in and out, and flipping through images.

In particular, because the proposed method can distinguish the individual fingers of each hand, poses with the same number of fingers can be judged to be different poses when different fingers are used, so a wider variety of hand gestures can serve as the user interface.

To demonstrate the effect of the present invention, it was compared with a conventional hand pose recognition method. The conventional method stores four features (Hu moment, angle count, skin color angle, non-skin color angle) for each hand pose image and then recognizes the hand pose by template matching against the input image.

Three subjects each performed the poses defined in FIG. 15 100 times in random order. The conventional method produced matching errors between similar poses, whereas the hand pose recognition method according to Example 1 showed high recognition accuracy because it updates the hand model with the hand feature information obtained in every frame and uses it for matching in the next frame.

As for recognition speed, the conventional method required about 15 minutes of preprocessing and averaged 15 fps, whereas the method according to the present invention required no prior training and, because it needs few feature points for hand pose recognition, averaged 32 fps. The processing time for hand pose recognition can thus be greatly reduced compared with the conventional method.

To test the accuracy and usability of the virtual keyboard input interface, a virtual keyboard was displayed on one hand and the numbers 1 through 9 were clicked in random order with the other hand. Three subjects each ran this test 100 times, and the success rates are shown in FIG. 16. As FIG. 16 shows, the virtual keyboard input method according to the present invention achieves a high input success rate of 80% or more.

The contactless input interface method with real-time hand pose recognition described and illustrated above can be recorded in the form of a program and realized through an electronic device. Here, 'electronic device' refers collectively to electrical and electronic information terminals that include a central processing unit and memory, such as ordinary personal computers, wearable computers, and smart terminals.

In this case, the recording medium of the present invention may be provided as a medium, readable by an electronic device, on which is recorded a program for causing the electronic device to realize: a function of receiving an image including a hand captured by a camera; a function of detecting the hand outline (10) information in the current frame by differentiating depth values; a function of calculating the hand rotation angle information (θ1) by taking any two points in the current frame; a function of extracting the hand center point (30) information in the current frame using a distance map of the hand; a function of extracting, using a circle (40) that expands from the center point (30) in the current frame, information on a plurality of point components (60) that correspond to the fingers of the hand and are spaced apart from one another; and a function of recognizing the hand pose of the current frame by matching hand information that includes at least the point component (60) information obtained in the current frame against the hand information obtained in the previous frame.

Preferably, the recording medium of the present invention may be a medium, readable by an electronic device, on which is recorded a program for further realizing a function of distinguishing whether the hand in the input image is a right hand or a left hand using the outline information and the Otsu algorithm.

According to an extended embodiment of the present invention, the plurality of point components include information on at least one joint point (62) corresponding to each joint of the fingers of the hand, and the recording medium may be a medium, readable by an electronic device, on which is recorded a program for further realizing a function of mapping a virtual keyboard onto the right hand (or left hand) using hand information that includes the joint point information.

In this case, the plurality of point components include information on an end point (61) corresponding to the tip of a finger of the hand, and the recording medium may be a medium, readable by an electronic device, on which is recorded a program for further realizing a function of detecting, using the end point (61) information, whether a click has occurred when the left hand (or right hand) clicks the virtual keyboard (70) mapped onto the right hand (or left hand).

Preferably, the recording medium of the present invention may be a medium, readable by an electronic device, on which is recorded a program for realizing a function of detecting whether a click has occurred by comparing the depth value of the fingertip end point (61) that clicked the virtual keyboard (70) with values obtained by adding a threshold to, or subtracting it from, the depth values of a plurality of pixels around that end point.

In addition, the recording medium of the present invention may be a medium, readable by an electronic device, on which is recorded a program for further realizing a function of calling up the virtual keyboard when the user wishes to perform keyboard input.

Although preferred embodiments of the present invention have been described and illustrated above using specific terms, those terms serve only to describe the invention clearly, and it is evident that various changes and modifications can be made to the embodiments and the terms used without departing from the technical spirit and scope of the following claims. Such modified embodiments should not be understood separately from the spirit and scope of the present invention, and should be regarded as falling within the claims of the present invention.

10: hand outline 20: approximated straight line
30: hand center point 40: expanding circle
51: intermediate point in the finger region 52: intermediate point in the palm region
53: intermediate point in the wrist region 61: fingertip end point
62: finger joint point 70: virtual keyboard

Claims

delete

A contactless input interface method with real-time hand pose recognition, comprising:
A first step of receiving, from a camera, an image of a hand including a left hand or a right hand;
A second step of detecting outline information of the hand in the current frame by differentiating depth values;
A step 2-1 of distinguishing whether the hand is a right hand or a left hand using the outline information and the Otsu algorithm;
A third step of calculating rotation angle information of the hand by taking any two points in the current frame;
A fourth step of extracting center point information of the hand in the current frame using a distance map of the hand;
A fifth step of extracting, using a circle that expands from the center point in the current frame, information on a plurality of point components that correspond to the fingers of the hand and are spaced apart from one another, such that the plurality of point components include at least one item of joint point information corresponding to a joint of a finger of the hand and end point information corresponding to the tip of a finger of the hand;
A sixth step of recognizing the hand pose of the current frame by matching hand information that includes at least the plurality of point component information obtained in the current frame against the hand information obtained in the previous frame;
A seventh step of mapping a virtual keyboard onto the right hand (or left hand) using the hand information including the joint point information; and
An eighth step of detecting, using the end point information, whether a click has occurred when the left hand (or right hand) clicks the virtual keyboard mapped onto the right hand (or left hand),
Wherein, in the eighth step, whether the click has occurred is detected by comparing the depth value of the fingertip end point that clicked the virtual keyboard with values obtained by adding a threshold to, or subtracting it from, the depth values of a plurality of pixels around that end point.

delete

The method of claim 5,
Further comprising a step of calling up a virtual keyboard before any one of the first to sixth steps.

The method of claim 5,
Wherein step 2-1 distinguishes the right hand from the left hand by building a histogram of the outline information and then finding an inflection point with the Otsu algorithm.

The method of claim 5,
Wherein the third step calculates the rotation angle information of the hand using line fitting through a RANSAC algorithm.

The method of claim 5,
Wherein the fourth step extracts the pixel with the largest value stored in the distance map as the center point of the hand.

The method of claim 5,
Wherein the fifth step comprises:
A step 5a of expanding the circle from the center point of the hand;
A step 5b of detecting intermediate points by searching for the points where the expanding circle crosses the hand boundary;
A step 5c of removing the intermediate points corresponding to the wrist of the hand;
A step 5d of extracting the intermediate points corresponding to the fingers of the hand; and
A step 5e of extracting the plurality of point component information from the intermediate points extracted in step 5d,
Wherein the plurality of point component information extracted in step 5e includes at least one item of joint point information corresponding to a joint of a finger of the hand and one item of end point information corresponding to the tip of the finger.

The method of claim 5,
Wherein the sixth step comprises:
A step 6a of rotating the hand of the previous frame to match the hand of the current frame;
A step 6b of converting the size of the hand of the previous frame to match the hand of the current frame;
A step 6c of performing matching by comparing the hand information of the previous-frame hand transformed through steps 6a and 6b with the hand information obtained in the current frame; and
A step 6d of recognizing the hand pose of the current frame through the matching of step 6c.

The method of claim 5,
Wherein the first frame of the input image in the first step is used as the initial hand information for hand pose recognition in the second frame, by performing the second to fifth steps with all five fingers of the hand spread.

The method of claim 5,
Wherein the hand information obtained in the current frame is used as the hand information obtained in the previous frame when the hand pose is recognized in the next frame.

delete

A medium, readable by an electronic device, on which is recorded a program for causing the electronic device to realize:
A function of receiving an image including a hand captured by a camera;
A function of detecting outline information of the hand in the current frame by differentiating depth values;
A function of distinguishing whether the hand is a right hand or a left hand using the outline information and the Otsu algorithm;
A function of calculating rotation angle information of the hand by taking any two points in the current frame;
A function of extracting center point information of the hand in the current frame using a distance map of the hand;
A function of extracting, using a circle that expands from the center point in the current frame, information on a plurality of point components that correspond to the fingers of the hand and are spaced apart from one another, such that the plurality of point components include at least one item of joint point information corresponding to a joint of a finger of the hand and at least one item of end point information corresponding to the tip of a finger of the hand;
A function of recognizing the hand pose of the current frame by matching hand information that includes at least the plurality of point component information obtained in the current frame against the hand information obtained in the previous frame;
A function of mapping a virtual keyboard onto the right hand (or left hand) using the hand information including the joint point information;
A function of detecting, using the joint point information or the end point information, whether a click has occurred when the left hand (or right hand) clicks the virtual keyboard mapped onto the right hand (or left hand); and
A function of detecting whether the click has occurred by comparing the depth value of the fingertip end point that clicked the virtual keyboard with values obtained by adding a threshold to, or subtracting it from, the depth values of a plurality of pixels around that end point.

The medium of claim 19,
On which is recorded a program for further realizing a function of calling up the virtual keyboard.