KR20130081126A

KR20130081126A - Method for hand-gesture recognition and apparatus thereof

Info

Publication number: KR20130081126A
Application number: KR1020120002119A
Authority: KR
Inventors: 이희경; 차지훈; 변혜란; 조선영
Original assignee: 한국전자통신연구원; 연세대학교 산학협력단
Priority date: 2012-01-06
Filing date: 2012-01-06
Publication date: 2013-07-16
Also published as: KR101868520B1

Abstract

PURPOSE: A hand gesture recognition method and an apparatus therefor are provided that are robust to changes in direction, which improves hand gesture recognition performance and allows hand gestures to be recognized in real time. CONSTITUTION: A hand gesture recognition apparatus acquires joint information of a subject and derives direction element information for each direction element from the acquired joint information (S110, S120). The apparatus generates a direction histogram for each element of a vector expressing the hand pose (S130). The apparatus combines a plurality of direction histograms having different quantization levels (S140). The apparatus recognizes the hand gesture using the generated combined direction histogram (S150). [Reference numerals] (AA) Start; (BB) End; (S110) Acquire joint information; (S120) Derive direction element information; (S130) Generate direction histogram; (S140) Generate combined direction histogram; (S150) Recognize hand gesture using a random decision forest classifier

Description

Hand gesture recognition method and device therefor {METHOD FOR HAND-GESTURE RECOGNITION AND APPARATUS THEREOF}

The present invention relates to image processing, and more particularly, to a method and apparatus for hand gesture recognition.

Vision-based hand gesture recognition is one of the essential technologies required in the field of human-computer interaction and is an area of active research. Methods used for hand gesture recognition include kinematic model-based methods, view-based methods, and low-level feature-based methods. A kinematic model-based method uses hand pose information, a view-based method models the hand using a sequence of multiple views, and a low-level feature-based method may use low-level image features of the hand region. Among these, the kinematic model-based method can provide high recognition performance when the hand pose is estimated accurately. However, these hand gesture recognition methods share the problem that hand pose estimation is difficult in environments with a complex background.

Meanwhile, with the development of depth sensing technology in recent years, real-time depth cameras have come into wide use. Low-cost depth cameras include, for example, the Kinect sensor released by Microsoft, which can provide RGB color images, depth images, and joint tracking information in real time. This progress in sensing technology makes it possible to obtain accurate hand pose estimates in real time. In addition, a direction histogram can be used to represent the pose of the hand. A direction histogram is a histogram of direction information obtained from the gradient of an image. Direction histograms are robust to illumination changes, simple, and allow direction information and/or features to be extracted quickly. However, a direction histogram is not robust to changes in direction, so when it is used, hand gestures with the varied direction changes that occur in real-life environments may not be recognized properly. Accordingly, to provide hand gesture recognition technology applicable to real-life environments, a hand pose feature and a gesture classification framework are needed that are simple and fast while still providing highly accurate recognition.

An object of the present invention is to provide a hand gesture recognition method capable of providing high recognition performance.

Another object of the present invention is to provide a hand gesture recognition apparatus capable of providing high recognition performance.

One embodiment of the present invention is a hand gesture recognition method. The method includes: acquiring joint information related to a hand gesture using a depth camera; deriving direction element information based on spherical coordinates using the acquired joint information; generating a plurality of direction histograms using the derived direction element information; combining the generated plurality of direction histograms to generate a combined direction histogram; and deriving a final recognition result for the hand gesture by applying a random decision forest classifier to the generated combined direction histogram, wherein the plurality of direction histograms have different quantization levels.

The hand gesture recognition method according to the present invention can provide high recognition performance.

The hand gesture recognition apparatus according to the present invention can provide high recognition performance.

FIG. 1 is a flowchart schematically illustrating a hand gesture recognition method according to an embodiment of the present invention.
FIG. 2 is a conceptual diagram schematically illustrating an embodiment of a method of generating a combined direction histogram.
FIG. 3 is a simulation result schematically showing the recognition accuracy of the random decision forest classifier as a function of the number of decision trees.
FIG. 4 is a block diagram schematically illustrating a hand gesture recognition apparatus according to an embodiment of the present invention.
FIG. 5 is a table showing an example of hand gesture recognition accuracy according to the direction histogram type and the number of decision trees.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings. In describing the embodiments, a detailed description of related known configurations or functions is omitted when it is judged that it could obscure the gist of this specification.

It should be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, describing the present invention as "including" a specific configuration does not exclude other configurations; it means that additional configurations may be included in the practice of the present invention or within the scope of its technical idea.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

In addition, the components shown in the embodiments of the present invention are illustrated independently to represent different characteristic functions; this does not mean that each component consists of separate hardware or of a single software unit. That is, the components are listed separately for convenience of description. At least two of the components may be combined into one component, or one component may be divided into a plurality of components that perform the function. Such integrated and separated embodiments are also included in the scope of the present invention as long as they do not depart from its essence.

In addition, some components may not be essential components that perform the essential functions of the present invention but merely optional components that improve performance. The present invention can be implemented with only the components essential for realizing its essence, excluding components used only to improve performance, and a structure that includes only the essential components without the optional performance-improving components is also included in the scope of the present invention.

FIG. 1 is a flowchart schematically illustrating a hand gesture recognition method according to an embodiment of the present invention.

Referring to FIG. 1, the hand gesture recognition apparatus may acquire joint information of a subject (S110). The joint information may be obtained from the joint tracking information provided by a depth camera. Examples of such depth cameras include a PrimeSense depth camera and the Microsoft Kinect. From the acquired joint information for the whole body, the hand gesture recognition apparatus may use the three-dimensional coordinate information of the shoulder, elbow, and hand joints, which are the joints related to the hand gesture.

The hand gesture recognition apparatus may derive direction element information for each direction element from the acquired joint information (S120). Here, the direction elements may be four direction elements based on spherical coordinates.

For example, assume that the three-dimensional Cartesian coordinates of the shoulder, the elbow, and the hand are given. To express the hand pose, the hand gesture recognition apparatus may calculate the direction of the elbow relative to the shoulder and the direction of the hand relative to the elbow. To this end, the apparatus may convert the three-dimensional Cartesian coordinates into spherical coordinates and calculate the elbow and hand directions. This can be expressed by Equation 1.

[Equation 1] (formula not reproduced in this text)

Equation 1 uses two vectors: the vector from the shoulder to the elbow and the vector from the elbow to the hand. One pair of spherical-coordinate angles represents the direction of the elbow relative to the shoulder, and a second pair represents the direction of the hand relative to the elbow. Thus, the hand pose and/or the direction element information may be represented as a vector composed of the four direction elements that indicate the elbow and hand directions.
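The conversion from joint coordinates to four direction elements can be illustrated with a minimal Python sketch. The formula of Equation 1 is not available in this text, so the angle convention below (polar angle measured from the +z axis, azimuth measured in the x-y plane, both in degrees) is an assumption, as are the function names `direction` and `hand_pose`.

```python
import math

def direction(frm, to):
    """Return (theta, phi) in degrees for the vector pointing from `frm` to `to`.

    Assumed convention (not taken from the patent): theta is the polar angle
    from the +z axis in [0, 180]; phi is the azimuth in the x-y plane in
    [-180, 180].
    """
    dx, dy, dz = (t - f for f, t in zip(frm, to))
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    if r == 0.0:
        raise ValueError("joints coincide; direction is undefined")
    theta = math.degrees(math.acos(dz / r))
    phi = math.degrees(math.atan2(dy, dx))
    return theta, phi

def hand_pose(shoulder, elbow, hand):
    """Four direction elements: elbow relative to shoulder, hand relative to elbow."""
    theta1, phi1 = direction(shoulder, elbow)
    theta2, phi2 = direction(elbow, hand)
    return (theta1, phi1, theta2, phi2)

# Illustrative joint positions: upper arm along +x, forearm along +y.
pose = hand_pose((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
```

Because only the directions between joints are used, the resulting four angles do not depend on where the subject stands in the camera frame or on arm length.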

Referring back to FIG. 1, the hand gesture recognition apparatus may generate a direction histogram for each element of the vector representing the hand pose (S130). A direction histogram is a histogram of direction information obtained from the gradient of an image.

The bins of the direction histograms for two of the four direction elements may be determined by evenly dividing one angular range, and the bins for the other two direction elements by evenly dividing a second angular range (the element symbols and the ranges appear only as formula images in the source). In this case, the hand gesture recognition apparatus may generate a plurality of direction histograms having different quantization levels. That is, the apparatus may determine the number of bins of each direction histogram based on various angular intervals.
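As a rough illustration of how the angular interval fixes the quantization level, the following Python sketch bins a list of angles into equal-width bins. The range [0, 180] degrees, the raw (unnormalized) counts, and the name `direction_histogram` are assumptions made for the example; the source does not state them.

```python
def direction_histogram(angles, lo, hi, bin_width):
    """Count `angles` (degrees) into equal-width bins covering [lo, hi]."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for a in angles:
        idx = int((a - lo) // bin_width)
        # Clamp so that a value equal to `hi` falls into the last bin.
        idx = min(max(idx, 0), n_bins - 1)
        counts[idx] += 1
    return counts

# The same observations quantized at two different levels.
angles = [5.0, 15.0, 25.0, 95.0, 180.0]
coarse = direction_histogram(angles, 0.0, 180.0, 30.0)  # 6 bins
fine = direction_histogram(angles, 0.0, 180.0, 10.0)    # 18 bins
```

With the coarse interval the three small angles share one bin, while the fine interval separates them. This is the trade-off between tolerance to direction change and discriminative detail that motivates using several quantization levels.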

To increase robustness to changes in direction, the hand gesture recognition apparatus may combine a plurality of direction histograms having different quantization levels to generate a combined direction histogram (S140). That is, the apparatus may combine a plurality of direction histograms having different numbers of bins. In this case, the hand pose may be represented by the combined direction histogram, which can be expressed by Equation 2.

[Equation 2] (formula not reproduced in this text)

In Equation 2, one symbol denotes the angular interval of the histogram bins. Each component of the combined direction histogram is a vector composed of the direction histograms for the four direction elements, and each direction histogram in that vector has bins of the corresponding angular interval. n denotes the number of direction histograms used to generate the combined direction histogram.

FIG. 2 is a conceptual diagram schematically illustrating an embodiment of a method of generating a combined direction histogram. In FIG. 2, the observation of the video sequence is denoted by x, where x is a vector of the frame observations within a window of size m. In the embodiment of FIG. 2, n direction histograms (h_1 to h_n) are combined to generate the combined direction histogram (h).
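The combination step can be sketched in Python as a concatenation of per-element histograms computed at several angular intervals over a window of frames. Concatenation as the combining operation, the angle ranges in `RANGES`, and the function names are assumptions for illustration; the 10-, 20-, and 30-degree intervals are the ones listed for FIG. 5.

```python
def histogram(angles, lo, hi, bin_width):
    """Count `angles` (degrees) into equal-width bins covering [lo, hi]."""
    n_bins = int(round((hi - lo) / bin_width))
    counts = [0] * n_bins
    for a in angles:
        idx = min(max(int((a - lo) // bin_width), 0), n_bins - 1)
        counts[idx] += 1
    return counts

# Assumed ranges for the four direction elements (polar, azimuth, polar, azimuth).
RANGES = [(0.0, 180.0), (-180.0, 180.0), (0.0, 180.0), (-180.0, 180.0)]

def combined_direction_histogram(window, bin_widths=(10.0, 20.0, 30.0)):
    """Build one feature vector from a window of m hand-pose 4-tuples.

    For each angular interval, a histogram is built for each of the four
    direction elements over all frames in the window; all histograms are
    then concatenated (an assumed combining rule).
    """
    combined = []
    for width in bin_widths:
        for k, (lo, hi) in enumerate(RANGES):
            angles = [pose[k] for pose in window]
            combined.extend(histogram(angles, lo, hi, width))
    return combined

# A window of m = 2 frames with illustrative pose values.
window = [(90.0, 0.0, 90.0, 90.0), (45.0, -180.0, 180.0, 179.0)]
h = combined_direction_histogram(window)
```

Under these assumed ranges the feature vector has 108 + 54 + 36 = 198 entries, which is small enough for a fast classifier.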

Referring back to FIG. 1, the hand gesture recognition apparatus may recognize the hand gesture using the combined direction histogram (S150). For example, the apparatus may obtain the final recognition result for the hand gesture by applying a random decision forest classifier to the combined direction histogram. That is, the combined direction histogram extracted from a test image may be input to the random decision forest classifier, and the hand gesture represented by the combined direction histogram may be recognized using the classifier.

The random decision forest classifier is an ensemble classifier that uses the random subspace method and may be composed of a plurality of decision trees. It can provide both fast execution and a relatively high recognition rate.

The hand gesture recognition apparatus may select a subset from the set of combined direction histograms extracted from the test image by random sampling with replacement. Using the selected subset, the apparatus may train a random decision forest classifier composed of T decision trees (T is a positive integer). The trained classifier can obtain the final recognition result for the hand gesture by computing the average of the probabilities over all decision trees. This can be expressed by Equation 3.

[Equation 3]

$$P(c \mid H) = \frac{1}{T} \sum_{t=1}^{T} P_t(c \mid H)$$

Here, P_t(c | H) denotes the probability of gesture class c in decision tree t, and P(c | H) denotes the final recognition result for the hand gesture.
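The averaging over decision trees can be sketched in a few lines of Python. The per-tree probabilities below are made-up numbers, the name `forest_predict` is hypothetical, and returning the class with the highest averaged probability is an assumption about how the averaged probabilities are turned into a single label.

```python
def forest_predict(tree_probs):
    """Average per-tree class probabilities and pick the most probable class.

    tree_probs[t][c] holds P_t(c | H) for decision tree t and gesture class c.
    Returns (best_class_index, averaged_probabilities).
    """
    n_trees = len(tree_probs)
    n_classes = len(tree_probs[0])
    avg = [sum(p[c] for p in tree_probs) / n_trees for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: avg[c])
    return best, avg

# Three trees (T = 3), three gesture classes, illustrative values only.
tree_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
best, avg = forest_predict(tree_probs)
```

In this example the first tree alone would choose class 0, but the averaged vote selects class 1, which shows how the ensemble can correct an individual tree.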

FIG. 3 is a simulation result schematically showing the recognition accuracy of the random decision forest classifier as a function of the number of decision trees. FIG. 3 shows the recognition accuracy on a hand gesture data set composed of 30 classes.

Referring to FIG. 3, a random decision forest classifier that uses 85 or more decision trees may provide 100% accuracy.

FIG. 4 is a block diagram schematically illustrating a hand gesture recognition apparatus according to an embodiment of the present invention. The hand gesture recognition apparatus 400 may include a joint information acquisition unit 410, a direction element information derivation unit 420, a combined direction histogram generation unit 430, and a recognition unit 440.

Referring to FIG. 4, the joint information acquisition unit 410 may acquire joint information of a subject. From the acquired joint information for the whole body, the hand gesture recognition apparatus 400 may use the three-dimensional information of the shoulder, elbow, and hand joints related to the hand gesture.

The direction element information derivation unit 420 may derive direction element information for each direction element from the acquired joint information. Here, the direction elements may be four direction elements based on spherical coordinates. A specific embodiment of the method of deriving the direction element information has been described above and is therefore omitted here.

The combined direction histogram generation unit 430 may generate a direction histogram for each direction element. In this case, the combined direction histogram generation unit 430 may generate a plurality of direction histograms having different quantization levels and may combine them to generate the combined direction histogram.

The recognition unit 440 may recognize the hand gesture using the combined direction histogram. For example, the recognition unit 440 may obtain the final recognition result for the hand gesture by applying the random decision forest classifier to the generated combined direction histogram. A specific embodiment of the method of obtaining the final recognition result has been described above and is therefore omitted here.

FIG. 5 is a table showing an example of hand gesture recognition accuracy according to the direction histogram type and the number of decision trees. In FIG. 5, the three single-histogram types each denote one direction histogram generated using bins at 10-degree, 20-degree, and 30-degree intervals, respectively.

Referring to FIG. 5, when the random decision forest classifier has a single decision tree, the highest hand gesture recognition accuracy may be obtained with the single direction histogram generated using bins at 10-degree intervals. However, when the classifier has two or more decision trees, the highest accuracy may be obtained with the combined direction histogram. This indicates that the combined direction histogram can increase hand gesture recognition accuracy.

The method, apparatus, features, and/or recognition framework described above are not limited to hand gesture recognition and may also be applied to full-body gesture recognition, human behavior analysis, and/or human behavior recognition.

The hand gesture recognition method and apparatus according to the present invention are robust to changes in direction, which can improve hand gesture recognition performance and enable real-time processing. That is, the method and apparatus can efficiently recognize hand gestures with the varied direction changes that occur in real life, and the use of low-dimensional image features and a fast classifier makes real-time hand gesture recognition possible.

In addition, the present invention is an essential technology for human-computer interaction applications and may be usefully applied in various fields that require control between a user and a TV, such as gesture-based TVs and video game consoles.

In the embodiments described above, the methods are described as a series of steps or blocks on the basis of flowcharts, but the present invention is not limited to the order of the steps, and some steps may occur in a different order from, or simultaneously with, other steps described above. Those of ordinary skill in the art will also understand that the steps shown in the flowcharts are not exclusive, that other steps may be included, and that one or more steps of a flowchart may be deleted without affecting the scope of the present invention.

The embodiments described above include examples of various aspects. Although not every possible combination for representing the various aspects can be described, those of ordinary skill in the art will recognize that other combinations are possible. Accordingly, the present invention is intended to include all other alternatives, modifications, and variations that fall within the scope of the following claims.

Claims

A hand gesture recognition method comprising:
acquiring joint information related to a hand gesture using a depth camera;
deriving direction element information based on spherical coordinates using the acquired joint information;
generating a plurality of direction histograms using the derived direction element information;
combining the generated plurality of direction histograms to generate a combined direction histogram; and
deriving a final recognition result for the hand gesture by applying a random decision forest classifier to the generated combined direction histogram,
wherein the plurality of direction histograms have different quantization levels.