KR20000039192A

KR20000039192A - Gesture recognizing system and method by analyzing trace on polar coordinates

Info

Publication number: KR20000039192A
Application number: KR1019980054447A
Authority: KR
Inventors: 민병우; 양영규; 소정; 윤호섭
Original assignee: 정선종; 한국전자통신연구원
Priority date: 1998-12-11
Filing date: 1998-12-11
Publication date: 2000-07-05

Abstract

PURPOSE: A gesture recognition system and method based on trajectory analysis on polar coordinates are provided to track the motion of a hand by using the natural color information of the hand and to recognize the obtained information as a dynamic pattern. CONSTITUTION: The gesture recognition system comprises a hand region extractor (11), a key-gesture spotter (12), and a gesture recognizer (13). The hand region extractor (11) separates the hand region from continuously input video images and extracts the dynamic gesture. The key-gesture spotter (12) separates the key-gesture, which is the meaningful gesture, from the dynamic gesture. The gesture recognizer (13) extracts the command meant by the key-gesture. The gesture recognition method comprises the steps of: inputting a hand-motion gesture image that represents a specific command; separating the hand region from the hand-motion gesture image and obtaining the trajectory of the gesture; separating the key-gesture, which is the meaningful gesture, from the trajectory of the gesture; and extracting the command meant by the key-gesture.

Description

System and method of recognition of dynamic gesture on the polar coordinate space

The present invention relates to a dynamic gesture recognition system and method based on trajectory analysis on polar coordinates. In particular, it relates to a system and method in which the trajectory of a dynamic gesture is projected onto polar coordinates, the dynamic characteristics of the gesture are extracted from that projection, and these characteristics are learned and recognized with a hidden Markov model, which is a doubly stochastic model.

Conventional gesture recognition systems rely on data gloves, which convert hand movements into electrical signals, or on colored gloves. They also treat the dynamic pattern of a gesture as static information without considering its temporal information, so they cannot exploit the dynamic characteristics of the gesture.

Accordingly, the present invention has been devised to solve the problems of the prior art described above. Its object is to provide a dynamic gesture recognition system and method by trajectory analysis on polar coordinates that tracks the movement of the hand using the natural color information of the hand in a natural environment and recognizes the obtained information in a manner suited to dynamic patterns.

Another object of the present invention is to provide a dynamic gesture recognition system and method by trajectory analysis on polar coordinates that lets a person interface with and control a computer or other devices using agreed-upon hand gestures, making interface technology more flexible and intelligent and thereby increasing the affinity between people and computers.

FIG. 1 illustrates a dynamic gesture recognition system by trajectory analysis on polar coordinates according to an embodiment of the present invention;

FIG. 2 illustrates a gesture trajectory before and after smoothing according to an embodiment of the present invention;

FIG. 3 illustrates a gesture trajectory in space-time and the separation of the key-gesture using the speed information of each phase;

FIG. 4 illustrates the trajectories of circle and square gestures projected onto polar coordinates;

FIG. 5 illustrates the left-right hidden Markov model used in a dynamic gesture recognition system according to an embodiment of the present invention;

FIG. 6 illustrates the user interface of an editing system used with a dynamic gesture recognition system according to an embodiment of the present invention.

※ Explanation of reference numerals for the main parts of the drawings ※

11: hand region extractor 12: key-gesture spotter

13: gesture recognizer

To achieve the above objects, a dynamic gesture recognition system by trajectory analysis on polar coordinates according to the present invention comprises a hand region extractor that separates the hand region from continuously input video images and extracts the dynamic gesture;

a key-gesture spotter that separates the key-gesture, which is the meaningful gesture, from the dynamic gesture; and

a gesture recognizer that extracts the command meant by the key-gesture.

A dynamic gesture recognition method by trajectory analysis on polar coordinates according to the present invention comprises a first step in which a hand-motion gesture image representing a specific command is input;

a second step of separating the hand region from the hand-motion gesture image and obtaining the trajectory of the gesture;

a third step of separating the key-gesture, which is the meaningful gesture, from the trajectory of the gesture; and

a fourth step of extracting the command meant by the key-gesture.

The present invention also provides a computer-readable recording medium on which is recorded a program that causes a computer to execute:

a first step of, when a hand-motion gesture image representing a specific command is input, separating the hand region from the hand-motion gesture image and obtaining the trajectory of the gesture;

a second step of dividing the trajectory of the gesture into a pre-gesture, a key-gesture, and a post-gesture and extracting the key-gesture, which is the meaningful hand movement;

a third step of projecting the key-gesture onto the (ρ, φ) polar parameter space and applying a clustering algorithm to quantify the features of the key-gesture;

a fourth step of partitioning the quantified dynamic features of the key-gesture and applying them to a hidden Markov model to build a training and recognition model, thereby extracting the command meant by the key-gesture; and

a fifth step of executing the command.

Hereinafter, embodiments of the present invention are described with reference to the accompanying drawings.

FIG. 1 illustrates a dynamic gesture recognition system by trajectory analysis on polar coordinates according to an embodiment of the present invention. FIG. 2 illustrates a gesture trajectory before and after smoothing according to an embodiment of the present invention. FIG. 3 illustrates a gesture trajectory in space-time and the separation of the key-gesture using the speed information of each phase. FIG. 4 illustrates the trajectories of circle and square gestures projected onto polar coordinates. FIG. 5 illustrates the left-right hidden Markov model used in the dynamic gesture recognition system according to an embodiment of the present invention. FIG. 6 illustrates the user interface of an editing system used with the dynamic gesture recognition system according to an embodiment of the present invention.

As shown in FIG. 1, the dynamic gesture recognition system by trajectory analysis on polar coordinates according to the present invention consists of a hand region extractor (11), a key-gesture spotter (12), and a gesture recognizer (13).

The hand region extractor (11) separates the hand region from continuously input video images. Tracking objects in video has been actively studied in the field of image processing, but no single method applies to every case, and it is usual to choose an algorithm suited to the working environment.

A moving object can be tracked in three ways: from the difference image between successive video frames; from a priori knowledge of the object's attributes, such as color, position, and size, when such knowledge is available; or from both at once.

Specifically, the hand region extractor (11) applies prior knowledge of the hand's color, size, and movement to extract the hand region from successive frames and obtain the gesture trajectory, and the key-gesture spotter (12) separates the key-gesture using state-transition information.

Extracting the hand region requires a model of the hand's color. The RGB gesture image input to the hand region extractor (11) is converted into YIQ image information, and the I-Q components, which carry the color information, are extracted to build the model, that is, a look-up table. To extract the hand region from an input gesture image, regions whose color is similar to the hand model must be found. Processing the gesture image pixel by pixel is costly, so the image is divided into cells of a fixed size, each 4 x 4 pixels. For each cell, the color information of every pixel is checked against the look-up table, and the degree to which the cell resembles the model color, that is, its similarity, is computed using Equation 1.

In Equation 1, H^m is the histogram value of the model and H_k is the histogram value of the k-th cell of the input image. S_k is the similarity between the k-th cell of the input image and the hand model and takes a value between 0 and 1. When the computed value falls below a threshold, that is, when S_k < S_th, the cell is treated as noise and removed by setting its similarity to 0.
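The cell-based colour test described above can be sketched as follows. Equation 1 itself does not appear in this text, so the similarity used here (the fraction of a cell's pixels whose quantized I-Q value is present in the look-up table) is only a stand-in for it. The bin width of 8, the threshold `s_th = 0.5`, and all function names are illustrative assumptions.

```python
from typing import List, Set, Tuple

RGB = Tuple[int, int, int]
IQ = Tuple[int, int]


def rgb_to_iq(r: int, g: int, b: int) -> IQ:
    """Return quantized (I, Q) chrominance of an 8-bit RGB pixel (NTSC YIQ matrix)."""
    i = 0.596 * r - 0.274 * g - 0.322 * b
    q = 0.211 * r - 0.523 * g + 0.312 * b
    # Bin width 8 is an assumption; it only keeps the table small.
    return int(i // 8), int(q // 8)


def build_lookup(hand_samples: List[RGB]) -> Set[IQ]:
    """Build the colour model (look-up table) from sample hand pixels."""
    return {rgb_to_iq(*p) for p in hand_samples}


def cell_similarity(cell: List[RGB], lookup: Set[IQ]) -> float:
    """Stand-in for Equation 1: share of the cell's pixels found in the table (0..1)."""
    hits = sum(1 for p in cell if rgb_to_iq(*p) in lookup)
    return hits / len(cell)


def similarity_map(image: List[List[RGB]], lookup: Set[IQ],
                   cell: int = 4, s_th: float = 0.5) -> List[List[float]]:
    """Compute S_k for every cell x cell block; values below s_th are zeroed as noise."""
    rows, cols = len(image), len(image[0])
    out: List[List[float]] = []
    for top in range(0, rows - cell + 1, cell):
        row: List[float] = []
        for left in range(0, cols - cell + 1, cell):
            pixels = [image[y][x]
                      for y in range(top, top + cell)
                      for x in range(left, left + cell)]
            s = cell_similarity(pixels, lookup)
            row.append(s if s >= s_th else 0.0)
        out.append(row)
    return out
```

Working on 4 x 4 cells rather than single pixels cuts the number of similarity values by a factor of 16, which is the cost saving the description refers to.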

After the noise is removed, the candidate regions are dilated and binarized, and a single hand region is selected from the remaining candidates using prior knowledge such as position and size.
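A minimal sketch of this post-processing, operating on the grid of per-cell similarities, is given below. It binarizes first and then dilates, uses a 4-neighbour structuring element, and uses "largest connected region" as the size prior. All three are assumptions made for illustration; the description does not specify them.

```python
from collections import deque
from typing import List, Optional, Tuple

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def dilate(mask: List[List[bool]]) -> List[List[bool]]:
    """Grow every set cell into its four direct neighbours."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for y in range(rows):
        for x in range(cols):
            if mask[y][x]:
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < rows and 0 <= nx < cols:
                        out[ny][nx] = True
    return out


def hand_center(sim: List[List[float]], s_th: float = 0.5) -> Optional[Tuple[float, float]]:
    """Return the (x, y) centre of the largest candidate region, or None if there is none."""
    mask = dilate([[s >= s_th for s in row] for row in sim])
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    best: List[Tuple[int, int]] = []
    for y in range(rows):
        for x in range(cols):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            region, queue = [], deque([(y, x)])
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for dy, dx in NEIGHBOURS:
                    ny, nx = cy + dy, cx + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    if not best:
        return None
    return (sum(c for _, c in best) / len(best),
            sum(r for r, _ in best) / len(best))
```

The returned centre is in cell units; multiplying by the cell size gives pixel coordinates for the trajectory token.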

The hand region extracted in this way also contains information about the shape of the hand. Changes in hand shape while the hand is moving are generally known to carry no meaning, and because the gesture vocabulary targeted by the present invention consists of shapes formed by the trajectory itself, a gesture can be recognized by tracking and analyzing the movement trajectory. Taking the center point of the hand region as a single token, the gesture trajectory is therefore represented as a set of center-point tokens.

If these center coordinates are used as they are, unstable hand region extraction can make the output trajectory change abruptly in ways the actual trajectory does not. To mitigate this, the trajectory is smoothed using Equation 2. FIG. 2(a) shows the extracted trajectory obtained by tracking the center coordinates alone, before smoothing, and FIG. 2(b) shows the trajectory after smoothing.
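Equation 2 does not appear in this text, so the sketch below substitutes a plain centred moving average to illustrate the kind of smoothing described. The window size of 3 and the function name are assumptions, not the patent's formula.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def smooth_trajectory(points: List[Point], window: int = 3) -> List[Point]:
    """Replace each point by the mean of its neighbours inside a centred window.

    The window shrinks at both ends so the output has the same length as the input.
    """
    half = window // 2
    out: List[Point] = []
    for i in range(len(points)):
        lo, hi = max(0, i - half), min(len(points), i + half + 1)
        n = hi - lo
        out.append((sum(p[0] for p in points[lo:hi]) / n,
                    sum(p[1] for p in points[lo:hi]) / n))
    return out
```

A single outlier caused by a bad hand segmentation is pulled toward its neighbours, at the price of slightly rounding genuine corners such as those of a square gesture.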

The trajectory obtained in this way is a gesture trajectory in space-time, represented as a set of tokens, each a point in two-dimensional space that carries time information. The gesture information actually available is therefore expressed as in Equation 3, which means that a gesture is represented as a token set of trajectory coordinate points. Because this information describes the trajectory as it changes over time, it contains dynamic information such as absolute coordinates, linear and angular velocity, direction of movement, and absolute distance traveled, and these serve as the characteristic features of the dynamic gesture used in recognition.
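As an illustration of how such dynamic information can be derived from the tokens, the sketch below computes speed, direction, and cumulative distance for each pair of consecutive tokens. The `(x, y, t)` token layout and the feature names are assumptions, and timestamps are assumed to be strictly increasing.

```python
import math
from typing import Dict, List, Tuple

Token = Tuple[float, float, float]  # (x, y, t): position plus timestamp


def dynamic_features(tokens: List[Token]) -> List[Dict[str, float]]:
    """Return per-step speed, direction (radians), and distance travelled so far."""
    feats: List[Dict[str, float]] = []
    travelled = 0.0
    for (x0, y0, t0), (x1, y1, t1) in zip(tokens, tokens[1:]):
        dx, dy, dt = x1 - x0, y1 - y0, t1 - t0
        step = math.hypot(dx, dy)
        travelled += step
        feats.append({
            "speed": step / dt,               # assumes dt > 0
            "direction": math.atan2(dy, dx),  # heading of this step
            "distance": travelled,
        })
    return feats
```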

Turning to the key-gesture spotter: in the broad sense a gesture includes every hand movement a person makes, but the gestures addressed here are limited to movements that carry meaning. Meaningful gestures make up only part of a person's hand movements, so they must be separated out before recognition. The human gesture recognition mechanism can very flexibly pick out and recognize only the meaningful key-gesture even when meaningful and meaningless parts are mixed. Meaningless hand movements, by contrast, are so varied in form and range that they are hard to define as patterns to be learned and recognized, and even where they can be learned within a limited scope they degrade the recognition result for the key-gesture. It is therefore important to separate the key-gesture, that is, the meaningful hand movement, before recognizing it.

According to the analysis of the psychologist Kendon and others, the movement that performs a single gesture generally divides into three phases: preparation, nucleus (peak or stroke), and retraction.

Building on this decomposition, Queck et al. proposed the following rules. First, a gesture starts slowly from a resting state, passes through a phase of gradually increasing speed, and ends again at rest. Second, the hand assumes a particular shape during the stroke phase. Third, low-speed motion around the resting state is not counted as movement. Fourth, a gesture made by hand movement is formed within a bounded region of space. Fifth, a static gesture requires a certain amount of time to be recognized. Sixth, a repetitive movement can also be a gesture. In addition, with a few exceptions, finger movement is meaningful only while the hand is at rest.

Based on this theory, the key-gesture spotter (12) divides a gesture into three parts, the pre-gesture, the key-gesture, and the post-gesture, and hypothesizes that a short pause occurs between consecutive parts. It separates the key-gesture on that basis so that the gesture recognizer (13) recognizes only the key-gesture.

That is, the key-gesture spotter (12) searches near the beginning and near the end of the gesture for intervals in which, as in Equation 4, a state of almost no change in speed persists for longer than a threshold, and these time points are denoted t_pre and t_post. The key-gesture is accepted when the length of the separated segment, given by Equation 5, is at least a threshold value.

FIG. 3 shows, taking a circle gesture as an example, how the position coordinates evolve along the time axis. The gesture divides into a pre-gesture phase, a key-gesture phase, and a post-gesture phase. Between phases there is a brief interval with almost no movement, and the key-gesture is spotted at these intervals.

The key-gesture spotter continuously monitors the speed information and spots the key-gesture at the points where a state of almost no change in speed persists.
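The spotting rule can be sketched as below. Equations 4 and 5 do not appear in this text, so the pause test (speed below `eps` for at least `min_pause` consecutive samples) and the acceptance test (segment length at least `min_len` samples) are assumed forms, and the parameter values are arbitrary.

```python
from typing import List, Optional, Tuple


def find_pauses(speeds: List[float], eps: float, min_pause: int) -> List[Tuple[int, int]]:
    """Return [start, end) index ranges where speed stays below eps long enough."""
    pauses: List[Tuple[int, int]] = []
    start: Optional[int] = None
    # The infinite sentinel closes a pause that runs to the end of the sequence.
    for i, s in enumerate(speeds + [float("inf")]):
        if s < eps:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_pause:
                pauses.append((start, i))
            start = None
    return pauses


def spot_key_gesture(speeds: List[float], eps: float = 1.0,
                     min_pause: int = 3, min_len: int = 5) -> Optional[Tuple[int, int]]:
    """Return (t_pre, t_post) sample indices bounding the key-gesture, or None."""
    pauses = find_pauses(speeds, eps, min_pause)
    if len(pauses) < 2:
        return None
    t_pre = pauses[0][1]    # first sample after the pause that ends the pre-gesture
    t_post = pauses[-1][0]  # first sample of the pause that starts the post-gesture
    return (t_pre, t_post) if t_post - t_pre >= min_len else None
```

Samples before `t_pre` form the pre-gesture, samples in `[t_pre, t_post)` the key-gesture, and the remainder the post-gesture.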

Finally, the gesture recognizer (13) is described. Trajectory analysis extracts the dynamic information of the trajectory, quantifies it, and trains and recognizes it with a hidden Markov model. The trajectory information obtained consists of tokens of hand position over time and, together with the time-stamped positions, includes information such as direction and change in speed.
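Recognition with hidden Markov models typically scores an observation sequence against one model per gesture and picks the best. The sketch below shows that scoring step with the standard forward algorithm for discrete observation symbols. The two toy models, their probabilities, and the gesture names are invented for illustration, the left-right transition matrices (no backward transitions) are an assumed topology, and training (for example by Baum-Welch) is not shown.

```python
from typing import Dict, List, Tuple

Model = Tuple[List[float], List[List[float]], List[List[float]]]  # (start, trans, emit)


def forward_likelihood(obs: List[int], start: List[float],
                       trans: List[List[float]], emit: List[List[float]]) -> float:
    """P(obs | model) for a discrete HMM, computed with the forward algorithm."""
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][o]
                 for s in range(n)]
    return sum(alpha)


def recognize(obs: List[int], models: Dict[str, Model]) -> str:
    """Return the name of the model that gives the observation sequence the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Toy left-right models: state 0 can stay or advance to state 1; state 1 is absorbing.
LEFT_RIGHT = [[0.5, 0.5], [0.0, 1.0]]
TOY_MODELS: Dict[str, Model] = {
    "circle": ([1.0, 0.0], LEFT_RIGHT, [[0.9, 0.1], [0.1, 0.9]]),
    "square": ([1.0, 0.0], LEFT_RIGHT, [[0.1, 0.9], [0.9, 0.1]]),
}
```

For long sequences the raw probabilities underflow, so a practical version would rescale `alpha` at each step or work in log space.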

However, this trajectory is generated in front of a camera, on a virtual two-dimensional plane that ignores displacement in three dimensions, so its size and shape vary widely and a normalization that absorbs this variation is required. For this normalization, the gesture position information on the two-dimensional plane is mapped onto the (ρ, φ) polar parameter space, and the features are then quantified in (ρ, φ) space. The (ρ, φ) values are used because they reflect well how the trajectory varies about the center of the gesture and make changes in size easy to quantify. Equation 6 is used for the transformation.

In Equation 6, the center coordinates are those of the center of the gesture trajectory, r_max is the radius of the trajectory point farthest from that center, and θ is the angle measured from the reference line. To handle variation in where a gesture starts, reference lines are placed in the diagonal directions at intervals of π/2, so that angles are computed from four reference lines covering the full circle. Because the starting point of a gesture differs from person to person, gestures that start within the same sector but at slightly different points are referred to the same reference line, which improves flexibility.
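Equation 6 does not appear in this text, so the following is one possible reading of the projection rather than the patent's formula. It assumes that the center is the centroid of the trajectory, that ρ is the radius divided by r_max, and that φ is measured from whichever diagonal reference line (π/4 + kπ/2) lies at or just below the angle of the first point.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def to_polar(points: List[Point]) -> List[Tuple[float, float]]:
    """Map trajectory points to (rho, phi) with rho in [0, 1] and phi in [0, 2*pi)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    radii = [math.hypot(x - cx, y - cy) for x, y in points]
    r_max = max(radii) or 1.0  # avoid division by zero for a degenerate trajectory
    thetas = [math.atan2(y - cy, x - cx) % (2 * math.pi) for x, y in points]
    # Pick the diagonal reference line for the sector that contains the start point.
    k = math.floor((thetas[0] - math.pi / 4) / (math.pi / 2))
    baseline = math.pi / 4 + k * (math.pi / 2)
    return [(r / r_max, (t - baseline) % (2 * math.pi))
            for r, t in zip(radii, thetas)]
```

Dividing by `r_max` makes a large circle and a small circle project to the same ρ values, and snapping the zero angle to a sector boundary means two users who start the same gesture a few degrees apart get φ values measured from the same line.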

FIG. 4 shows the result of projecting circle and square gestures onto the ρ-φ polar space. The gesture tokens projected onto the ρ-φ polar space are clustered by the LBG algorithm and converted into symbols, which are codewords. The ρ-φ feature space is partitioned into 2^M regions, and the resulting symbols are used as input to the hidden Markov model shown in FIG. 5. The hidden Markov model used is a left-right model, which permits only forward transitions, for processing efficiency.
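The clustering step can be sketched with a small LBG (Linde-Buzo-Gray) vector quantizer: start from the mean of all (ρ, φ) vectors, then repeatedly split every codeword in two and refine with nearest-neighbour reassignment until there are 2^M codewords. The additive split offset `eps`, the fixed iteration count, and the function names are assumptions, and squared Euclidean distance ignores the wrap-around of φ.

```python
from typing import List, Tuple

Vec = Tuple[float, float]


def _nearest(v: Vec, codebook: List[Vec]) -> int:
    """Index of the codeword closest to v (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: (v[0] - codebook[i][0]) ** 2 + (v[1] - codebook[i][1]) ** 2)


def _mean(vs: List[Vec]) -> Vec:
    return (sum(v[0] for v in vs) / len(vs), sum(v[1] for v in vs) / len(vs))


def lbg_codebook(data: List[Vec], m_bits: int,
                 eps: float = 0.01, iters: int = 20) -> List[Vec]:
    """Grow a codebook of 2**m_bits codewords by repeated splitting and refinement."""
    codebook = [_mean(data)]
    for _ in range(m_bits):
        # Split every codeword into a slightly shifted pair.
        codebook = [c2 for c in codebook
                    for c2 in ((c[0] + eps, c[1] + eps), (c[0] - eps, c[1] - eps))]
        for _ in range(iters):
            cells: List[List[Vec]] = [[] for _ in codebook]
            for v in data:
                cells[_nearest(v, codebook)].append(v)
            # An empty cell keeps its previous codeword.
            codebook = [_mean(cell) if cell else c for cell, c in zip(cells, codebook)]
    return codebook


def quantize(data: List[Vec], codebook: List[Vec]) -> List[int]:
    """Convert each (rho, phi) vector to the index of its codeword (the symbol)."""
    return [_nearest(v, codebook) for v in data]
```

The integer sequence returned by `quantize` is the discrete observation sequence that a hidden Markov model consumes.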

The present invention provides the user interface of an editing system as shown in FIG. 6.

The present invention as described above is recorded on a computer-readable recording medium and processed by a computer.

As described above, according to the present invention, home appliances can be remotely controlled, and computers and various industrial devices can be controlled, by gesture, which makes machines more user-friendly.

Claims

A dynamic gesture recognition system by trajectory analysis on polar coordinates, comprising:

a hand region extractor for separating a hand region from continuously input video images and extracting a dynamic gesture;

a key-gesture spotter for separating a key-gesture, which is the meaningful gesture, from the dynamic gesture; and

a gesture recognizer for extracting the command meant by the key-gesture.

The system of claim 1, wherein the gesture recognizer projects the key-gesture onto the polar parameter space, partitions the ρ-φ feature space and applies it to a hidden Markov model, and extracts symbols, which are codewords, by applying the gestures projected onto the polar coordinates to an LBG algorithm.

A dynamic gesture recognition method by trajectory analysis on polar coordinates, comprising:

a first step of inputting a hand-motion gesture image representing a specific command;

a second step of separating a hand region from the hand-motion gesture image and obtaining the trajectory of the gesture;

a third step of separating a key-gesture, which is the meaningful gesture, from the trajectory of the gesture; and

a fourth step of extracting the command meant by the key-gesture.

The method of claim 3, wherein the second step comprises:

a first sub-step of converting the hand-motion gesture image, input as RGB information, into YIQ image information;

a second sub-step of creating a look-up table using the I-Q information of the YIQ image information;

a third sub-step of dividing the hand-motion gesture image into cells of a predetermined size;

a fourth sub-step of examining the similarity between the color information of each cell and the model color stored in the look-up table and extracting the center point of the hand region as a token; and

a fifth sub-step of obtaining the gesture trajectory in space-time by repeating the second to fourth sub-steps on the successive hand-motion gesture images.

The method of claim 4, wherein the second step further comprises

a sixth sub-step of smoothing the extracted gesture trajectory.

The method of claim 3, wherein the third step comprises:

a first sub-step of extracting, at the beginning and at the end of the gesture trajectory respectively, the points (t_pre, t_post) at which a state of almost no change in speed persists for at least a threshold; and

a second sub-step of extracting, on the basis of the two extracted points, the part before t_pre as the pre-gesture, the part between t_pre and t_post as the key-gesture, and the part after t_post as the post-gesture.

The method of claim 3, wherein the fourth step comprises:

a first sub-step of projecting the key-gesture onto the (ρ, φ) polar parameter space;

a second sub-step of quantifying the features of the key-gesture in the (ρ, φ) polar space; and

a third sub-step of partitioning the quantified dynamic features of the key-gesture and applying them to a hidden Markov model to build a training and recognition model.

The method of claim 7, wherein the second sub-step

quantifies the features of the key-gesture by a statistical method using the LBG algorithm.

A computer-readable recording medium having recorded thereon a program for causing a computer to execute:

a first step of, when a hand-motion gesture image representing a specific command is input, separating a hand region from the hand-motion gesture image and obtaining the trajectory of the gesture;

a second step of dividing the trajectory of the gesture into a pre-gesture, a key-gesture, and a post-gesture and extracting the key-gesture, which is the meaningful hand movement;

a third step of projecting the key-gesture onto the (ρ, φ) polar parameter space and applying a clustering algorithm to quantify the features of the key-gesture;

a fourth step of partitioning the quantified dynamic features of the key-gesture and applying them to a hidden Markov model to build a training and recognition model, thereby extracting the command meant by the key-gesture; and

a fifth step of executing the command.