KR102437979B1

KR102437979B1 - Apparatus and method for interfacing with object orientation based on gesture

Info

Publication number: KR102437979B1
Application number: KR1020220022982A
Authority: KR
Inventors: 채규열; 유성은
Original assignee: 주식회사 마인드포지
Priority date: 2022-02-22
Filing date: 2022-02-22
Publication date: 2022-08-30
Also published as: KR20230126150A; KR20230126158A

Abstract

A gesture-based object-oriented interfacing method of the present invention comprises: an input step of receiving image data from a camera device; an object recognition step of parsing the image data and recognizing an object; and an execution control step of controlling a specific command to be executed according to an event generated by the object. The object recognition step includes: a vertical object recognition step of recognizing a first vertical object having a shape extending more than a first reference length in a vertical direction from the image data; a point object recognition step of recognizing a point object located on either the left side or the right side of a first vertical line corresponding to the first vertical object; and an event detection step of detecting a first event in which the point object moves to the opposite side with respect to the first vertical line. In the execution control step, when the first event is detected, a command mapped to the first event is controlled to be executed. The object recognition rate can be dramatically improved.

Description

Gesture-based object-oriented interfacing method and device

The present invention relates to a gesture-based interfacing method and the like. More specifically, it relates to a gesture-based object-oriented interfacing method and the like that organically combines the relationship between the absolute movement and the relative movement of the objects to be recognized, so that interfacing or interaction between a user and an electronic device can be implemented more clearly and precisely.

To meet users' heightened needs and to further improve user convenience and device usability, gesture recognition technology is widely applied. This technology recognizes poses, actions, body movements, and specific shapes or movements of body parts such as hands and arms (hereinafter referred to as 'gestures') by means of image matching, artificial intelligence, deep learning, deep neural network (DNN) models, and the like, and controls interfacing with an electronic device on that basis.

In addition to the basic advantages of a non-contact method, this gesture recognition technology requires no additional means for interfacing and works even at a remote distance, so it is being applied broadly without being limited to any category or type of electronic device.

The prior art discloses a variety of embodiments, ranging from simple techniques that detect only the overall movement of a hand or arm and apply it to interfacing, to techniques that apply fairly complex object recognition algorithms to image data generated by a camera device in order to recognize delicate motions or shape features.

However, the latter techniques, such as Korean Registered Patent Publication No. 10-1785650, rely on matching unique and complex gestures of a human object or apply complex recognition algorithms. Unless the shape features of the object are recognized precisely, it is therefore difficult to ensure unambiguous gesture recognition; response delays, operation halts, and malfunctions of the interfacing may occur frequently; and it is also difficult to achieve immediate responsiveness in computational processing.

Of course, these problems might be partly resolved if expensive, high-resolution hardware resources (sensors, CPUs, cameras, etc.) were sufficiently available. Because of the trade-off involved, however, this approach can instead cause another fundamental problem: it limits the versatility and accessibility of gesture recognition technology.

In addition, the prior art mainly uses unfamiliar gestures in order to improve resolution and the like, so users find it difficult to adapt intuitively unless they deliberately and continuously practice, and user convenience is therefore low.

Korean Registered Patent Publication No. 10-1785650 (2017.09.29.)
Korean Registered Patent Publication No. 10-1812605 (2017.12.20.)
Korean Registered Patent Publication No. 10-2320754 (2021.10.27.)

The present invention was devised to solve the problems described above. Its object is to provide a gesture-based object-oriented interfacing method and the like that optimizes user convenience through intuitive recognition by building on gestures familiar to the user, and that dramatically improves both computational efficiency and the object recognition rate by effectively incorporating dynamic changes in the relative positional relationship of individual objects into object recognition.

Other objects and advantages of the present invention can be understood from the following description and will become more apparent from the embodiments of the present invention. In addition, the objects and advantages of the present invention can be realized by the features recited in the claims and combinations thereof.

To achieve the above object, a gesture-based object-oriented interfacing method according to an embodiment of the present invention includes: an input step of receiving image data from a camera device; an object recognition step of parsing the image data and recognizing an object; and an execution control step of controlling a specific command to be executed according to an event generated by the object.

Specifically, the object recognition step may include: a vertical object recognition step of recognizing, in the image data, a first vertical object having a shape that extends in the vertical direction by at least a first reference length; a point object recognition step of recognizing a point object located on either the left or the right side of a first vertical line corresponding to the first vertical object; and an event detection step of detecting a first event in which the point object moves to the opposite side of the first vertical line. In this case, the execution control step is configured to execute the command mapped to the first event when the first event is detected.

In addition, the vertical object recognition step may be configured to further recognize a second vertical object. The second vertical object is located on the side of the first vertical line (left or right) opposite the point object and has a shape extending in the vertical direction, but with a length shorter than the first reference length.

In this case, the first event of the event detection step is preferably defined as the case where the position information of the point object corresponds to a second vertical line formed by the second vertical object.

Preferably, the event detection step may be configured to detect either a first sub-event, in which the position information of the point object corresponds to an upper point of the second vertical line, or a second sub-event, in which the position information of the point object corresponds to a lower point of the second vertical line. In this case, the execution control step controls execution of the command mapped to the first sub-event or the second sub-event, respectively.
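The sub-event distinction can be illustrated with a minimal Python sketch. It assumes image coordinates in which y grows downward, treats the upper and lower halves of the second vertical line as the "upper point" and "lower point", and uses a hypothetical pixel tolerance `x_tolerance`; none of these choices, nor the names `VerticalLine` and `classify_sub_event`, are specified in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VerticalLine:
    """A vertical line segment in image coordinates (y grows downward)."""
    x: float         # horizontal position of the line
    y_top: float     # upper end (smaller y)
    y_bottom: float  # lower end (larger y)


def classify_sub_event(px: float, py: float, line: VerticalLine,
                       x_tolerance: float = 10.0) -> Optional[str]:
    """Return which sub-event the point position corresponds to, if any.

    The half-way split and the tolerance are illustrative assumptions.
    """
    if abs(px - line.x) > x_tolerance:
        return None  # point is not on the second vertical line
    if not (line.y_top <= py <= line.y_bottom):
        return None  # point is above or below the segment
    midpoint = (line.y_top + line.y_bottom) / 2
    return "first_sub_event" if py < midpoint else "second_sub_event"
```

Each returned label could then be used as a key for the command mapped to that sub-event.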

Specifically, it is preferable that the first vertical object is the extended index finger object among the user's fingers, the point object is the thumb object, and the second vertical object is the folded middle finger object.

Depending on the embodiment, the present invention may further include: a distance calculation step of calculating first distance information between the eyes in the image data; an estimation step of calculating separation distance information between a display device and the user by using the ratio between average real inter-eye distance information and the first distance information; and a recognition area setting step of setting a reference recognition area by using the separation distance information and size information of the display device.

In this case, the object recognition step is preferably configured to use the image data of the area corresponding to the reference recognition area.
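One way to read the ratio-based estimation is through a pinhole camera model, sketched below in Python. The source states only that a ratio between the average real eye distance and the measured eye distance is used. The focal length in pixels, the 63 mm average value, the rule that sizes the region as the display's dimensions projected at the user's distance, and all function and parameter names are assumptions made for illustration.

```python
AVERAGE_EYE_DISTANCE_MM = 63.0  # assumed typical adult value, not from the source


def estimate_distance_mm(eye_distance_px: float, focal_length_px: float,
                         real_eye_distance_mm: float = AVERAGE_EYE_DISTANCE_MM) -> float:
    """Estimate the camera-to-user distance from the eye spacing in the image.

    Pinhole model: size_px = focal_length_px * size_mm / distance_mm.
    """
    if eye_distance_px <= 0:
        raise ValueError("eye_distance_px must be positive")
    return focal_length_px * real_eye_distance_mm / eye_distance_px


def reference_region_size_px(distance_mm: float, display_width_mm: float,
                             display_height_mm: float,
                             focal_length_px: float) -> tuple:
    """Size in pixels of a region as large as the display, seen at distance_mm.

    This sizing rule is a hypothetical choice; the source does not define it.
    """
    scale = focal_length_px / distance_mm
    return (display_width_mm * scale, display_height_mm * scale)
```

Under this sketch, a user who moves farther from the camera produces a smaller eye spacing in pixels, a larger estimated distance, and a correspondingly smaller recognition region.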

In addition, the present invention may further include: a camera control step of controlling the camera device to be activated when a sound detected by a microphone module corresponds to a wake-up sound; and a direction information input step in which direction information, that is, information on the location or direction from which the wake-up sound originated, is generated.

In this case, the object recognition step performs object recognition processing using n unit images obtained by dividing the image data into n regions (n being a natural number of 2 or more), and may be configured to process first the unit image, among the n unit images, that corresponds to the direction information.
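The prioritized processing of unit images might look like the following Python sketch. It assumes the frame is split into n equal vertical strips and that the direction information has already been reduced to a normalized horizontal position between 0 (left edge) and 1 (right edge). Ordering the remaining strips by proximity to the prioritized one is an added assumption; the source requires only that the matching unit image be processed first.

```python
def unit_image_bounds(width: int, n: int) -> list:
    """Split a frame of the given pixel width into n vertical strips.

    Returns a list of (x_start, x_end) pairs, one per unit image.
    """
    if n < 2:
        raise ValueError("n must be a natural number of 2 or more")
    edges = [round(i * width / n) for i in range(n + 1)]
    return list(zip(edges[:-1], edges[1:]))


def processing_order(n: int, direction: float) -> list:
    """Order unit-image indices so the one matching the sound direction is first.

    direction is a hypothetical normalized horizontal position in [0, 1].
    """
    priority = min(int(direction * n), n - 1)
    # Nearest strips next; ties are broken by the lower index.
    return sorted(range(n), key=lambda i: (abs(i - priority), i))
```

A recognizer could iterate over `processing_order(...)` and stop as soon as a hand object is found, which is where the claimed speed-up would come from.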

Furthermore, the present invention may further include a recognition area setting step of setting, when hand objects including a right-angle shape are recognized at diagonally symmetric positions, the area bounded by the extension lines of the right-angle shapes as a reference recognition area. In this case, the object recognition step may be configured to use the image data of the area corresponding to the reference recognition area.

More preferably, the recognition area setting step may include: a position setting step of recording face position information at the time the reference recognition area is set; and a variable setting step of moving the position of the reference recognition area, when the recorded face position moves, so as to follow that movement.
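The variable setting step can be sketched in Python as a simple translation of the region by the face displacement. The `Region` type and the use of a single (x, y) point for the face position are assumptions made for illustration; the source does not specify how the face position is represented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned reference recognition area in pixel coordinates."""
    x: float       # left edge
    y: float       # top edge
    width: float
    height: float


def shift_region(region: Region, face_at_setup: tuple, face_now: tuple) -> Region:
    """Move the region by the same offset the face has moved since setup."""
    dx = face_now[0] - face_at_setup[0]
    dy = face_now[1] - face_at_setup[1]
    return Region(region.x + dx, region.y + dy, region.width, region.height)
```

The size of the region is left unchanged here; only its position follows the user.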

A gesture-based object-oriented interfacing apparatus according to another aspect of the present invention may include: an input unit that receives image data from a camera device; an object recognition unit that parses the image data and recognizes an object; and an execution control unit that controls a specific command to be executed according to an event generated by the object.

Specifically, the object recognition unit may include: a vertical object recognition unit that recognizes, in the image data, a first vertical object having a shape that extends in the vertical direction by at least a first reference length; a point object recognition unit that recognizes a point object located on either the left or the right side of a first vertical line corresponding to the first vertical object; and an event detection unit that detects a first event in which the position information of the point object moves to the opposite side of the first vertical line.

In this case, when the first event is detected, the execution control unit controls the command mapped to the first event to be executed.

According to a preferred embodiment of the present invention, gestures with simple displacement characteristics are used, which further improves the efficiency of computational processing. Moreover, changes in the relative positional relationship between the individual objects that make up a gesture can be identified accurately and incorporated into interfacing, so that interfacing or interaction between the user and the electronic device can be made clear and precise.

According to the present invention, the gesture is defined by the relative positional relationship of individual objects that move together as a whole yet can also move independently. The positional reference for gesture recognition or detection therefore does not need to be changed over time, which further improves the efficiency of computational processing.

In addition, according to an embodiment of the present invention, shapes or movements that correspond to a UX familiar to the user are used as gestures, which further optimizes user convenience through intuitive recognition.

Furthermore, according to another embodiment of the present invention, the computational processing speed can be dramatically improved, and immediate responsiveness of the interfacing achieved, through pre-processing that selects or specifies in advance the area used for object recognition, or through processing that, instead of applying the entire image data as is, gives priority in object recognition to the image of the area corresponding to the user's current position or direction.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description given below, serve to aid understanding of the technical idea of the present invention. The present invention should therefore not be construed as being limited to the matters shown in these drawings.
FIG. 1 is a block diagram showing the detailed configuration of an interfacing apparatus according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram showing the detailed configuration of the object recognition unit shown in FIG. 1;
FIGS. 3 and 4 are flowcharts showing processing procedures according to embodiments of the present invention;
FIGS. 5 and 6 are diagrams illustrating the detailed configuration of the vertical objects and the point object and the events generated on that basis;
FIG. 7 is a flowchart showing a processing procedure of the present invention in which a reference recognition area for object recognition is set;
FIG. 8 is a flowchart showing a processing procedure of the present invention in which sound direction information and the like are used;
FIG. 9 is a diagram illustrating another embodiment of the present invention in which a reference recognition area for object recognition is set.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings. Before that, it should be noted that the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. Based on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way, they should be interpreted with meanings and concepts consistent with the technical idea of the present invention.

Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical idea. It should be understood that various equivalents and modifications capable of replacing them may exist at the time of filing of this application.

FIG. 1 is a block diagram showing the detailed configuration of a gesture-based object-oriented interfacing apparatus (hereinafter referred to as the 'interfacing apparatus') 100 according to a preferred embodiment of the present invention, and FIG. 2 is a block diagram showing the detailed configuration of the object recognition unit 120 shown in FIG. 1. FIGS. 3 and 4 are flowcharts showing processing procedures according to embodiments of the present invention, and FIGS. 5 and 6 are diagrams illustrating the detailed configuration of the vertical objects and the point object and the events generated using them.

As shown in FIG. 1, the interfacing apparatus 100 of the present invention may include an input unit 110, an object recognition unit 120, an execution control unit 130, an area setting unit 140, a sound processing unit 150, and the like.

In addition, as shown in FIG. 2, the object recognition unit 120, which is one component of the interfacing apparatus 100, may specifically include an AI processing unit 121, a vertical object recognition unit 123, a point object recognition unit 125, a tracking unit 127, an event detection unit 129, and the like.

The gesture-based object-oriented interfacing method according to the present invention (hereinafter referred to as the 'interfacing method of the present invention') may be implemented as software installed and run on a specific terminal or device. It may also be implemented as hardware, such as a module or a standalone device designed to embody the technical idea of the present invention using electronic elements and components such as storage means and arithmetic processing means.

Depending on the embodiment, the interfacing method (apparatus) of the present invention may of course also be embedded in the equipment or device that is the target of interfacing control, such as a TV, a display device, a navigation system, a medical imaging device, an AR or VR device, a computer (monitor), a screen system, a kiosk, or an automation system.

In this respect, the components of the interfacing apparatus 100 shown in FIG. 1 and of the object recognition unit 120 shown in FIG. 2 should be understood as logically distinguished components rather than physically distinguished ones.

That is, each component shown in the drawings is a logical construct for effectively explaining the technical idea of the present invention. Even if components are integrated or separated, the configuration should be construed as falling within the scope of the present invention as long as the functions performed by the logical configuration can be realized, and any component that performs the same or a similar function should be construed as falling within the scope of the present invention regardless of whether its name matches.

The interfacing apparatus 100 or method of the present invention parses the image data input from the camera device 50, recognizes and detects a specific object in the image data by applying a matching algorithm or an algorithm such as deep learning, and controls the device 70 to execute a specific command mapped to the shape features, movement pattern, or movement characteristics of the detected object.

When image data is input from the camera device 50 through the input unit 110 (S300, S400; see FIG. 3 and others), the object recognition unit 120 of the present invention parses the input image data and performs processing to recognize objects (S310).

Specifically, the vertical object recognition unit 123, a component of the object recognition unit 120, recognizes or detects in the image data a first vertical object, that is, an object having a shape that extends in the vertical direction by at least a first reference length (S311).

Depending on the embodiment, as illustrated in FIG. 3, the vertical object recognition unit 123 may be configured to work in conjunction with the AI processing unit 121 so that reference data, learned model data, and the like can be used effectively.

Various methods, including deep learning, may be used to recognize objects, and techniques that use a skeleton or joint points may be used for the form, shape, and movement of an object.

As described below, the first vertical object serves to establish the reference for object recognition and gesture recognition. Using the shape feature of the first vertical object (its extension in the vertical direction), the left area (LA, see FIG. 5) and the right area (RA, see FIG. 5) can be clearly separated. As a result, the relative positional relationship of other objects can be specified accurately, and the movement of an object based on that relative positional relationship can be identified more distinctly and unambiguously.

Of course, any human object with a shape extending in the vertical direction, such as the torso or an arm, may be used as the first vertical object (VO1, see FIG. 5 and others).

However, in order to organically reflect the correlation between the absolute and relative movement of individual objects, and to further improve convenience based on the user's intuitive recognition and UX, the first vertical object VO1 is preferably set to the extended (unfolded) index finger object among the user's fingers.

In this respect, the first reference length may of course be set variably, and may be determined as a function of, for example, the average physical length of an index finger and the distance between the image and the user. Depending on the embodiment, when the hand object is recognized as a skeleton or the like, the first vertical object VO1 corresponding to the index finger may also be detected using the shape features of the hand.

Although a detailed description is omitted, recognizing or detecting the first vertical object VO1 means, as in general object recognition technology, that the corresponding position information, area data, vector data, or the like is generated (S313) and that the generated data can be tracked. The tracking unit 127 shown in FIG. 2 indicates that this function can be implemented in the object recognition unit 120 of the present invention.

Once the first vertical object VO1 has been recognized, the point object recognition unit 125 of the present invention recognizes or detects a point object PO located on either the left or the right side of a first vertical line (first vertical area) (VL1, see FIG. 5) corresponding to the first vertical object VO1 (S315).

The first vertical line may consist of data representing a line, but depending on the embodiment it may instead consist of data representing the area occupied by the first vertical object VO1. Depending on the embodiment, the first vertical line may also be set to the left edge, the right edge, or the center line of the area of the first vertical object VO1. The same applies to the second vertical line VL2 formed by the second vertical object VO2 described below.
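The three options for placing the vertical line can be sketched in Python as follows, assuming the area of a vertical object is available as an axis-aligned bounding box. The `BoundingBox` type and the `anchor` parameter are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Area occupied by a vertical object, in pixel coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def vertical_line_x(box: BoundingBox, anchor: str = "center") -> float:
    """Return the x position of the vertical line derived from the object's area."""
    if anchor == "left":
        return box.x_min
    if anchor == "right":
        return box.x_max
    if anchor == "center":
        return (box.x_min + box.x_max) / 2
    raise ValueError(f"unknown anchor: {anchor!r}")
```

The same helper would apply unchanged to the second vertical object when deriving VL2.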

As described above, when the first vertical object VO1 is set to the index finger, the point object PO is preferably set to the thumb or the tip of the thumb (a point or tip area).

Once the first vertical object VO1 and the point object PO have been recognized, the event detection unit 129 of the present invention monitors whether a movement or action occurs in which the point object PO moves to the opposite side of the first vertical line VL1 (hereinafter referred to as the 'first event') (S317).

That is, according to the present invention, whether the point object PO has moved is determined with reference to the vertical area or line formed by the first vertical object VO1, which belongs to the same human object (the hand) but is a different individual object. Whether the movement has occurred (LA→RA, see FIG. 5) can therefore be recognized more accurately and unambiguously.
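A minimal Python sketch of this relative crossing test follows. It assumes that each frame yields the x coordinate of the point object and the x coordinate of the first vertical line, both measured in the same image. The dead zone around the line and the class name `FirstEventDetector` are illustrative assumptions, not part of the source.

```python
from typing import Optional


def side_of_line(point_x: float, line_x: float, dead_zone: float = 0.0) -> Optional[str]:
    """Return 'left' or 'right', or None when the point is within the dead zone."""
    if point_x < line_x - dead_zone:
        return "left"
    if point_x > line_x + dead_zone:
        return "right"
    return None


class FirstEventDetector:
    """Reports when the point object changes sides relative to the vertical line."""

    def __init__(self, dead_zone: float = 5.0):
        self.dead_zone = dead_zone
        self.current_side: Optional[str] = None

    def update(self, point_x: float, line_x: float) -> bool:
        side = side_of_line(point_x, line_x, self.dead_zone)
        if side is None:
            return False  # ambiguous position: keep the previous state
        if self.current_side is None:
            self.current_side = side  # first clear observation
            return False
        if side != self.current_side:
            self.current_side = side  # re-arm for a crossing back
            return True
        return False
```

Because the line position is re-measured in every frame, moving the whole hand across the image changes both coordinates together and does not trigger the event; only the thumb moving relative to the index finger does.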

When the first event is detected in this way, the execution control unit 130 of the present invention controls a predetermined command mapped to the first event to be executed on the device 70 (S320). The command mapped to the first event may of course be set in various ways according to the type of device, the type and characteristics of the content displayed on the device, user registration, and the like.

The processing of the present invention described above may of course be configured to be applied cyclically unless a termination condition, such as device deactivation or a termination command, is satisfied (S330, S460).

To implement a more preferred embodiment, the vertical object recognition unit 123 of the present invention may further recognize a second vertical object (VO2) (S410).

The second vertical object (VO2) is an object located in whichever of the left and right areas of the first vertical line (VL1) does not contain the point object (PO) (see FIG. 5, etc.). It has a shape extending in the vertical direction but a length smaller than the first reference length (the vertical length of the first vertical object (VO1)).

That is, in this embodiment, the point object (PO) and the second vertical object (VO2) are located, by default, in opposite areas with respect to the first vertical object (VO1).

As described above, when the second vertical object (VO2) is recognized, various information and data corresponding to the second vertical object (VO2) may of course be generated and tracked (S420).

In the embodiment described earlier, the first event is a gesture in which the position information of the point object (PO) changes from one side to the other with respect to the area of the first vertical line (VL1) defined by the first vertical object (VO1).

To make this change in positional relationship more distinct, this embodiment is configured so that the first event occurs only when both of the following conditions are satisfied (S430): a first condition that the point object (PO) crosses the first vertical line (VL1) so that its position changes from one side to the other, and a second condition that the position information of the point object (PO) corresponds to the second vertical line (VL2) formed by the second vertical object (VO2).

In data processing terms, the first event in this embodiment is, for example, the case where the position information of the point object (PO) lies within the area of the second vertical line (VL2) or falls on the outline of the second vertical line (VL2).
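As a non-limiting sketch, the two conditions may be evaluated together as shown below. Representing the second vertical line (VL2) as an axis-aligned box, and the names `Box`, `is_first_event`, and `margin`, are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned area in image coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float, margin: float = 0.0) -> bool:
        # Points on the outline count as inside, matching the text above.
        return (self.left - margin <= x <= self.right + margin
                and self.top - margin <= y <= self.bottom + margin)


def is_first_event(start_side: str,
                   point: Tuple[float, float],
                   vl1_x: float,
                   vl2_area: Box,
                   margin: float = 0.0) -> bool:
    """True only when the point object has crossed VL1 (first condition)
    and its position corresponds to the VL2 area (second condition)."""
    x, y = point
    current_side = "L" if x < vl1_x else "R"
    crossed = current_side != start_side
    on_vl2 = vl2_area.contains(x, y, margin)
    return crossed and on_vl2


# Folded middle finger assumed to occupy x 120..140, y 60..110.
vl2 = Box(left=120, top=60, right=140, bottom=110)
touching = is_first_event("L", (130, 80), vl1_x=100, vl2_area=vl2)
overshoot = is_first_event("L", (160, 80), vl1_x=100, vl2_area=vl2)
```

Requiring both conditions rejects a thumb that merely drifts past the index finger without reaching the middle finger.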

As described above, when the first vertical object (VO1) is an extended index finger object and the point object (PO) is a thumb object, the second vertical object (VO2) may be a folded middle finger object.

With this configuration, the hand takes a posture and shape similar to those used when operating a mouse, the input means most familiar to users, so ease of use through intuitive recognition can be further improved.

Therefore, from the user's point of view, the first event in this embodiment is a gesture in which the thumb (PO) crosses the area where the index finger (VO1) is located and touches the middle finger, which lies on the opposite side from the thumb's default position. This gesture corresponds to the action of clicking a mouse button, so an interfacing environment consistent with the familiar user experience (UX) can be implemented.

More specifically, when the position of the point object (PO) corresponds to the second vertical line (VL2) (S430), the event detection unit 129 of the present invention may be configured to detect whether the point object (PO) has moved to an upper point or a lower point of the second vertical line (VL2) (S440).

When the event detection unit 129 detects an event in which the point object (PO) crosses the first vertical object (VO1) and moves to the upper point (P1) of the second vertical object (VO2) (the first sub-event; left drawing of FIG. 6), or an event in which it moves to the lower point (P2) of the second vertical object (VO2) (the second sub-event; right drawing of FIG. 6), the execution control unit 130 of the present invention controls the command mapped to the first or second sub-event to be executed on the device 70 (S450).

Depending on the embodiment, when the position to which the point object (PO) has moved cannot be clearly classified as either the upper point or the lower point of the second vertical object (VO2), guidance information prompting the user to adjust the position may be presented through an output means (a visual medium, an auditory medium, or the like) (S445).
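The sub-event classification of steps S440 to S450, together with the guidance of step S445, may be sketched as follows. This is an illustrative example only: the width of the ambiguous middle band, the command names in `COMMANDS`, and the guidance text are hypothetical, since the specification leaves the mapped commands and the classification rule open.

```python
def classify_sub_event(point_y: float, vo2_top: float, vo2_bottom: float,
                       ambiguous_band: float = 0.2) -> str:
    """Classify the contact position along the second vertical object.

    Image coordinates are assumed, so y grows downward. Returns 'first'
    (upper point P1), 'second' (lower point P2), or 'ambiguous' when the
    position falls in a band around the middle of the object.
    """
    height = vo2_bottom - vo2_top
    if height <= 0:
        raise ValueError("vo2_bottom must be greater than vo2_top")
    rel = (point_y - vo2_top) / height        # 0.0 at the top, 1.0 at the bottom
    if rel < 0.5 - ambiguous_band / 2:
        return "first"
    if rel > 0.5 + ambiguous_band / 2:
        return "second"
    return "ambiguous"


# Hypothetical mapping; any command may be registered for each sub-event.
COMMANDS = {"first": "left_click", "second": "right_click"}


def respond(point_y: float, vo2_top: float, vo2_bottom: float) -> str:
    """Return the mapped command, or a guidance message when ambiguous."""
    kind = classify_sub_event(point_y, vo2_top, vo2_bottom)
    if kind == "ambiguous":
        return "guide: move the thumb toward the upper or lower part of the middle finger"
    return COMMANDS[kind]
```

With the second vertical object spanning y = 60 to y = 110, a contact at y = 70 maps to the first sub-event, a contact at y = 100 to the second, and a contact at y = 85 yields guidance.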

Hereinafter, an embodiment of the present invention in which a reference recognition area for object recognition is set will be described in detail with reference to FIG. 7 and related drawings.

As shown in FIG. 1, the area setting unit 140 of the present invention may specifically include a distance calculation unit 141, an estimation calculation unit 142, a recognition area setting unit 143, and the like. As described above, these components are of course also logical components defined by their functions.

When image data is input from the camera device 50 (S600), the distance calculation unit 141 applies algorithms such as feature point extraction, vector operations, and deep learning to select the objects corresponding to the eyes in the image data, and calculates first distance information between the selected eye objects (S610).

When the first distance information has been calculated in this way from pixel positions and the like in the image data, the estimation calculation unit 142 of the present invention estimates the separation distance information between the display device (device 70) and the user by using the ratio between the average physical distance between actual human eyes and the first distance information (S620).
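One possible way to turn this ratio into a distance is a pinhole camera model, sketched below. The specification states only that a ratio is used, so the pinhole model, the `focal_length_px` parameter (the camera focal length expressed in pixels), and the 63 mm default for the average interocular distance are all assumptions made for illustration.

```python
import math
from typing import Tuple


def eye_pixel_distance(left_eye: Tuple[float, float],
                       right_eye: Tuple[float, float]) -> float:
    """First distance information: distance between the eye centers in pixels."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def estimate_separation_mm(pixel_ipd: float,
                           focal_length_px: float,
                           mean_ipd_mm: float = 63.0) -> float:
    """Estimate the camera-to-user distance in millimetres.

    Under the pinhole model, real_size / distance = pixel_size / focal_length,
    so distance = focal_length_px * mean_ipd_mm / pixel_ipd.
    """
    if pixel_ipd <= 0:
        raise ValueError("pixel_ipd must be positive")
    return focal_length_px * mean_ipd_mm / pixel_ipd


# Eyes detected 63 px apart with an assumed focal length of 1000 px.
d_px = eye_pixel_distance((300.0, 200.0), (363.0, 200.0))
distance_mm = estimate_separation_mm(d_px, focal_length_px=1000.0)
```

The estimate treats the camera as co-located with the display; halving the pixel distance between the eyes doubles the estimated separation.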

Once the separation distance information has been estimated, the recognition area setting unit 143 of the present invention sets the reference recognition area by comprehensively considering the separation distance information, the size information of the display device 70, the size information of the portable object in the image data, the movement range information of the human object (hand object) as considered from an ergonomic standpoint, and the like (S630).

The reference recognition area is set in consideration of the current physical distance between the device (display device) 70 and the user, the size of the device (display device) 70, and the like, and corresponds to a virtual area in which the user's gesture is highly likely to be made.

Once the reference recognition area is set in this way, the object recognition unit 120 of the present invention need not parse the entire image data of each frame input from the camera device 50. It can instead perform object recognition processing only on the specific area corresponding to the reference recognition area, or on that area first (S670), which improves computational efficiency.
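Restricting recognition to the reference recognition area amounts to cropping each frame before it is passed to the recognizer. A minimal sketch follows, assuming the frame is a list of pixel rows and the area is given as `(left, top, right, bottom)` in pixel indices; both representations are assumptions made for this example.

```python
from typing import List, Sequence, Tuple

Region = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def clamp_region(region: Region, frame_w: int, frame_h: int) -> Region:
    """Clip the region so that it lies inside the frame."""
    left, top, right, bottom = region
    left = max(0, min(left, frame_w))
    right = max(left, min(right, frame_w))
    top = max(0, min(top, frame_h))
    bottom = max(top, min(bottom, frame_h))
    return (left, top, right, bottom)


def crop(frame: Sequence[Sequence[int]], region: Region) -> List[List[int]]:
    """Return only the pixels inside the reference recognition area."""
    frame_h = len(frame)
    frame_w = len(frame[0]) if frame_h else 0
    left, top, right, bottom = clamp_region(region, frame_w, frame_h)
    return [list(row[left:right]) for row in frame[top:bottom]]


# A 4-row by 6-column dummy frame whose pixel value encodes (row, column).
frame = [[r * 10 + c for c in range(6)] for r in range(4)]
roi = crop(frame, (2, 1, 5, 3))
```

Only the cropped pixels then need to be parsed, so the cost of recognition scales with the size of the area rather than with the full frame.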

More preferably, the recognition area setting unit 143 may be configured to store (S640) and use information on the face position at the time the reference recognition area is set.

Once the face position information is stored in this way, whether the face position subsequently moves is monitored (S650), and when it does, the reference recognition area is moved to a position determined as a function of that movement (S660). The reference recognition area can thus follow the user's movement naturally, which provides a user-oriented environment as well as efficient computation.
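The specification does not fix the function that relates the face movement to the new position of the area. As one simple assumption, the area may be translated by the face displacement scaled by a gain, with small displacements ignored to avoid jitter. The `gain` and `min_shift` parameters below are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (left, top, right, bottom)


def shift_region(region: Region, stored_face: Point, current_face: Point,
                 gain: float = 1.0, min_shift: float = 0.0) -> Region:
    """Translate the reference recognition area to follow the face.

    The area is left unchanged when the face has moved by no more than
    min_shift pixels from the stored position.
    """
    dx = current_face[0] - stored_face[0]
    dy = current_face[1] - stored_face[1]
    if math.hypot(dx, dy) <= min_shift:
        return region
    left, top, right, bottom = region
    return (left + gain * dx, top + gain * dy,
            right + gain * dx, bottom + gain * dy)


# Face moved 30 px right and 10 px up since the area was set.
moved = shift_region((100, 200, 300, 400), (320, 120), (350, 110))
```

A pure translation keeps the size of the area constant; an embodiment could equally rescale the area when the separation distance changes.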

When a face recognition algorithm is implemented in the present invention, the method described above may be configured to be applied preferentially to registered users.

FIG. 8 is a flowchart illustrating processing according to an embodiment of the present invention that uses sound direction information and the like.

The size of the image data of each frame varies with hardware resources, and the proportion of the image data occupied by the human object varies with the physical distance between the camera device 50 (including the case where it is installed on the device 70) and the user.

Therefore, applying an object recognition algorithm to the entire image data not only lowers computational efficiency but also makes the human object to be detected relatively small within the image data, which can degrade the accuracy and precision of object detection.

In view of this, the image data may be divided into n regions (n being a natural number of 2 or more), and object detection or recognition processing may be applied to each of the n individual unit images.

For example, if the image data of each frame is divided into four unit regions or unit images, object detection or recognition processing is applied, for instance sequentially, from the first unit image to the fourth. If the intended object is detected in unit image #3, the detection processing already performed on unit images #1 and #2 only reduces computational efficiency.

The embodiment of the present invention shown in FIG. 8 is intended to overcome this problem effectively. It uses the direction information of the sound acquired by the microphone module to perform object recognition processing first on a specific unit image among the plurality of unit images.

Specifically, when sound information is input from the microphone module 60 (S710), the sound processing unit 150 of the present invention determines whether the input sound information corresponds to a preset wake-up sound (S720) and, if it does, controls the camera device 50 to be activated (wake mode) (S730).

The wake-up sound may be registered in advance by a registered user or the like. It may be a voice-based sound consisting of a specific word, phrase, or the like, or a non-voice sound such as a predetermined number of claps.

With this configuration, the camera device 50 can normally be kept in an inactive mode (sleep mode) (S700), which also improves power efficiency.

Depending on the embodiment, when the microphone module 60 is a directional microphone or consists of a plurality of microphone units placed at different positions, direction information on the position at which the wake-up sound was generated, or on the direction of that position, can be generated (S740).

Depending on the embodiment, the direction information may of course be generated by the microphone module 60 itself, or by the sound processing unit 150 of the present invention through parsing of the sound information.

When direction information, that is, information on the position or direction from which the wake-up sound originated, is generated in this way, the object recognition unit 120 of the present invention may be configured to select, from the n unit images into which the image data is divided, the unit image corresponding to the direction information (S750) and to perform object recognition processing on that unit image first (S760).
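The selection of steps S750 and S760 may be sketched as follows, purely as an example. The sketch assumes that the frame is divided into n vertical strips, that the direction information is an azimuth angle in degrees (0 at the optical axis, positive to the right), and that the angle maps linearly onto the camera's horizontal field of view `fov_deg`. None of these details is fixed by the specification.

```python
from typing import List


def tile_for_azimuth(azimuth_deg: float, n_tiles: int, fov_deg: float = 90.0) -> int:
    """Map a sound azimuth to the index (0-based) of the matching unit image."""
    half = fov_deg / 2.0
    clamped = max(-half, min(half, azimuth_deg))
    fraction = (clamped + half) / fov_deg          # 0.0 = left edge, 1.0 = right edge
    return min(n_tiles - 1, int(fraction * n_tiles))


def processing_order(priority: int, n_tiles: int) -> List[int]:
    """Visit the priority unit image first, then its nearest neighbours."""
    return sorted(range(n_tiles), key=lambda i: (abs(i - priority), i))


# Wake-up sound heard 20 degrees to the right, frame split into 4 strips.
first_tile = tile_for_azimuth(20.0, n_tiles=4)
order = processing_order(first_tile, n_tiles=4)
```

In this example the third strip (index 2) is processed first, so the situation described earlier, in which unit images #1 and #2 are searched in vain before the object is found in #3, is avoided.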

With this configuration, as described above, the amount of computation processed per unit time can be greatly reduced, so that immediate responsiveness of object recognition and of the interfacing based on it can be implemented more effectively.

FIG. 9 is a diagram illustrating another embodiment of the present invention in which a reference recognition area for object recognition is set.

In this embodiment, an area chosen by the user at his or her convenience, based on the user's own intention, is set as the reference recognition area, that is, the area in which gestures for object recognition and interfacing control are input.

When image data is input from the camera device 50, the recognition area setting unit 143 of the present invention monitors whether hand objects (H), each containing a right-angle shape (B), are recognized at diagonally symmetric positions in the input image data.

When hand objects having the above shape are recognized, an area bounded by the extension lines of the right-angle shapes (B) is generated, and the generated area is set as the reference recognition area (A).
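If the vertex of each right-angle shape (B) is available as a pixel coordinate, the area bounded by the extension lines is the axis-aligned rectangle spanned by the two vertices. The following sketch illustrates this under that assumption; the function name and the `min_size` check that rejects degenerate (non-diagonal) placements are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (left, top, right, bottom)


def region_from_corners(corner_a: Point, corner_b: Point,
                        min_size: float = 1.0) -> Optional[Region]:
    """Build the reference recognition area (A) from two right-angle vertices.

    Returns None when the two vertices are not diagonally opposed, that is,
    when the resulting rectangle would be narrower or shorter than min_size.
    """
    left, right = sorted((corner_a[0], corner_b[0]))
    top, bottom = sorted((corner_a[1], corner_b[1]))
    if right - left < min_size or bottom - top < min_size:
        return None
    return (left, top, right, bottom)


# One hand frames the upper-left corner, the other the lower-right corner.
area = region_from_corners((100, 80), (400, 300))
```

The order in which the two hands are detected does not matter, since the coordinates are sorted before the rectangle is formed.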

Once the reference recognition area (A) is set in this way, as in the embodiment described above with reference to FIG. 7, the object recognition unit 120 of the present invention may be configured not to parse the entire image data of each frame input from the camera device 50, but to perform object recognition processing only on the specific area corresponding to the reference recognition area, or on that area first.

Furthermore, this embodiment may of course also be configured to monitor whether the stored face position information moves and, when the face position moves, to move the reference recognition area naturally to a position determined as a function of that movement.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited thereto, and various modifications and variations may of course be made by those of ordinary skill in the art to which the present invention pertains, within the technical idea of the present invention and the scope of equivalents of the claims set forth below.

In the above description of the present invention, modifiers such as 'first' and 'second' are merely instrumental terms used to distinguish components from one another, and should not be construed as indicating any particular order, priority, or the like.

The drawings attached to describe the present invention and to illustrate its embodiments may be drawn in a somewhat exaggerated form in order to emphasize or highlight the technical content of the present invention. It should nevertheless be understood as self-evident that, in view of the foregoing description and the matters shown in the drawings, various modified applications are possible at the level of a person skilled in the art.

The gesture-based object-oriented interfacing method of the present invention described above may be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all types of recording devices in which computer-readable data is stored (CD-ROM, RAM, ROM, floppy disk, magnetic disk, hard disk, magneto-optical disk, etc.), and also includes servers for wired or wireless Internet transmission.

50: camera device 60: microphone module
70: device (display apparatus) 100: interfacing apparatus of the present invention
110: input unit 120: object recognition unit
121: AI processing unit 123: vertical object recognition unit
125: point object recognition unit 127: tracking unit
129: event detection unit 130: execution control unit
140: area setting unit 141: distance calculation unit
142: estimation calculation unit 143: recognition area setting unit
150: sound processing unit

Claims

A gesture-based object-oriented interfacing method comprising:
an input step of receiving image data from a camera device;
a distance calculation step of calculating first distance information between eyes in the image data;
an estimation step of calculating separation distance information between a display apparatus and a user by using a ratio between the average distance information between the actual eyes and the first distance information;
a recognition area setting step of setting a reference recognition area using the separation distance information and the size information of the display device;
an object recognition step of recognizing an object by parsing the image data, wherein the object is recognized by parsing the image data of an area corresponding to the reference recognition area among the image data; and
an execution control step of controlling a specific command to be executed according to an event generated by the object.

The gesture-based object-oriented interfacing method of claim 1, wherein the object recognition step comprises:
a vertical object recognition step of recognizing a first vertical object having a shape extending over a first reference length in the vertical direction from the image data;
a point object recognition step of recognizing a point object located on one of the left and right sides with respect to a first vertical line corresponding to the first vertical object; and
an event detection step of detecting a first event in which the point object moves to the opposite side with respect to the first vertical line,
wherein, in the execution control step, when the first event is detected, a command mapped to the first event is controlled to be executed.

The method of claim 2, wherein the step of recognizing the vertical object comprises:
A second vertical object, which is an object located on the opposite side of the point object from the left or right side of the first vertical line, has a shape extending in the vertical direction, and has a length smaller than the first reference length, is further recognized do,
The first event of the event detection step is,
The gesture-based object-oriented interfacing method, characterized in that the positional information of the point object corresponds to a second vertical line formed by the second vertical object.

The gesture-based object-oriented interfacing method of claim 3, wherein the event detection step detects one of a first sub-event, in which the position information of the point object corresponds to an upper point of the second vertical line, and a second sub-event, in which the position information of the point object corresponds to a lower point of the second vertical line, and
wherein the execution control step controls a command mapped to the detected first sub-event or second sub-event to be executed.

The gesture-based object-oriented interfacing method of claim 4, wherein the first vertical object is an extended index finger object among the user's fingers, the point object is a thumb object, and the second vertical object is a folded middle finger object.

(Deleted)

A gesture-based object-oriented interfacing apparatus comprising:
an input unit for receiving image data from a camera device;
a distance calculator for calculating first distance information between eyes in the image data;
an estimation calculator configured to calculate separation distance information between a display device and a user by using a ratio between the average distance information between the eyes and the first distance information;
a recognition area setting unit for setting a reference recognition area using the separation distance information and the size information of the display device;
an object recognition unit for recognizing an object by parsing the image data, wherein the object is recognized by parsing the image data of an area corresponding to the reference recognition area among the image data; and
an execution control unit for controlling a specific command to be executed according to an event generated by the object.

(Deleted)

A computer-readable recording medium in which a program for performing the method according to any one of claims 1 to 5 is recorded.