KR20230126158A - Apparatus and method for interfacing using directional information of wake-up sound

Info

Publication number: KR20230126158A
Application number: KR1020220097232A
Authority: KR
Inventors: 채규열; 유성은
Original assignee: 주식회사 마인드포지
Priority date: 2022-02-22
Filing date: 2022-08-04
Publication date: 2023-08-29
Also published as: KR20230126150A; KR102437979B1

Abstract

The gesture-based object-oriented interfacing method of the present invention includes: an input step of receiving image data from a camera device; an object recognition step of parsing the image data to recognize an object; and an execution control step of controlling a specific command to be executed according to an event generated by the object. The object recognition step includes: a vertical object recognition step of recognizing, in the image data, a first vertical object having a shape that extends by at least a first reference length in the vertical direction; a point object recognition step of recognizing a point object located on either the left or the right side of a first vertical line corresponding to the first vertical object; and an event detection step of detecting a first event in which the point object moves to the opposite side of the first vertical line. When the first event is detected, the execution control step controls the command mapped to the first event to be executed.

Description

Interfacing method and apparatus using directional information of wake-up sound

The present invention relates to a gesture-based interfacing method and the like, and more specifically to a gesture-based object-oriented interfacing method and the like that can implement interfacing or interaction between a user and an electronic device more clearly and precisely by organically combining the interrelationship between the absolute movement and the relative movement of the objects to be recognized.

To meet users' growing needs and to further improve user convenience and device usability, gesture recognition technology is widely applied. This technology recognizes poses, actions, body movements, and specific shapes or movements of body parts such as hands and arms (hereinafter referred to as 'gestures') by means of image matching, artificial intelligence, deep learning, deep neural network (DNN) models, and the like, and controls interfacing with an electronic device on that basis.

In addition to the basic advantages of a non-contact method, gesture recognition technology requires no additional means for interfacing and works at a remote distance, so it is being applied broadly without being limited to any category or type of electronic device.

The prior art discloses a variety of embodiments, ranging from simple techniques that sense only the overall movement of a hand or arm and apply it to interfacing, to techniques that apply a fairly complex object recognition algorithm to image data generated by a camera device in order to recognize delicate motions or shape features.

However, the latter techniques, such as Korean Registered Patent Publication No. 10-1785650, rely on matching unique and complex gestures of a human object or apply complex recognition algorithms. Unless the shape features of the object are recognized precisely, clear gesture recognition is therefore difficult to secure; response delays, stalls, and malfunctions of the interface can occur frequently; and immediate responsiveness is difficult to achieve in the computation itself.

Of course, these problems might be partly resolved if sufficient expensive, high-resolution hardware resources (sensors, CPU, camera, and so on) were available. Because of the trade-off involved, however, that approach raises another fundamental problem: it limits the generality and accessibility of gesture recognition technology.

In addition, the prior art mainly uses unfamiliar gestures in order to improve resolution and the like. Unless users learn them deliberately and continuously, they are difficult to adapt to intuitively, so user convenience is low.

Korean Registered Patent Publication No. 10-1785650 (2017.09.29.)
Korean Registered Patent Publication No. 10-1812605 (2017.12.20.)
Korean Registered Patent Publication No. 10-2320754 (2021.10.27.)

The present invention was devised to solve the problems described above. Its object is to provide a gesture-based object-oriented interfacing method and the like that optimizes user convenience through intuitive recognition based on gestures familiar to the user, and that dramatically improves both computational efficiency and the object recognition rate by effectively incorporating dynamic changes in the relative positional relationship of individual objects into object recognition.

Other objects and advantages of the present invention can be understood from the following description and will become clearer from the embodiments of the present invention. The objects and advantages of the present invention can also be realized by the configurations recited in the claims and combinations thereof.

A gesture-based object-oriented interfacing method according to an embodiment of the present invention for achieving the above object includes: an input step of receiving image data from a camera device; an object recognition step of parsing the image data to recognize an object; and an execution control step of controlling a specific command to be executed according to an event generated by the object.

Specifically, the object recognition step may include: a vertical object recognition step of recognizing, in the image data, a first vertical object having a shape that extends by at least a first reference length in the vertical direction; a point object recognition step of recognizing a point object located on either the left or the right side of a first vertical line corresponding to the first vertical object; and an event detection step of detecting a first event in which the point object moves to the opposite side of the first vertical line. In this case, the execution control step of the present invention is configured to execute the command mapped to the first event when the first event is detected.

In addition, the vertical object recognition step of the present invention may be configured to further recognize a second vertical object, which is an object located on the side of the first vertical line (left or right) opposite the point object and which has a shape extending in the vertical direction but a length smaller than the first reference length.

In this case, the first event of the event detection step is preferably defined as the case in which the location information of the point object corresponds to a second vertical line formed by the second vertical object.

Preferably, the event detection step of the present invention may be configured to detect either a first sub-event, in which the location information of the point object corresponds to an upper point of the second vertical line, or a second sub-event, in which the location information of the point object corresponds to a lower point of the second vertical line. In this case, the execution control step controls the command mapped to the first sub-event or the second sub-event, respectively, to be executed.

Specifically, the first vertical object is preferably set to the extended index finger object among the user's fingers, the point object to the thumb object, and the second vertical object to the folded middle finger object.
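The two sub-events described above can be sketched in Python as follows. This is only an illustration under stated assumptions: image coordinates with y increasing downward, the second vertical line given as an x-coordinate plus top and bottom y-coordinates, and the "upper" and "lower" points separated at the midpoint of the line. The function name `classify_sub_event`, the `x_tolerance` parameter, and the midpoint rule are hypothetical and are not specified in the source.

```python
def classify_sub_event(point, line_x, line_top_y, line_bottom_y, x_tolerance):
    """Classify where a point object meets the second vertical line.

    Returns "first_sub_event" for the upper half of the line,
    "second_sub_event" for the lower half, or None if the point is not
    on the line. Splitting at the midpoint is an assumption.
    """
    px, py = point
    # The point must be horizontally close enough to the line.
    if abs(px - line_x) > x_tolerance:
        return None
    # The point must lie within the vertical extent of the line.
    if not (line_top_y <= py <= line_bottom_y):
        return None
    midpoint_y = (line_top_y + line_bottom_y) / 2.0
    # y grows downward, so a smaller y means a higher position.
    return "first_sub_event" if py < midpoint_y else "second_sub_event"
```

With the thumb tip as the point object and the folded middle finger as the second vertical object, touching the upper part of the finger and touching its lower part would then trigger different commands.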

Depending on the embodiment, the present invention may further include: a distance calculation step of calculating first distance information between the eyes in the image data; an estimation step of calculating separation distance information between a display device and the user by using the ratio between average actual eye-to-eye distance information and the first distance information; and a recognition area setting step of setting a reference recognition area by using the separation distance information and size information of the display device.

In this case, the object recognition step is preferably configured to use the image data of the area corresponding to the reference recognition area.
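The estimation step can be illustrated with a short Python sketch based on the pinhole camera model. The source states only that the ratio between the average actual eye distance and the eye distance measured in the image is used, so everything else here is an assumption: the camera sits at the display, its focal length in pixels is known, the average eye distance is taken as 63 mm, and the reference recognition area is sized as the display's physical extent projected at the user's distance. All names are hypothetical.

```python
AVERAGE_EYE_DISTANCE_MM = 63.0  # assumed population average, not from the source


def estimate_distance_mm(eye_distance_px, focal_length_px,
                         real_eye_distance_mm=AVERAGE_EYE_DISTANCE_MM):
    """Estimate the camera-to-user distance from the eye spacing in pixels.

    Pinhole model: size_px = focal_length_px * size_mm / distance_mm.
    """
    if eye_distance_px <= 0:
        raise ValueError("eye_distance_px must be positive")
    return focal_length_px * real_eye_distance_mm / eye_distance_px


def reference_region_size_px(display_width_mm, display_height_mm,
                             distance_mm, focal_length_px):
    """Size, in image pixels, of a display-sized rectangle at the user's distance.

    Using this as the reference recognition area is an assumption.
    """
    if distance_mm <= 0:
        raise ValueError("distance_mm must be positive")
    scale = focal_length_px / distance_mm  # pixels per millimetre at that depth
    return (display_width_mm * scale, display_height_mm * scale)
```

The farther the user stands from the display, the smaller the eye spacing in the image, the larger the estimated distance, and the smaller the region of the frame that needs to be searched for the hand.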

In addition, the present invention may further include: a camera control step of controlling the camera device to be activated when a sound detected by a microphone module corresponds to a wake-up sound; and a directional information input step in which directional information, that is, information on the location or direction from which the wake-up sound originated, is generated.

In this case, the object recognition step performs object recognition processing using n unit images obtained by dividing the image data into n regions (n being a natural number of 2 or more), and may be configured to process first the unit image, among the n unit images, that corresponds to the directional information.
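A minimal Python sketch of this prioritization follows. The source does not say how the image is divided or how the directional information is encoded, so the sketch assumes vertical strips of equal width, a sound direction given as a horizontal angle relative to the camera axis (negative to the left), a known horizontal field of view, and a linear angle-to-column approximation. Processing the remaining strips in order of proximity is also an assumption. All function names are hypothetical.

```python
def split_into_strips(width, n):
    """Return n (x_start, x_end) column ranges that cover [0, width)."""
    if n < 2:
        raise ValueError("n must be a natural number of 2 or more")
    edges = [round(i * width / n) for i in range(n + 1)]
    return [(edges[i], edges[i + 1]) for i in range(n)]


def strip_index_for_angle(angle_deg, fov_deg, n):
    """Map a sound direction to the index of the strip that contains it."""
    if fov_deg <= 0:
        raise ValueError("fov_deg must be positive")
    half = fov_deg / 2.0
    clamped = max(-half, min(half, angle_deg))  # directions outside the view snap to the edge
    fraction = (clamped + half) / fov_deg       # 0.0 = left edge, 1.0 = right edge
    return min(n - 1, int(fraction * n))


def processing_order(n, first):
    """Order strip indices so that `first` comes first, then its nearest neighbours."""
    return sorted(range(n), key=lambda i: (abs(i - first), i))
```

For example, with four strips and a wake-up sound coming from the far right, the rightmost strip would be searched first, and recognition could stop as soon as the hand is found.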

Furthermore, the present invention may further include a recognition area setting step of setting, when hand objects each containing a right-angle shape are recognized at diagonally symmetric positions, the area formed by the extension lines of the right-angle shapes as a reference recognition area. In this case, the object recognition step may be configured to use the image data of the area corresponding to the reference recognition area.

More preferably, the recognition area setting step may be configured to include: a position setting step of setting face location information at the time the reference recognition area is set; and a variable setting step of moving, when the set face location information moves, the position of the reference recognition area so as to correspond to that movement.
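The variable setting step can be sketched in Python as a simple translation of the region by the displacement of the face. The sketch assumes the face position is a single (x, y) point in image coordinates and that the region keeps its size; the `Region` type and the function name `shift_region` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Axis-aligned reference recognition area in image coordinates."""
    x: float
    y: float
    w: float
    h: float


def shift_region(region, face_at_setup, face_now):
    """Move the region by the same offset the face has moved since setup."""
    dx = face_now[0] - face_at_setup[0]
    dy = face_now[1] - face_at_setup[1]
    return Region(region.x + dx, region.y + dy, region.w, region.h)
```

Because only the stored face position and the current one are compared, the region follows the user without the hand gesture that defined it having to be repeated.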

A gesture-based object-oriented interfacing device according to another aspect of the present invention may include: an input unit that receives image data from a camera device; an object recognition unit that parses the image data to recognize an object; and an execution control unit that controls a specific command to be executed according to an event generated by the object.

Specifically, the object recognition unit may include: a vertical object recognition unit that recognizes, in the image data, a first vertical object having a shape that extends by at least a first reference length in the vertical direction; a point object recognition unit that recognizes a point object located on either the left or the right side of a first vertical line corresponding to the first vertical object; and an event detection unit that detects a first event in which the location information of the point object moves to the opposite side of the first vertical line.

In this case, when the first event is detected, the execution control unit controls the command mapped to the first event to be executed.

According to a preferred embodiment of the present invention, a gesture with a simple displacement characteristic is used, so the efficiency of computation can be further increased. In addition, a change in the relative positional relationship between the individual objects that make up the gesture can be identified exactly and incorporated into interfacing, so that clear and precise interfacing or interaction between the user and the electronic device can also be achieved.

According to the present invention, the gesture is the relative positional relationship of individual objects that move together as a whole and can also move independently. The positional reference for gesture recognition or detection therefore does not need to be changed over time, which further improves the efficiency of computation.

In addition, according to an embodiment of the present invention, shapes or movements corresponding to a UX familiar to the user are used as gestures, so user convenience through intuitive recognition can be further optimized.

Furthermore, according to another embodiment of the present invention, computation speed can be dramatically improved, and immediate responsiveness of the interface can be achieved, through preprocessing that selects or specifies the area for object recognition in advance, or through processing that does not use the entire image data as is but gives priority in object recognition to the image of the area corresponding to the user's current location or direction.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the detailed description below, serve to convey the technical idea of the present invention more effectively. The present invention should therefore not be construed as limited to the matters shown in these drawings.
FIG. 1 is a block diagram showing the detailed configuration of an interfacing device according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram showing the detailed configuration of the object recognition unit shown in FIG. 1;
FIGS. 3 and 4 are flowcharts showing the processing of embodiments of the present invention;
FIGS. 5 and 6 are diagrams explaining the detailed configuration of the vertical objects and the point object and the events generated on that basis;
FIG. 7 is a flowchart showing the processing of the present invention in which a reference recognition area for object recognition is set;
FIG. 8 is a flowchart showing the processing of the present invention in which sound direction information and the like are used;
FIG. 9 is a diagram explaining another embodiment of the present invention in which a reference recognition area for object recognition is set.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. Before that, the terms and words used in this specification and the claims should not be construed as limited to their ordinary or dictionary meanings. They should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way.

Accordingly, the embodiments described in this specification and the configurations shown in the drawings are merely the most preferred embodiments of the present invention and do not represent all of its technical ideas. It should be understood that various equivalents and modifications that could replace them may exist at the time of filing this application.

FIG. 1 is a block diagram showing the detailed configuration of an interfacing device 100 using directional information of a wake-up sound (hereinafter referred to as the 'interfacing device') according to a preferred embodiment of the present invention. FIG. 2 is a block diagram showing the detailed configuration of the object recognition unit 120 shown in FIG. 1. FIGS. 3 and 4 are flowcharts showing processing according to embodiments of the present invention, and FIGS. 5 and 6 are diagrams explaining the detailed configuration of the vertical objects and the point object and the events generated using them.

As shown in FIG. 1, the interfacing device 100 of the present invention may include an input unit 110, an object recognition unit 120, an execution control unit 130, an area setting unit 140, a sound processing unit 150, and the like.

In addition, as shown in FIG. 2, the object recognition unit 120, which is one component of the interfacing device 100, may specifically include an AI processing unit 121, a vertical object recognition unit 123, a point object recognition unit 125, a tracking unit 127, an event detection unit 129, and the like.

The interfacing method according to the present invention (hereinafter referred to as the 'interfacing method of the present invention') may be implemented as software installed and run on a specific terminal or device. It may also be implemented as hardware, such as a module or a standalone device designed to embody the technical idea of the present invention using electronic elements and components such as storage means and processing means.

Depending on the embodiment, the interfacing method (device) of the present invention may of course also be implemented embedded in the equipment or device that is the target of interfacing control, such as a TV, display device, navigation system, medical imaging device, AR or VR device, computer (monitor), screen system, kiosk, or automation system.

In this respect, each component of the interfacing device 100 shown in FIG. 1 and of the object recognition unit 120 shown in FIG. 2 should be understood as a logically distinguished component rather than a physically distinguished one.

That is, each component shown in the drawings is a logical component intended to explain the technical idea of the present invention effectively. Even if components are integrated or separated, the configuration should be construed as within the scope of the present invention as long as the function performed by the logical component can be realized, and any component that performs the same or a similar function should likewise be construed as within the scope of the present invention regardless of whether its name matches.

The interfacing device 100 or method of the present invention parses the image data input from the camera device 50, applies a matching algorithm or an algorithm such as deep learning to recognize and detect a specific object in the image data, and controls a specific command, mapped to the shape features, movement pattern, or movement characteristics of the detected object, to be executed on the device 70.

When image data is input from the camera device 50 through the input unit 110 (S300, S400; see FIG. 3 and others), the object recognition unit 120 parses the input image data and performs processing to recognize an object (S310).

Specifically, the vertical object recognition unit 123, one component of the object recognition unit 120, recognizes or detects in the image data a first vertical object, that is, an object having a shape that extends by at least a first reference length in the vertical direction (S311).

Depending on the embodiment, as illustrated in FIG. 3, the vertical object recognition unit 123 may be configured to work together with the AI processing unit 121 so that reference data, trained model data, and the like can be used effectively.

Various methods, including deep learning, may be applied to recognize objects, and techniques using a skeleton or joint points may be used for the shape, form, and movement of an object.

As described below, the first vertical object serves as the reference for object recognition and gesture recognition. Using its shape feature (a shape extending in the vertical direction), the left area (LA, see FIG. 5) and the right area (RA, see FIG. 5) can be clearly distinguished. This allows the relative position of other objects to be specified exactly, and it allows the movement of an object, judged on the basis of the relative positional relationship between the objects, to be identified more distinctly and clearly.

Of course, any human object having a shape extending in the vertical direction, such as the torso or an arm, may be used as the first vertical object (VO1, see FIG. 5 and others).

However, in order to organically reflect the correlation between the absolute and relative movement of the individual objects, and to further improve convenience based on the user's intuitive recognition and UX, the first vertical object VO1 is preferably set to the extended (unfolded) index finger object among the user's fingers.

In this respect, the first reference length may of course be set variably, and may be determined through a functional relationship involving, for example, the average physical length of an index finger and the distance between the image and the user. In addition, depending on the embodiment, when the hand object is recognized as a skeleton or the like, the first vertical object VO1 corresponding to the index finger may be detected using the shape features of the hand.
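One possible form of such a functional relationship is sketched below in Python. The source does not give a formula, so this is an assumption: a pinhole camera model with a known focal length in pixels, the distance taken as the camera-to-user distance, and a `margin` factor below 1 so that fingers somewhat shorter than average still qualify. The function name and all parameters are hypothetical, and the finger length is passed in rather than fixed because the source gives no value.

```python
def first_reference_length_px(finger_length_mm, distance_mm,
                              focal_length_px, margin=0.8):
    """Expected on-image length of an extended index finger, scaled by a margin.

    Pinhole model: length_px = focal_length_px * length_mm / distance_mm.
    """
    if distance_mm <= 0:
        raise ValueError("distance_mm must be positive")
    if not 0 < margin <= 1:
        raise ValueError("margin must be in (0, 1]")
    return margin * focal_length_px * finger_length_mm / distance_mm
```

Under this sketch the threshold shrinks as the user moves away from the camera, so the same finger is still recognized as the first vertical object at different distances.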

A detailed description is omitted, but as in general object recognition technology, the recognition or detection of the first vertical object VO1 means that corresponding location information, area data, vector data, or the like is generated (S313) and that the generated data can be tracked. The tracking unit 127 shown in FIG. 2 indicates that this function can be implemented in the object recognition unit 120.

Once the first vertical object VO1 is recognized, the point object recognition unit 125 recognizes or detects a point object PO located on either the left or the right side of the first vertical line (first vertical area) (VL1, see FIG. 5) corresponding to the first vertical object VO1 (S315).

The first vertical line may of course consist of data representing a line, but depending on the embodiment it may instead consist of data representing the area occupied by the first vertical object VO1, and it may be set to the left edge, the right edge, the center line, or the like of the area of the first vertical object VO1. The same applies to the second vertical line VL2 formed by the second vertical object VO2 described below.
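The choice between the left edge, the right edge, and the center line can be illustrated with a short Python sketch. It assumes the vertical object is available as an axis-aligned bounding box `(x_min, y_min, x_max, y_max)` in pixel coordinates; the function name `vertical_line_x` and the `mode` parameter are hypothetical and do not appear in the source.

```python
def vertical_line_x(box, mode="center"):
    """Return the x-coordinate of the vertical line derived from a bounding box.

    box  -- (x_min, y_min, x_max, y_max) of the vertical object (assumed format)
    mode -- "left", "right", or "center" (hypothetical parameter)
    """
    x_min, _, x_max, _ = box
    if x_min > x_max:
        raise ValueError("x_min must not exceed x_max")
    if mode == "left":
        return float(x_min)
    if mode == "right":
        return float(x_max)
    if mode == "center":
        return (x_min + x_max) / 2.0
    raise ValueError(f"unknown mode: {mode!r}")
```

The same helper would apply unchanged to the second vertical object, since only the bounding box differs.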

As described above, when the first vertical object VO1 is set to the index finger, the point object PO is preferably set to the thumb or to the tip of the thumb (a point or an end region).

Once the first vertical object VO1 and the point object PO are recognized, the event detection unit 129 monitors whether a movement or action in which the point object PO moves to the opposite side of the first vertical line VL1 (hereinafter referred to as the 'first event') takes place (S317).

That is, according to the present invention, whether the point object PO has moved is determined relative to the vertical area or line formed by the first vertical object VO1, which belongs to the same human object (the hand) but is a different individual object. The movement (LA→RA, see FIG. 5) can therefore be recognized more accurately and clearly.
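The relative comparison described above can be sketched in Python as follows. In each frame the x-coordinate of the point object is compared with the x-coordinate of the first vertical line measured in that same frame, so movement of the whole hand does not count as a crossing. The per-frame `(point_x, line_x)` input format, the function names, and the `dead_zone` band used to ignore jitter near the line are assumptions, not details from the source.

```python
def side_of(point_x, line_x, dead_zone=0.0):
    """Return "L" or "R" for the side of the line, or None inside the dead zone."""
    if point_x < line_x - dead_zone:
        return "L"
    if point_x > line_x + dead_zone:
        return "R"
    return None


def detect_first_event(samples, dead_zone=0.0):
    """Return the index of the first frame in which the point object is on the
    side opposite to where it started, or None if it never crosses.

    samples -- iterable of (point_x, line_x) pairs, one pair per frame
    """
    initial_side = None
    for index, (point_x, line_x) in enumerate(samples):
        side = side_of(point_x, line_x, dead_zone)
        if side is None:
            continue  # too close to the line to decide
        if initial_side is None:
            initial_side = side
        elif side != initial_side:
            return index
    return None
```

If the hand drifts to the right while the thumb stays to the left of the index finger, every frame still reports "L" and no event is raised, which matches the point that the positional reference does not have to be re-established over time.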

When the first event is detected, the execution control unit 130 of the present invention controls the device 70 so that a predetermined command mapped to the first event is executed (S320). The command mapped to the first event may of course be set in various ways according to the type of device, the type and characteristics of the content presented on the device, user registration, and so on.
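The monitoring of steps S317 and S320 can be illustrated with a minimal sketch. It assumes each frame has already been reduced to two x-coordinates, one for the point object (PO) and one for the first vertical line (VL1); the function names, the `dead_zone` tolerance, and the `COMMANDS` table are hypothetical and do not appear in the specification.

```python
from typing import Iterable, Optional, Tuple


def side_of_line(point_x: float, line_x: float, dead_zone: float = 0.0) -> int:
    """Return -1 (left), +1 (right), or 0 when the point is within dead_zone of the line."""
    if point_x < line_x - dead_zone:
        return -1
    if point_x > line_x + dead_zone:
        return 1
    return 0


def detect_first_event(frames: Iterable[Tuple[float, float]],
                       dead_zone: float = 0.0) -> Optional[int]:
    """frames yields (point_x, line_x) per frame, so the line may move with the hand.

    Returns the index of the first frame in which the point object is on the
    side opposite to where it was first seen, or None if it never crosses.
    """
    initial_side = 0
    for index, (point_x, line_x) in enumerate(frames):
        side = side_of_line(point_x, line_x, dead_zone)
        if side == 0:
            continue  # too close to the line to decide
        if initial_side == 0:
            initial_side = side  # remember the default side
        elif side != initial_side:
            return index
    return None


# Hypothetical event-to-command table; the specification leaves the mapping open.
COMMANDS = {"first_event": "click"}

frames = [(80.0, 100.0), (90.0, 101.0), (99.0, 100.0), (112.0, 100.0)]
crossed_at = detect_first_event(frames, dead_zone=3.0)
if crossed_at is not None:
    print(f"first event at frame {crossed_at}: run {COMMANDS['first_event']!r}")
```

Because the comparison is made against the line's position in the same frame, moving the whole hand does not by itself trigger the event; only movement of the point object relative to the vertical object does.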

The processing described above may of course be configured to be applied cyclically as long as no termination condition, such as device deactivation or a termination command, is satisfied (S330, S460).

To implement a more preferred embodiment, the vertical object recognition unit 123 of the present invention may additionally recognize a second vertical object (VO2) (S410).

The second vertical object (VO2) is an object located in whichever of the left and right areas of the first vertical line (VL1) does not contain the point object (PO) (see FIG. 5, etc.). It has a shape extending in the vertical direction but is shorter than the first reference length (the vertical length of the first vertical object (VO1)).

That is, in this embodiment, the point object (PO) and the second vertical object (VO2) are located by default on opposite sides of the first vertical object (VO1).

As described above, once the second vertical object (VO2) is recognized, various information and data corresponding to it may of course be generated and tracked (S420).

In the embodiment described earlier, the first event is a gesture in which the position information of the point object (PO) changes from one side to the other of the area of the first vertical line (VL1) defined by the first vertical object (VO1).

To make this change in positional relationship more distinct, in this embodiment the first event is defined as the case in which both of the following are satisfied (S430): a first condition that the point object (PO) crosses the first vertical line (VL1) so that its position changes from one side to the other, and a second condition that the position information of the point object (PO) corresponds to the second vertical line (VL2) formed by the second vertical object (VO2).

Put in terms of data processing, the first event in this embodiment is the case in which the position information of the point object (PO) lies within the area of the second vertical line (VL2) or on the outline of the second vertical line (VL2).
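The two conditions of step S430 can be sketched as a single predicate. The sketch assumes that the second vertical line (VL2) is represented as an axis-aligned rectangle in image coordinates and that the default side of the point object is already known; `Region`, `is_first_event`, and the coordinates are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Region:
    """Axis-aligned area standing in for the second vertical line (VL2)."""
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x: float, y: float) -> bool:
        # Inclusive bounds, so a point on the outline also counts.
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def is_first_event(initial_side: int, point: Tuple[float, float],
                   vl1_x: float, vl2: Region) -> bool:
    """initial_side is -1 (left of VL1) or +1 (right of VL1) in the default pose."""
    x, y = point
    current_side = -1 if x < vl1_x else 1
    crossed_vl1 = current_side != initial_side  # first condition
    on_vl2 = vl2.contains(x, y)                 # second condition
    return crossed_vl1 and on_vl2


vl2 = Region(left=120.0, top=60.0, right=150.0, bottom=140.0)
print(is_first_event(-1, (130.0, 100.0), vl1_x=100.0, vl2=vl2))  # crossed and on VL2
print(is_first_event(-1, (110.0, 100.0), vl1_x=100.0, vl2=vl2))  # crossed, not on VL2
```

Requiring both conditions means that a point object that merely drifts across VL1 without reaching VL2 is not reported as the first event.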

As described above, when the first vertical object (VO1) is an extended index finger object and the point object (PO) is a thumb object, the second vertical object (VO2) may be a folded middle finger object.

With this configuration, the hand posture resembles the one a user adopts when using a mouse, the input device most familiar to users, so ease of use through intuitive recognition can be further improved.

Viewed from the user's standpoint, therefore, the first event in this embodiment is a gesture in which the thumb (PO) moves across the area where the index finger (VO1) is located and comes into contact with the middle finger located on the opposite side from the thumb's default position. Because this gesture corresponds to the action of clicking a mouse button, an interfacing environment consistent with the user experience (UX) can be realized.

More specifically, when the position of the point object (PO) corresponds to the second vertical line (VL2) (S430), the event detection unit 129 of the present invention may be configured to detect whether the point object (PO) has moved to an upper point or a lower point of the second vertical line (VL2) (S440).

When the event detection unit 129 detects an event in which the point object (PO) crosses the first vertical object (VO1) and moves to the upper point (P1) of the second vertical object (VO2) (the first sub-event, left drawing of FIG. 6), or an event in which it moves to the lower point (P2) of the second vertical object (VO2) (the second sub-event, right drawing of FIG. 6), the execution control unit 130 of the present invention controls the device 70 so that the command mapped to the first or second sub-event is executed (S450).

Depending on the embodiment, when the position to which the point object (PO) has moved cannot be clearly classified as either the upper or the lower point of the second vertical object (VO2), guidance information prompting the user to adjust the position may be presented through an output means (a visual or auditory medium, etc.) (S445).
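Steps S440 to S450, including the guidance case of S445, can be sketched as a three-way classification. The sketch assumes image coordinates in which y grows downward and treats a band around the midpoint of VL2 as ambiguous; the `margin` value and the returned labels are hypothetical.

```python
def classify_sub_event(point_y: float, vl2_top: float, vl2_bottom: float,
                       margin: float = 0.1) -> str:
    """Classify where the point object landed along the second vertical line.

    Returns "first_sub_event" for the upper point (P1), "second_sub_event" for
    the lower point (P2), or "guide_user" when the position is too close to the
    middle to decide. margin is the half-width of the ambiguous band, expressed
    as a fraction of the VL2 height (an assumed tuning parameter).
    """
    if vl2_bottom <= vl2_top:
        raise ValueError("vl2_bottom must be greater than vl2_top")
    relative = (point_y - vl2_top) / (vl2_bottom - vl2_top)  # 0 = top, 1 = bottom
    if relative < 0.5 - margin:
        return "first_sub_event"
    if relative > 0.5 + margin:
        return "second_sub_event"
    return "guide_user"


for y in (70.0, 130.0, 100.0):
    print(y, classify_sub_event(y, vl2_top=60.0, vl2_bottom=140.0))
```

A wider `margin` makes the two sub-events harder to confuse at the cost of showing the guidance more often.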

Hereinafter, an embodiment of the present invention in which a reference recognition area for object recognition is set will be described in detail with reference to FIG. 7 and related figures.

As shown in FIG. 1, the area setting unit 140 of the present invention may specifically include a distance calculation unit 141, an estimation operation unit 142, a recognition area setting unit 143, and the like. As noted above, these components are likewise logical components defined by their functions.

When image data is input from the camera device 50 (S600), the distance calculation unit 141 applies algorithms such as feature point extraction, vector operations, and deep learning to select the objects corresponding to the eyes in the image data, and calculates first distance information between the selected eye objects (S610).

When the first distance information has been calculated using pixel positions in the image data, the estimation operation unit 142 of the present invention estimates separation distance information between the display device (device 70) and the user by using the ratio between the average physical distance between human eyes and the first distance information (S620).
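One way to realize the ratio-based estimate of S610 and S620 is the pinhole camera relation, in which distance equals focal length times real size divided by image size. The specification does not name a camera model, so the sketch below is an assumption: it supposes that the focal length in pixels is known from calibration and uses 63 mm as a commonly cited adult average for the distance between the eyes. All names are hypothetical.

```python
import math
from typing import Tuple

# Assumed population average for the distance between the eyes, in millimetres.
AVERAGE_EYE_DISTANCE_MM = 63.0


def eye_distance_px(left_eye: Tuple[float, float],
                    right_eye: Tuple[float, float]) -> float:
    """First distance information: pixel distance between the two eye centres."""
    return math.hypot(right_eye[0] - left_eye[0], right_eye[1] - left_eye[1])


def estimate_user_distance_mm(eye_px: float, focal_length_px: float,
                              real_eye_mm: float = AVERAGE_EYE_DISTANCE_MM) -> float:
    """Pinhole-model estimate: distance = focal length * real size / image size."""
    if eye_px <= 0:
        raise ValueError("eye_px must be positive")
    return focal_length_px * real_eye_mm / eye_px


pixels = eye_distance_px((300.0, 200.0), (363.0, 200.0))
print(estimate_user_distance_mm(pixels, focal_length_px=1000.0))
```

The estimate degrades when the face is turned away from the camera, because the projected eye distance shrinks without the user moving farther away.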

Once the separation distance information is estimated, the recognition area setting unit 143 of the present invention sets the reference recognition area by jointly considering the separation distance information, the size information of the display device 70, the size information of the portable object in the image data, and ergonomically derived information on the range of movement of the human object (hand object) (S630).

The reference recognition area is set in consideration of the current physical distance between the device (display device) 70 and the user, the size of the device (display device) 70, and so on, and corresponds to a virtual area in which the user's gestures are likely to be made.

Once the reference recognition area is set, the object recognition unit 120 of the present invention does not need to parse the entire image data of each frame input from the camera device 50. It can perform object recognition processing only on the specific area corresponding to the reference recognition area, or on that area first (S670), which improves computational efficiency.

More preferably, the recognition area setting unit 143 may be configured to store (S640) and use information on the face position at the time the reference recognition area is set.

With the face position information stored, the unit monitors whether the face position subsequently moves (S650) and, if it does, moves the reference recognition area to a position determined as a function of that movement (S660). The reference recognition area thus follows the user's movement naturally, providing a user-oriented environment as well as computational efficiency.
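Steps S650 and S660 can be sketched with the simplest possible function of the face movement, a pure translation. The specification only says the new position is determined functionally, so the translation, the clamping to the frame, and the names below are assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def follow_face(region: Box, stored_face: Tuple[int, int],
                current_face: Tuple[int, int], frame_size: Tuple[int, int]) -> Box:
    """Shift the reference recognition area by the face displacement.

    The shift is clamped so that the area stays inside the frame.
    """
    left, top, right, bottom = region
    width, height = frame_size
    dx = current_face[0] - stored_face[0]
    dy = current_face[1] - stored_face[1]
    dx = max(-left, min(dx, width - right))
    dy = max(-top, min(dy, height - bottom))
    return (left + dx, top + dy, right + dx, bottom + dy)


region = (100, 100, 300, 200)
print(follow_face(region, stored_face=(200, 50), current_face=(230, 60),
                  frame_size=(640, 480)))
```

After each move, an implementation would presumably store the current face position as the new reference so that later displacements are measured from it.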

When a face recognition algorithm is implemented in the present invention, the method described above may be configured to be applied preferentially to registered users.

FIG. 8 is a flowchart illustrating processing according to an embodiment of the present invention that uses sound direction information and the like.

The size of the image data of each frame varies with hardware resources, and the proportion of the image data occupied by the human object varies with the physical distance between the user and the camera device 50 (including the case where it is installed in the device 70).

Therefore, when an object recognition algorithm is applied to the entire image data, computational efficiency is low, and the human object detected in the image data becomes relatively small, which can degrade the accuracy and precision of object detection.

In view of this, the image data may be divided into n areas (n being a natural number of 2 or more), and object detection or recognition processing may be applied to each of the n individual unit images.

For example, if the image data of each frame is divided into four unit areas or unit images, object detection or recognition processing is applied from the first unit image to the fourth, for instance sequentially. If the intended object is detected in the third unit image, the detection processing performed on unit images #1 and #2 only lowers computational efficiency.

The embodiment shown in FIG. 8 is designed to overcome this problem effectively: it uses the directional information of sound acquired by the microphone module to perform object recognition processing with priority on a specific unit image among the plurality of unit images.

Specifically, when sound information is input from the microphone module 60 (S710), the sound processing unit 150 of the present invention determines whether the input sound information corresponds to a preset wake-up sound (S720) and, if it does, controls the camera device 50 to be activated (wake mode) (S730).

The wake-up sound may be registered in advance by a registered user or the like. It may be a voice-based sound consisting of a specific word or phrase, or a non-voice sound such as a set number of claps.

With this configuration, the camera device 50 can normally be kept in an inactive mode (sleep mode) (S700), so power efficiency can also be improved.

Depending on the embodiment, when the microphone module 60 is a directional microphone or consists of a plurality of microphone units provided at different positions, directional information on the position at which the wake-up sound was generated, or on the direction of that position, can be produced (S740).
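The specification does not say how the directional information of S740 is computed. One standard possibility for two microphone units is the far-field time-difference-of-arrival relation, in which the sine of the azimuth equals the speed of sound times the delay divided by the microphone spacing. The sketch below assumes that technique, a known spacing, and an already measured delay; all names are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value in air at about 20 degrees Celsius


def azimuth_from_delay(delay_s: float, mic_spacing_m: float,
                       speed_of_sound: float = SPEED_OF_SOUND_M_S) -> float:
    """Far-field azimuth estimate in degrees for a two-microphone pair.

    delay_s is the arrival time at the left microphone minus the arrival time
    at the right microphone, so a positive value means the source is to the
    right. 0 degrees is straight ahead of the microphone pair.
    """
    if mic_spacing_m <= 0:
        raise ValueError("mic_spacing_m must be positive")
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.degrees(math.asin(ratio))


delay = 0.1 * math.sin(math.radians(30.0)) / SPEED_OF_SOUND_M_S
print(round(azimuth_from_delay(delay, mic_spacing_m=0.1), 3))
```

A single pair cannot distinguish a source in front of the pair from one behind it, which is acceptable here if the user is assumed to be in front of the camera.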

Depending on the embodiment, the directional information may of course be generated by the microphone module 60 itself or by the sound processing unit 150 of the present invention through parsing of the sound information.

When the directional information, that is, information on the position or direction at which the wake-up sound was generated, has been produced, the object recognition unit 120 of the present invention may be configured to select, from among the n unit images into which the image data is divided, the unit image corresponding to the directional information (S750) and to perform object recognition processing on that unit image first (S760).

With this configuration, as described above, the amount of computation processed per unit time can be greatly reduced, so object recognition and the immediate responsiveness of the interfacing based on it can be realized more effectively.
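The selection and prioritization of S750 and S760 can be sketched as follows. The sketch assumes that the frame is split into n vertical strips numbered from left to right, that the directional information is an azimuth in degrees with 0 on the camera's optical axis, and that the angle maps linearly onto the frame width (an approximation, since a real lens maps by tangent). The function names and the 90 degree field of view are hypothetical.

```python
from typing import Callable, List, Optional


def unit_index_for_azimuth(azimuth_deg: float, n: int, fov_deg: float = 90.0) -> int:
    """Map an azimuth to the zero-based index of the strip it falls in."""
    half = fov_deg / 2.0
    clamped = max(-half, min(half, azimuth_deg))
    fraction = (clamped + half) / fov_deg  # 0.0 = left edge, 1.0 = right edge
    return min(n - 1, int(fraction * n))


def processing_order(n: int, priority: int) -> List[int]:
    """Priority strip first, then the remaining strips by increasing distance from it."""
    return sorted(range(n), key=lambda i: (abs(i - priority), i))


def find_object(order: List[int], detect: Callable[[int], bool]) -> Optional[int]:
    """Run the detector strip by strip and stop at the first hit."""
    for index in order:
        if detect(index):
            return index
    return None


calls: List[int] = []

def fake_detector(index: int) -> bool:
    """Stand-in for the real recognizer: the object is in the third strip (index 2)."""
    calls.append(index)
    return index == 2

priority = unit_index_for_azimuth(20.0, n=4)
order = processing_order(4, priority)
print(order, find_object(order, fake_detector), calls)
```

In the four-strip example from the description, sequential processing would run the detector on unit images #1 and #2 before reaching #3, whereas here the detector runs once.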

FIG. 9 is a diagram illustrating another embodiment of the present invention in which a reference recognition area for object recognition is set.

In this embodiment, an area chosen by the user at the user's convenience is set as the reference recognition area, that is, the area in which gestures for object recognition and interfacing control are input.

When image data is input from the camera device 50, the recognition area setting unit 143 of the present invention monitors whether hand objects (H) each including a right-angle shape (B) are recognized at diagonally symmetric positions in the input image data.

When hand objects having this shape are recognized, an area bounded by the extension lines of the right-angle shapes (B) is generated and set as the reference recognition area (A).
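If the vertex of each right-angle shape (B) has already been located and the two shapes are aligned with the image axes, the area bounded by their extension lines is simply the rectangle spanned by the two vertices. The sketch below rests on those assumptions; the function name and the `min_size` check are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def region_from_corners(corner_a: Point, corner_b: Point,
                        min_size: int = 1) -> Optional[Box]:
    """Rectangle spanned by the vertices of two diagonally opposed right-angle shapes.

    Returns None when the two vertices are not diagonal to each other, that is,
    when the spanned area is narrower or shorter than min_size pixels.
    """
    (ax, ay), (bx, by) = corner_a, corner_b
    left, right = sorted((ax, bx))
    top, bottom = sorted((ay, by))
    if right - left < min_size or bottom - top < min_size:
        return None
    return (left, top, right, bottom)


print(region_from_corners((400, 120), (150, 380)))
```

The order in which the two hands are passed does not matter, so the user can form the frame with either hand at the upper corner.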

Once the reference recognition area (A) is set, as in the embodiment described with reference to FIG. 7, the object recognition unit 120 of the present invention may be configured not to parse the entire image data of each frame input from the camera device 50 but to perform object recognition processing only on the specific area corresponding to the reference recognition area, or on that area first.

Furthermore, this embodiment may of course also be configured to monitor whether the stored face position information moves and, when the face position moves, to move the reference recognition area naturally to a position determined as a function of that movement.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to them, and those of ordinary skill in the art to which the present invention pertains may of course make various modifications and variations within the technical idea of the present invention and the scope of equivalents of the claims set forth below.

In the above description, modifiers such as "first" and "second" are merely instrumental terms used to distinguish components from one another and should not be construed as indicating a particular order, priority, or the like.

The drawings attached to illustrate the description of the present invention and its embodiments may be drawn in a somewhat exaggerated form to emphasize or highlight the technical content of the present invention. In view of the foregoing description and what is shown in the drawings, however, it should be understood as self-evident that various modified applications are possible at the level of a person of ordinary skill in the art.

The gesture-based object-oriented interfacing method of the present invention described above can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes all kinds of recording devices in which computer-readable data is stored (CD-ROM, RAM, ROM, floppy disk, magnetic disk, hard disk, magneto-optical disk, etc.), and also includes servers for transmission over the wired or wireless Internet.

50: camera device 60: microphone module
70: device (display device) 100: interfacing device of the present invention
110: input unit 120: object recognition unit
121: AI processing unit 123: vertical object recognition unit
125: point object recognition unit 127: tracking unit
129: event detection unit 130: execution control unit
140: area setting unit 141: distance calculation unit
142: estimation operation unit 143: recognition area setting unit
150: sound processing unit

Claims

An interfacing method comprising:
a camera control step of controlling a camera device to be activated when a sound detected by a microphone module corresponds to a wake-up sound;
a directional information input step in which directional information, which is information on the location or direction at which the wake-up sound was generated, is generated;
an input step of receiving image data from the camera device;
an object recognition step of parsing the image data and recognizing an object; and
an execution control step of controlling a specific command to be executed according to an event generated by the object,
wherein the object recognition step performs processing for object recognition using n unit images obtained by dividing the image data into n regions (n being a natural number of 2 or more), and performs the processing with priority on the unit image, among the n unit images, that corresponds to the directional information.

The method of claim 1, wherein the object recognition step comprises:
a vertical object recognition step of recognizing, in the image data, a first vertical object having a shape extending in the vertical direction by at least a first reference length;
a point object recognition step of recognizing a point object located on one of the left and right sides of a first vertical line corresponding to the first vertical object; and
an event detection step of detecting a first event in which the point object moves to the opposite side with respect to the first vertical line,
wherein the execution control step controls a command mapped to the first event to be executed when the first event is detected.

An interfacing device comprising:
a sound processing unit that controls a camera device to be activated when a sound detected by a microphone module corresponds to a wake-up sound, and generates directional information, which is information on the location or direction at which the wake-up sound was generated;
an input unit that receives image data from the camera device;
an object recognition unit that recognizes an object by parsing the image data; and
an execution control unit that controls execution of a specific command according to an event generated by the object,
wherein the object recognition unit performs processing for object recognition using n unit images obtained by dividing the image data into n regions (n being a natural number of 2 or more), and performs the processing with priority on the unit image, among the n unit images, that corresponds to the directional information.

A computer-readable recording medium on which a program for performing the method according to claim 1 or 2 is recorded.