KR102543936B1

KR102543936B1 - Apparatus and Method for Recognizing Sign Language

Info

Publication number: KR102543936B1
Application number: KR1020220087227A
Authority: KR
Inventors: 남창환
Original assignee: 남창환
Priority date: 2022-07-14
Filing date: 2022-07-14
Publication date: 2023-06-16

Abstract

The present invention relates to a sign language recognition apparatus and method. The apparatus comprises a first layer unit and a second layer unit. The first layer unit includes a user interface (UI) controller unit that supports user input, a camera control unit that captures the shape and movement direction of a signer's fingers, and a coordinate model extraction unit (Coordinate Hands ML) that recognizes the coordinates of the captured signer's fingers. The second layer unit includes a shape model extraction unit (Shape Hands ML) that receives the coordinate information recognized and stored by the coordinate model extraction unit and extracts a shape model of the fingers, and a sign language recognition unit that obtains a learned hand shape ID by continuously learning the extracted shape model. The apparatus extracts the hand shape from the positions and coordinates of the finger joints, accumulates hand shape data by continuously learning that shape, and at the same time recognizes the movement direction of the hand and maps it to the hand shape data. Because one hand or both hands can be used without additional gloves or restrictive conditions, the number of recognizable signs is dramatically increased, and because the data is continuously updated through learning, a more accurate and convenient sign language recognition apparatus and method can be provided.

Description

Sign language recognition device and method {Apparatus and Method for Recognizing Sign Language}

The present invention relates to an apparatus and method for recognizing language expressed in sign language. More specifically, it relates to a sign language recognition apparatus and method comprising a first layer unit and a second layer unit. The first layer unit includes a user interface (UI) controller unit that supports user input, a camera control unit that captures the shape and movement direction of a signer's fingers, and a coordinate model extraction unit (Coordinate Hands ML) that recognizes the coordinates of the captured signer's fingers. The second layer unit includes a shape model extraction unit (Shape Hands ML) that receives the coordinate information recognized and stored by the coordinate model extraction unit and extracts a shape model of the fingers, and a sign language recognition unit that obtains a learned hand shape ID by continuously learning the extracted shape model.

Sign language is a way of conveying meaning through gestures and hand movements for people who have difficulty expressing themselves in speech, such as people with hearing or speech impairments. It combines the shape of the fingers, their position and movement, facial expression, and lip movement. Because sign language expresses language through the fingers and body, learning to use it takes considerable effort, which makes it very difficult for the general public to learn. As a result, people with hearing or speech impairments face many difficulties when communicating with the general public. Only certain people, such as family members of people with disabilities or some social workers, learn sign language and act as interpreters. A sign language interpreter is therefore essential for such communication, yet not enough interpreters are currently being trained, so people with disabilities face many difficulties. To solve this problem, there is an urgent need for a sign language recognition apparatus that recognizes and translates the signing of people with hearing or speech impairments and presents the translation to the general public, so that both groups can communicate easily through sign language.

As one such development, Korean Patent Publication No. 10-2003-0049265 proposes a sign language glove. Various sensors are mounted on the glove, and when the signer signs while wearing it, the signs are recognized using those sensors. In this glove-based approach, sensors attached to each finger recognize the bending of the fingers and a sensor attached to the hand recognizes the spatial movement of the fingers. However, recognizing finger movement in space using only a sensor attached to the back of the hand may make practical measurement difficult because of errors in the sensor itself.

As another solution, Korean Patent Publication No. 10-2015-0044084 also proposes a sign language glove. In addition to finger bending and direction information, it recognizes how far the fingers are spread in order to raise sensing accuracy. It still requires wearing a glove, however, and merely adds one more item of recognition information, so it is not considered to solve the underlying error problem.

To remove the inconvenience of wearing a glove, Korean Registered Patent No. 10-1958201 provides an apparatus that recognizes a sign from the movement of only one hand, without a glove, and interprets the user's intent by combining the recognized signs on the basis of a specific threshold time. That invention, however, provides a limited algorithm restricted to one-hand recognition, and its purpose and configuration differ from the approach proposed here. It contains no description of the technical features of the apparatus according to the present invention, such as the shape and movement direction of the finger joints and the construction of a database through learning, so there is a clear difference from the present invention in purpose, configuration and effect.

The present invention was devised to solve the above problems. Its object is to provide a sign language recognition apparatus, and a sign language recognition method using that apparatus, comprising a coordinate model extraction unit that recognizes the shape of the finger joints through joint coordinates, a shape model extraction unit that receives the recognized joint coordinates and derives the shape of the fingers, a hand shape ID extraction unit that learns the recognized and extracted model to obtain a hand shape ID, and a hand shape ID movement path recognition unit that recognizes the movement path of the hand shape ID. A further object is to improve accuracy by learning from the mapped data and to provide a sign language dictionary by continuously updating the learned data.

To achieve the above object, a sign language recognition apparatus according to a preferred embodiment of the present invention comprises a coordinate model extraction unit that recognizes the shape of the finger joints through joint coordinates, a shape model extraction unit that receives the recognized joint coordinates and derives the shape of the fingers, a hand shape ID extraction unit that learns the recognized and extracted model to obtain a hand shape ID, and a hand shape ID movement path recognition unit that recognizes the movement path of the hand shape ID. The coordinate model extraction unit may be included in one layer as part of the overall system configuration, and the shape model extraction unit may be included in another layer as part of the overall system. The hand shape ID extraction unit is configured by mapping and learning the data of the coordinate model extraction unit and the shape model extraction unit, and may be placed in one of those layers or in a further layer. The hand shape ID movement path recognition unit may likewise be placed in one of those layers or in a further layer. The data extracted by the coordinate model extraction unit, the shape model extraction unit, the hand shape ID extraction unit and the hand shape ID movement path recognition unit is translated and presented to the user as speech, text, or another form of notification. The extracted data is also updated through learning, and the updated content is continuously stored in a database, providing a sign language dictionary in which the latest learned content is stored and used.

As described above, the sign language recognition apparatus according to the present invention extracts the hand shape from the positions and coordinates of the finger joints, accumulates hand shape data by continuously learning that shape, and at the same time recognizes the movement direction of the hand and maps it to the hand shape data. Because one hand or both hands can be used without additional gloves or restrictive conditions, the number of recognizable signs is dramatically increased, and because the data is continuously updated through learning, a more accurate and convenient sign language recognition apparatus and method can be provided.

The following drawings attached to this specification illustrate preferred embodiments of the present invention and, together with the foregoing description, serve to further the understanding of the technical idea of the invention. The present invention should therefore not be construed as limited to what is shown in the drawings.
FIG. 1 is a diagram showing the system architecture of the sign language recognition apparatus according to the present invention.
FIG. 2 is a diagram showing the steps of obtaining a learned hand shape ID according to the present invention.
FIG. 3 is a diagram showing the algorithm, performed by the sign language recognition unit, for obtaining the hand shape ID.
FIG. 4 is a diagram showing an additional algorithm performed in addition to the hand shape ID acquisition algorithm shown in FIG. 3.
FIG. 5 is a diagram showing the concept of the hand movement path recognition algorithm.
FIG. 6 is a diagram showing in detail the hand movement path recognition algorithm of FIG. 5.
FIG. 7 is a diagram showing the configuration of the shape model extraction unit (Shape Hands ML) shown in FIG. 2.
FIG. 8 is a diagram showing the learning and recognition machine language data set (Machine Language Data Set) according to the present invention.
FIG. 9 is a diagram showing an embodiment in which the sign language recognition apparatus according to the present invention is implemented in a mobile terminal such as a portable smart device.
FIG. 10 is a detailed flowchart of the method of the sign language recognition apparatus according to the present invention.
FIG. 11 is a diagram showing a series of processes for preventing loss of image data according to the present invention.

The above objects, features and advantages are described in detail below with reference to the accompanying drawings, so that a person of ordinary skill in the art to which the present invention belongs can readily practice its technical idea. In describing the present invention, detailed descriptions of known technology related to the invention are omitted where they would unnecessarily obscure its subject matter.

Terms that include ordinal numbers, such as first and second, may be used to describe various components, but the components are not limited by those terms. The terms are used only to distinguish one component from another. For example, a first component may be called a second component, and similarly a second component may be called a first component, without departing from the scope of the present invention. The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise.

The terms used in the present invention were chosen, as far as possible, from general terms in wide current use while taking their function in the invention into account, but they may vary with the intent of those working in the field, with precedent, or with the emergence of new technology. In certain cases the applicant has chosen a term arbitrarily, and its meaning is then set out in detail in the relevant part of the description. The terms used in the present invention should therefore be defined by their meaning and by the overall content of the invention, not simply by their names.

Throughout the specification, when a part is said to "include" a component, this means that it may further include other components rather than excluding them, unless specifically stated otherwise.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. The embodiments illustrated below may be modified in many different forms, and the scope of the present invention is not limited to them. They are provided to explain the present invention more completely to those of ordinary skill in the art.

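FIG. 1 shows the system architecture of the sign language recognition apparatus according to the present invention. The system architecture may be organized into a plurality of layers as convenient. As shown in the drawing, the first layer is a view layer comprising a user interface (UI) controller unit that supports user input, a camera control unit that captures the shape and movement direction of the signer's fingers, and a coordinate model extraction unit (Coordinate Hands ML) that recognizes the coordinates of the captured fingers. While the signer is signing, the view layer captures the signer's finger joint coordinates with the camera and recognizes the captured joint coordinates. User input is received through the UI controller unit, and the camera image capture operation and the series of related operations are performed accordingly. The coordinate information recognized by the coordinate model extraction unit may first be stored as a machine language coordinate model (ML Model (Coordinate Model)). The drawing also shows a second layer connected to the view layer, the core layer (Native (Core) Layer). It may comprise a shape model extraction unit (Shape Hands ML) that receives the coordinate information recognized and stored by the coordinate model extraction unit and extracts the shape model of the fingers, and a sign language recognition unit that works with the shape model extraction unit and obtains a learned hand shape ID by continuously learning the extracted shape model.

This series of steps is outlined in FIG. 2. As in FIG. 1, the joint coordinates of the fingers or hand are recognized by the coordinate model extraction unit (Coordinate Hands ML), the recognized joint coordinates are sent to the shape model extraction unit (Shape Hands ML), and the resulting shape model is continuously learned by the sign language recognition unit to obtain a learned hand shape ID. The hand shape IDs are then combined in real time to produce the final sign language word. For example, a first hand shape ID (ID_A) is obtained, a second hand shape ID is obtained, and the two are combined to obtain one sign language word.

FIG. 3 shows the algorithm by which the sign language recognition unit obtains the hand shape ID. Recognizing one hand shape requires, for example, at least 8 frames of image information, and this span can be designated as one turn (the hand shape recognition unit). The turn is used as the unit of hand shape recognition and may also be used as the unit (life) for which a hand shape is maintained. For example, as shown in the drawing, a situation in which first ID information (ID_1) persists for 5 frames and second ID information (ID_2) persists for 3 frames within the 8 frames can be written as a Sequence Hand ID List. As shown in the Frequency Map of the drawing, this case can be recognized as the sign language word 'alone (by oneself)', as already defined in the Frequency Map. Even if slight errors occur in such a series of image frames, continuous learning allows the accumulated learning information to minimize those errors and improve the recognition rate.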
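The turn-based recognition described for FIG. 3 can be sketched as follows. This is a minimal illustration rather than the implementation disclosed in the patent: the function names, the representation of the Frequency Map as a dictionary keyed by (hand shape ID, frame count) pairs, and the single entry mapping ID_1 for 5 frames plus ID_2 for 3 frames to 'alone' are assumptions made for the example.

```python
from itertools import groupby

TURN_FRAMES = 8  # one turn = 8 frames, following the example in the text

# Hypothetical Frequency Map: run-length pattern within one turn -> sign word.
FREQUENCY_MAP = {
    (("ID_1", 5), ("ID_2", 3)): "alone",
}


def sequence_hand_id_list(frame_ids):
    """Collapse per-frame hand shape IDs into (id, run length) pairs."""
    return tuple((hand_id, len(list(run))) for hand_id, run in groupby(frame_ids))


def recognize_turn(frame_ids):
    """Return the word for one turn, or None if the pattern is unknown."""
    if len(frame_ids) < TURN_FRAMES:
        return None  # not enough frames to fill one turn
    turn = frame_ids[:TURN_FRAMES]
    return FREQUENCY_MAP.get(sequence_hand_id_list(turn))


frames = ["ID_1"] * 5 + ["ID_2"] * 3
print(recognize_turn(frames))
```

An exact-match lookup is the simplest choice here. The text indicates that accumulated learning is used to tolerate small per-frame errors, which this sketch does not attempt to model.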
FIG. 4 shows an additional algorithm performed alongside the hand shape ID acquisition algorithm of FIG. 3. It uses position information and retention information about the hand shape as additional information for the hand shape ID, which raises recognition accuracy and also allows more sign language words to be generated. As shown in the drawing, a position map (Position Map) can store position information (relative coordinates) for each hand shape, denoted in detail by the rectangles Rect_1..Rect_n. A long-term map (LongTerm Map) is also provided. It can be used to decide whether to keep the hand shape information remaining in the current Sequence Hand ID List. For example, each time one turn passes, the life value that determines whether an entry is kept may be increased or decreased.

FIG. 5 illustrates the concept of the hand movement path recognition algorithm. The algorithm was designed to handle words that share the same hand shape ID but have different meanings depending on the direction of movement. It may be included in the sign language recognition unit shown in FIG. 1 or, although not shown in the drawings, may be implemented as a separate additional recognition unit at the designer's convenience. The algorithm processes the movement path of each recognized hand shape ID, and the number of frames used for this processing may be varied dynamically according to the performance of the device on which the algorithm runs. As shown in the drawing, for example, the sign for 'older brother' can be made by moving the hand shape meaning 'mountain' upward, while the original meaning 'mountain' can be expressed by holding that shape for 6 frames. The sign for 'younger brother' can be made by moving the hand shape meaning 'mountain' downward.

FIG. 6 shows the hand movement path recognition algorithm of FIG. 5 in detail. Using the rectangle coordinate information of the recognized hand shape ID (Rectangle (Rect_n) Position Map), the change in the X coordinate between successive positions on the path is compared with a threshold and converted into a Right or Left value, and the number of occurrences (weight) is counted. Likewise, the change in the Y coordinate is converted into an Up or Down value and counted. The four directional counts (weights) collected in this way, Right, Left, Up and Down, are used to determine the movement direction against the number of counts in the direction required for each sign language word. Changes at or below a certain threshold may be ignored to simplify the calculation. For example, the movement path of the sign language word 'older brother' may be assigned an Up count (weight) of 2, and that of 'younger brother' a Down count (weight) of 2.
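The four-direction counting described for FIG. 6 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function names, the threshold value of 0.05, the use of rectangle centers as path positions, and the convention that the Y coordinate grows downward (as in typical image coordinates) are all hypothetical. The required counts for 'older brother' and 'younger brother' follow the example in the text.

```python
def count_directions(centers, threshold=0.05):
    """Count Right/Left/Up/Down steps between successive rectangle centers.

    `centers` is a list of (x, y) positions. Y is assumed to grow downward,
    so a negative change in y is counted as Up.
    """
    counts = {"Right": 0, "Left": 0, "Up": 0, "Down": 0}
    for (x0, y0), (x1, y1) in zip(centers, centers[1:]):
        dx, dy = x1 - x0, y1 - y0
        # Changes at or below the threshold are ignored.
        if abs(dx) > threshold:
            counts["Right" if dx > 0 else "Left"] += 1
        if abs(dy) > threshold:
            counts["Down" if dy > 0 else "Up"] += 1
    return counts


# Hypothetical per-word requirements: (direction, minimum count).
REQUIRED = {"older brother": ("Up", 2), "younger brother": ("Down", 2)}


def classify_motion(centers, threshold=0.05):
    """Return the first word whose required direction count is met, else None."""
    counts = count_directions(centers, threshold)
    for word, (direction, needed) in REQUIRED.items():
        if counts[direction] >= needed:
            return word
    return None


moving_up = [(0.5, 0.8), (0.5, 0.6), (0.5, 0.4)]
print(classify_motion(moving_up))
```

Returning None when no requirement is met leaves the decision for a held shape (such as 'mountain' held for 6 frames) to the hand shape recognition step.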

A hand shape ID dictionary may also be built from the hand shape IDs recognized, learned and extracted through the processes of FIGS. 1 to 6. The dictionary is a type of database in which the recognized and learned hand shape IDs are continuously accumulated, so that over time the recognition rate for sign language words can improve dramatically thanks to the accumulated database. As everyday words are also accumulated over time, the number of words that can be expressed in sign language can increase dramatically. In the dictionary, each hand shape can be mapped to a unique ID, and each sign language word can be mapped to a combination of hand shape IDs according to its motion. A dictionary entry may include a plurality of hand shape IDs, a sign language word number, the sign language word, and a description of the sign, for example '26, 27/22/greeting/(both hands)', and may be changed or extended according to the designer's purpose. The dictionary may also include a sign language word recognition list. For example, assuming that 209 sign language words are currently recognizable, the words can be composed by combining the 228 hand shape IDs recognized, learned and extracted in the present invention. For example, a recognition list may be implemented by combining I/you/meet/nice to meet you. The recognition list may also be classified by category for greater efficiency, for example state/nature/season/time/age/family/emotion/greeting/person, and the categories may be changed or extended according to the designer's purpose.

FIG. 7 shows the configuration of the shape model extraction unit (Shape Hands ML) shown in FIG. 2. The shape model extraction unit may use the invention's own machine language (ML) engine and may consist of, for example, 18 layers, arranged as shown in the drawing to maximize the recognition and response performance of the ML engine. This layer configuration represents one embodiment of the present invention, and layers may be changed, added or removed according to the designer's purpose. Hyperparameters may additionally be configured to maximize recognition and response performance. These may include the number of filters (also called kernels; preferably 32 or 64), the filter size (preferably 9 x 1 (width x height), using a 1D convolution filter), the stride size (preferably 1 x 1 (width x height)), padding, and the activation function (preferably ReLU), and additional functions may be included according to the designer's purpose.

The shape model extraction unit (Shape Hands ML) may also define its data set, which may comprise absolute coordinates that express the hand shape and relative (screen position) coordinates that express the hand position. The absolute coordinates consist of X, Y and Z values for each point, for a total of 21 points, with each value in the range -1.0 to 1.0.
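The hand shape ID dictionary described earlier in this section can be sketched as a small lookup structure. This is a hypothetical illustration only: the class and function names are invented for the example, and the reading of the sample entry '26, 27/22/greeting/(both hands)' as hand shape IDs, word number, word and description, in that order, is an assumption based on the field list in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DictionaryEntry:
    hand_shape_ids: tuple  # ordered hand shape IDs that make up the sign
    word_number: int
    word: str
    description: str


def parse_entry(text):
    """Parse 'ids/word number/word/description' into a DictionaryEntry."""
    ids, number, word, description = text.split("/")
    return DictionaryEntry(
        hand_shape_ids=tuple(int(i) for i in ids.split(",")),
        word_number=int(number),
        word=word.strip(),
        description=description.strip(),
    )


def build_index(entries):
    """Index entries by their hand shape ID combination."""
    return {entry.hand_shape_ids: entry for entry in entries}


index = build_index([parse_entry("26, 27/22/greeting/(both hands)")])
print(index[(26, 27)].word)
```

Keying the index by the ID combination reflects the statement that each sign language word maps to a combination of hand shape IDs. Category labels could be added as a further field.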
또한 상기 상대(Screen Position)좌표 구성은 하나의 좌표에 대한 X, Y 2축 값으로 구성될 수 있고 이는 총 21개의 점으로 구성될 수 있으며, (0.0 ~ 1.0)의 값을 가질 수 있다. 상기 절대(Absolutely)좌표 및 상대(Screen Position)좌표를 이용하여 학습 및 인식 머신 랭귀지 데이터 세트(Maching Language Data Set)를 구성할 수 있으며, 본 발명에서는 한 손뿐 아니라 양손 모두 인식 및 학습이 가능함으로 예로써, 양손 모두를 인식한다고 가정하면 각 왼손(Left Hand)좌표값 + 각 오른손(Right Hand)좌표값의 조합으로 총 210개의 구성으로 나타낼 수 있으며 이를 수식으로 표기하면 도 8과 같이 나타낼 수 있다. 여기서 L은 왼손, R은 오른손, A는 절대값, S는 상대값을 나타낸다. 예로써, L_AX_1은 왼손 X축 1번째 절대값을 표기한 것으로 나타낼 수 있으며, 그 값의 절대적인 수치는 0.097552724를 나타냄을 알 수 있다.A hand shape ID dictionary may be constructed for hand shape IDs extracted by recognizing and learning through the processes of FIGS. 1 to 6 . This dictionary is recognized as a heterogeneous database concept and continuously accumulates and accumulates learned hand shape IDs, so that the recognition rate of sign language words can be dramatically improved over time due to the accumulated database, and all aspects of daily life Words can also accumulate over time, dramatically improving the number of expressive words in sign language. In the hand shape ID dictionary, each hand shape may be mapped to a unique ID, and each sign language word may be mapped to a combination of hand shape IDs according to motions. . The form of the dictionary may include a plurality of hand shape IDs (IDs), sign language word numbers, sign language words, sign language explanations, etc. However, changes and additions are possible according to the purpose of the designer. The hand shape ID (ID) dictionary may also include a sign language word recognition list. For example, assuming that there are 209 currently recognizable sign language words, a sign language word can be formed by combining 228 hand shape IDs recognized, learned, and extracted in the present invention. For example, I/You/Meet A sign language word recognition list can be implemented by combining /welcome. 
In addition, this recognition list can be classified by category to further enhance its efficiency, and this recognition category can be divided into, for example, state/nature/season/time/age/family/emotion/personal/person. And the classification by category can be changed and added according to the purpose of the designer. FIG. 7 is a view showing a configuration diagram of a shape model extraction unit (Shape Hands ML) shown in FIG. 2 . The shape model extraction unit (Shape Hands ML) may use its own machine language (ML) engine according to the present invention, and may be composed of, for example, 18 layers, and this configuration is the machine It can be configured as shown in the figure as a layer configuration for maximizing machine language (ML) recognition and response performance. The configuration of these layers represents an embodiment of the present invention, and may be changed, added, or reduced according to the designer's purpose. In addition, an additional configuration for maximizing the recognition and response performance of the machine language (ML: Machine Language) can be added to a hyper parameter configuration, and this configuration is referred to as a filter (kernel) number (preferably 32 or 64 filters can be used), filter size (preferably 9 x 1 (Width x Height)), 1D convolution filter can be used has), Stride size (preferably 1 x 1 (Width x Height), Padding, Activation Function (preferably RELU can be used)) According to the purpose of the designer, it may be configured with additional functions.The shape model extraction unit (Shape Hands ML) may also define its data set (Date Set), The configuration may include an absolute coordinate configuration expressing the hand shape and a screen position coordinate configuration expressing the hand position. The absolute coordinate configuration is X for one coordinate. 
, Y, Z, which can be composed of a total of 21 points, and can have a value of (-1.0 to 1.0) In addition, the screen position coordinate configuration is one coordinate It can be composed of X, Y 2-axis values for , which can consist of a total of 21 points, and can have a value of (0.0 to 1.0) The absolute coordinates and screen position coordinates It is possible to configure a learning and recognition machine language data set (Maching Language Data Set), and in the present invention, both hands as well as one hand can be recognized and learned. For example, assuming that both hands are recognized, each left hand ) coordinate value + each right hand (Right Hand) coordinate value combination can be represented by a total of 210 configurations, which can be expressed as a formula as shown in FIG. 8. Here, L is the left hand, R is the right hand, A is the absolute value, and S is the relative value. For example, L_AX_1 can be expressed as the first absolute value of the X-axis of the left hand, and it can be seen that the absolute value of the value represents 0.097552724.
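
The 210-item data set layout can be checked with a short Python sketch. Only the name L_AX_1 and the counts and value ranges come from the text; the names for the other items (such as L_SX_1 or R_AZ_21), their ordering, and the helper functions are assumptions, since FIG. 8 is not reproduced here.

```python
POINTS_PER_HAND = 21           # hand points per hand, as stated in the text
ABSOLUTE_AXES = ("X", "Y", "Z")  # absolute coordinates, range -1.0 to 1.0
SCREEN_AXES = ("X", "Y")         # relative (Screen Position) coordinates, range 0.0 to 1.0


def feature_names():
    """Build item names such as L_AX_1 (left hand, absolute, X axis, point 1).

    The naming pattern is extrapolated from the single example L_AX_1 and the
    ordering is assumed for illustration.
    """
    names = []
    for hand in ("L", "R"):
        for axis in ABSOLUTE_AXES:
            names += [f"{hand}_A{axis}_{i}" for i in range(1, POINTS_PER_HAND + 1)]
        for axis in SCREEN_AXES:
            names += [f"{hand}_S{axis}_{i}" for i in range(1, POINTS_PER_HAND + 1)]
    return names


def in_range(name, value):
    """Check a value against the range stated for its coordinate type."""
    kind = name.split("_")[1][0]  # "A" for absolute, "S" for screen position
    low, high = (-1.0, 1.0) if kind == "A" else (0.0, 1.0)
    return low <= value <= high


names = feature_names()
total = len(names)                       # (21*3 + 21*2) * 2
first_ok = in_range("L_AX_1", 0.097552724)
```

Under these assumptions each hand contributes 63 absolute and 42 screen-position values, so two hands give the 210 items mentioned in the text.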

The sign language recognition device according to the present invention as described above may be implemented in and mounted on any device, whether mobile or fixed. With high-performance portable smart devices such as tablet PCs and smartphones now widespread owing to recent industrial development and rapid advances in information and communication technology, the device is preferably mounted on a portable smart device. FIG. 9 shows an embodiment in which the system according to the present invention is implemented in a mobile terminal such as a portable smart device. As shown in the figure, sign language extracted and recognized by the plurality of layers according to the present invention may be presented as text, voice, video, and the like by a sign language translator on the mobile device, running on the mobile device's operating system, for example the Android OS (Android Operating System).

FIG. 10 is a detailed flowchart of the method of the sign language recognition device according to the present invention. The device first receives external images from a camera in real time. Each external image is a frame capturing the signing motion performed by the signer, and the received image is converted into hand shape coordinate data (Coordinate) by the coordinate model (Coordinate Hands ML) extraction unit (hereinafter referred to as C.Model). The hand shape coordinate data is transmitted to the shape model (Shape Model) extraction unit (hereinafter referred to as S.Model), which converts it into the hand ID (Hand ID) matching a hand shape learned through machine learning (Machine Learning). Data including the obtained hand ID and position information (Position) is stored in an internal storage. Once the internal storage holds the minimum data needed to determine a sign language word, sign language analysis starts, and the final sign language word is determined by identifying the movement line from the frequency of occurrence of hand IDs, the position information, and the like in the internal storage. During this process, the life count (Life Count) of the data in the internal storage decreases, and data whose life cycle (Life Cycle) has ended is deleted from the internal storage. Deleting unused, stale data in this way prevents malfunctions when determining sign language words. The final sign language word obtained through this process can be presented in various ways, such as on-screen output or voice. These steps may be carried out in the first layer and the second layer.

In this series of steps, each item of hand shape coordinate data (Coordinate) received from the camera in real time and obtained through C.Model may be transmitted to S.Model immediately, but it may instead be stored in the storage using an internal file. Because image data received from the camera can be lost while waiting for the final response of S.Model, the hand shape coordinate data extracted from the images may be stored in the internal storage in real time to prevent such loss. At the same time, data that has been input to the internal storage may be extracted through the output terminal of the internal storage and transmitted to S.Model. While waiting for the response of S.Model, hand shape coordinate data can continue to be stored in real time through the input terminal of the internal storage. This series of steps is designed to prevent the loss of image data. FIG. 11 shows the steps in which hand shape coordinate data is input to and output from the internal storage to prevent this loss.
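
The life count handling of the internal storage can be sketched in Python as follows. This is only an illustration under stated assumptions: the class and method names, the initial life of 3, the minimum of 4 entries before analysis starts, and the rule of decrementing every entry once per turn are hypothetical choices, not values given in the text.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Entry:
    hand_id: int      # hand ID returned by S.Model
    position: tuple   # (x, y) screen position of the hand
    life: int         # remaining life count


class InternalStorage:
    """Hypothetical in-memory store for hand IDs with a life count."""

    def __init__(self, initial_life=3, min_entries=4):
        self.initial_life = initial_life  # assumed starting life
        self.min_entries = min_entries    # assumed minimum data for analysis
        self.entries = []

    def add(self, hand_id, position):
        self.entries.append(Entry(hand_id, position, self.initial_life))

    def ready_for_analysis(self):
        # Analysis starts once the minimum amount of data is stored.
        return len(self.entries) >= self.min_entries

    def frequencies(self):
        # Frequency of occurrence of each hand ID currently stored.
        return Counter(entry.hand_id for entry in self.entries)

    def end_turn(self):
        # Decrease every life count, then delete entries whose life has ended.
        for entry in self.entries:
            entry.life -= 1
        self.entries = [entry for entry in self.entries if entry.life > 0]


storage = InternalStorage(initial_life=2, min_entries=3)
storage.add(26, (0.40, 0.55))
storage.add(26, (0.42, 0.50))
ready_before = storage.ready_for_analysis()
storage.add(27, (0.45, 0.45))
ready_after = storage.ready_for_analysis()
most_common_id = storage.frequencies().most_common(1)[0][0]
storage.end_turn()
remaining_after_one_turn = len(storage.entries)
storage.end_turn()
remaining_after_two_turns = len(storage.entries)
```

Expiring entries by life count means stale hand IDs drop out of the frequency tally on their own, which matches the stated goal of avoiding malfunctions caused by unused data.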

Although the above description has focused on preferred embodiments of the present invention, the technical idea of the present invention is not limited to them, and each component of the present invention may be changed or modified within the technical scope of the present invention to achieve the same purpose and effect. Likewise, although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described. Various modifications can be made by those of ordinary skill in the technical field to which the invention belongs without departing from the gist of the present invention as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present invention.

Claims

A sign language recognition device, comprising:
a user interface (UI) controller unit supporting a user's input to the view layer;
a camera control unit capable of capturing the shape and movement direction of a signer's fingers;
a first layer unit configured to include a coordinate model (Coordinate Hands ML) extraction unit capable of recognizing the coordinates of the captured fingers of the signer;
a shape model (Shape Model) extraction unit (Shape Hands ML) extracting a shape model of the fingers by receiving coordinate information recognized and stored through the coordinate model (Coordinate Hands ML) extraction unit; and
a second layer unit configured to include a sign language recognition (Sign Language Recognition) unit that, in conjunction with the shape model extraction unit (Shape Hands ML), obtains a hand shape ID learned through continuous learning of the extracted shape model,
wherein the shape model extraction unit (Shape Hands ML), in conjunction with the sign language recognition unit, stores position information (relative coordinates) for each hand shape through a position map (Position Map) and uses a long-term map (LongTerm Map) to determine whether to maintain the hand shape information remaining in the current Sequence Hand ID List, and a life that determines whether that information is maintained is increased or decreased each time one turn passes.

The sign language recognition device according to claim 1,
wherein the shape model extraction unit (Shape Hands ML) includes a hand shape recognition algorithm operating in conjunction with the sign language recognition unit; the hand shape recognition algorithm uses image information of at least 8 frames to recognize one hand shape and designates this as one turn (the hand shape recognition unit); the turn information is used as the hand shape recognition unit and also as life information indicating how long a hand shape is maintained; a situation in which, within the 8 frames, first ID information (ID_1) persists for 5 frames and second ID information (ID_2) persists for 3 frames is expressed as a Sequence Hand ID List; and sign language words are recognized by mapping against a previously prepared frequency map.

(Deleted)

The sign language recognition device according to claim 1,
wherein the shape model (Shape Model) extraction unit (Shape Hands ML), in conjunction with the sign language recognition (Sign Language Recognition) unit, uses the rectangular coordinate information (Rectangle (Rect_n) Position Map) of the recognized hand shape ID; measures the number of times (weight) by converting the change (Threshold) of the X coordinate between movement lines into a Right value and a Left value; measures the number of times (weight) by converting the change (Threshold) of the Y coordinate between movement lines into an Up value and a Down value; and determines the movement direction based on the number of directions required for each sign language word, using the measured counts (weights) in the four directions of the Right, Left, Up, and Down values.

The sign language recognition device according to claim 1,
further comprising a hand shape ID dictionary built from the hand shape IDs extracted through recognition and learning, wherein each hand shape in the hand shape ID dictionary may be mapped to a unique ID, each sign language word may be mapped to a combination of hand shape IDs according to its motion, and an entry of the dictionary may include a plurality of hand shape IDs, a sign language word number, the sign language word, and a description of the sign.

The sign language recognition device according to claim 5,
wherein the hand shape ID dictionary further includes a sign language word recognition list, the sign language word recognition list composes sign language words by combining the recognized, learned, and extracted hand shape IDs, and the recognition list can be classified and stored by category.

The sign language recognition device according to claim 1,
wherein the shape model extraction unit (Shape Hands ML) is configured using its own machine learning (ML) engine and includes 18 layers and a plurality of hyper parameters; the 18 layers consist of 7 convolution layers, 6 batch normalization layers, 2 max pooling layers, 2 dropout layers, and 1 softmax layer; and the plurality of hyper parameters include the number of filters (also called kernels), the filter size, the stride size, padding, and the activation function.

The sign language recognition device according to claim 1,
wherein the shape model extraction unit (Shape Hands ML) includes a data set; the data set includes absolute coordinates representing the shape of the hand and relative (Screen Position) coordinates representing the position of the hand; the absolute coordinates consist of X, Y, and Z values for each point, cover a total of 21 points, and take values from -1.0 to 1.0; the relative (Screen Position) coordinates consist of X and Y values for each point, cover a total of 21 points, and take values from 0.0 to 1.0; a machine learning data set for training and recognition is built from the absolute coordinates and the relative coordinates; and the combination of the left hand coordinate values and the right hand coordinate values can be represented as a total of 210 items.

A sign language recognition method, comprising:
receiving an external image in real time through a camera;
converting the received image into hand shape coordinate data (Coordinate) through a coordinate model (Coordinate Hands ML) extraction unit (hereinafter referred to as C.Model);
transmitting the converted hand shape coordinate data to a shape model (Shape Model) extraction unit (hereinafter referred to as S.Model);
converting, through the S.Model, the received coordinate data into a hand ID (Hand ID) matching a hand shape learned using machine learning, and extracting position information;
storing the converted hand ID and the position information in an internal storage;
starting sign language analysis when the minimum data needed to determine a sign language word is stored in the internal storage; and
determining a final sign language word by identifying the movement line from the frequency of occurrence of hand IDs and the position information in the internal storage,
wherein the shape model extraction unit (Shape Hands ML), in conjunction with a sign language recognition unit that obtains hand shape IDs learned through continuous learning of the extracted shape model, stores position information (relative coordinates) for each hand shape through a position map (Position Map) and uses a long-term map (LongTerm Map) to determine whether to maintain the hand shape information remaining in the current Sequence Hand ID List, and a life that determines whether that information is maintained is increased or decreased each time one turn passes.

The sign language recognition method according to claim 9, further comprising:
deleting from the internal storage, among the data in the internal storage, data whose life cycle has ended because its life count has decreased;
storing each item of hand shape coordinate data (Coordinate) obtained through the C.Model in the internal storage instead of transmitting it to the S.Model immediately, extracting the data input to the internal storage through an output terminal of the internal storage, and transmitting the extracted data to the S.Model; and
continuously storing hand shape coordinate data in real time through an input terminal of the internal storage while waiting for a response from the S.Model.