KR102100703B1

KR102100703B1 - Voice recognition kiosk with multiple input means

Info

Publication number: KR102100703B1
Application number: KR1020190152477A
Authority: KR
Inventors: 정원석
Original assignee: 정원석
Priority date: 2019-11-25
Filing date: 2019-11-25
Publication date: 2020-04-14

Abstract

The present invention relates to a voice recognition kiosk with a plurality of first input means comprising: a kiosk main body with a display of a certain size; a plurality of first input means provided on the left side or the right side of the display and receiving a voice signal of a speaker; a noise pickup unit connected to the first input means to remove the ambient noise from the speaker; and an output unit outputting information requested by the speaker to the display based on the voice signal of the speaker from which the ambient noise is removed.

Description

Voice recognition kiosk with multiple input means

The present invention relates to a voice recognition kiosk equipped with a plurality of input means and, more specifically, to a kiosk in which the input means are provided at different heights so as to increase the recognition rate of a speaker's voice.

A kiosk is an unmanned information terminal installed in public places such as government agencies, local governments, banks, department stores, and exhibition halls. It can provide dynamic traffic information, public transportation information, route guidance, fare card distribution, reservation services, telephone number and address lookup, administrative procedures, product information, and instructions for using facilities.

A kiosk is also an unmanned comprehensive information system that uses multimedia devices such as touch screens, sound, graphics, and communication cards to deliver information efficiently to users, for example through voice services and video playback.

Conventionally, however, kiosks have relied mainly on touch input. Even kiosks with voice recognition have been fitted with only a single microphone, so that, depending on where the microphone is located, voice recognition has not worked satisfactorily for people of all ages. In addition, because ambient noise can enter the microphone together with the speaker's voice, the recognition rate has been poor.

There is therefore a need to equip a kiosk with microphones suited to voice recognition for people of all ages and to remove the ambient noise that enters those microphones.

Korean Patent Laid-Open Publication No. 10-2002-0087297

The present invention is intended to solve the problems of the prior art described above. Its object is to increase the rate at which a speaker's voice is recognized at various heights by providing a plurality of input means.

To achieve this object, a voice recognition kiosk equipped with a plurality of input means according to the present invention may include: a kiosk body having a display of a predetermined size; a plurality of first input means provided on the left or right side of the display to receive a speaker's voice signal; a noise pickup unit connected to the plurality of first input means to remove ambient noise around the speaker; and an output unit that outputs the information requested by the speaker to the display based on the speaker's voice signal from which the ambient noise has been removed.

Here, the plurality of first input means are microphones, and the microphones are provided at regular intervals from the top to the bottom of the kiosk body.

In addition, one microphone is provided at each of the top, middle, and bottom of the kiosk body, the microphones are connected to a single noise pickup unit, and the noise pickup unit removes the ambient noise around the microphones.

In addition, an omnidirectional microphone array is used for the microphones.

In addition, when the plurality of first input means are arranged in a row on the left or right side of the kiosk body, they are spaced at least 5 cm apart within the vertical length of the kiosk body.

In addition, the kiosk includes a voice recognition unit that recognizes the speaker's voice and determines the information the speaker is requesting. The voice recognition unit recognizes the speaker's voice using a recurrent neural network (RNN) model, and thereby identifies and passes on the requested information.

In addition, a sensor unit is provided near the plurality of first input means. The sensor unit detects the speaker's height and guides the speaker to make the request through whichever of the first input means is closest to that height.

The guidance is given by lighting an LED provided near that first input means.

In addition, the sensor unit is at least one of an ultrasonic sensor, an infrared sensor, and an optical sensor.

The voice recognition kiosk according to the present invention has the following effects.

First, voice recognition can be tailored to people of all ages. The kiosk has a plurality of input means on the front of the kiosk body, arranged in a row at predetermined intervals from the top to the bottom of the front face, so that each input means sits at a different height. Short children can speak their request into the bottom microphone, adolescents who are taller than children can use the middle microphone, and adults who are taller than adolescents can use the top microphone. Voice recognition is thus available at a height suited to people of every age group.

Second, the user's request can be handled quickly. The user can still touch the display to enter information for ordering, purchasing, ticketing, and so on, but voice recognition also lets the user quickly bring up the desired information on the display or enter information by speaking. The user can therefore complete an order, purchase, or ticketing transaction quickly.

FIG. 1 is a perspective view of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.
FIG. 2 is a front view of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.
FIG. 3 is a block diagram of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.
FIG. 4 is a schematic view showing voice recognition through the microphone provided at the bottom of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.
FIG. 5 is a schematic view showing voice recognition through the microphone provided at the middle of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.
FIG. 6 is a schematic view showing voice recognition through the microphone provided at the top of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Some components unrelated to the gist of the invention are omitted or abridged, but an omitted component is not necessarily unnecessary to the invention, and it may be combined and used by a person of ordinary skill in the art to which the invention pertains.

FIG. 1 is a perspective view, FIG. 2 is a front view, and FIG. 3 is a block diagram of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.

As shown in FIGS. 1 to 3, a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention (hereinafter, the "voice recognition kiosk") may include a kiosk body (100), first input means (110), a second input means (120), a noise pickup unit (130), a voice recognition unit (140), a control unit (150), an output unit (160), and a database (190).

Although the description focuses on a voice recognition kiosk, the invention may also be practiced with voice recognition in other applications in place of a kiosk, such as ticket machines, vending machines, and digital information displays (DIDs).

The kiosk body (100) has a front face and a rear face of predetermined size and is roughly a rectangular parallelepiped. The components that make up the voice recognition kiosk are housed in the kiosk body (100), which serves as a case enclosing them. The kiosk body (100) may be installed standing on the floor or fixed to a wall in a department store, airport, subway station, restaurant, or similar location.

The first input means (110) receive the speaker's voice signal and may be, for example, microphones. A plurality of first input means (110) may be provided on the left or right side of the front of the kiosk body (100), arranged in a vertical row. They may be placed at the top, middle, and bottom of the front face, one at each position for a total of three. An omnidirectional microphone array may be used as the microphone for the first input means (110). The first input means (110) may be spaced at least 5 cm apart within the vertical length of the kiosk body (100).
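The layout constraint described above can be illustrated with a short sketch. The body height, the top and bottom margin, and both function names are assumptions introduced only for illustration; the source specifies only the top/middle/bottom arrangement and the minimum spacing of 5 cm.

```python
def evenly_spaced_heights(body_height_cm, count=3, margin_cm=10.0):
    """Return `count` mounting heights spread evenly between the margins.

    margin_cm is a hypothetical clearance from the bottom and top edges.
    """
    if count < 2:
        raise ValueError("need at least two microphones")
    usable = body_height_cm - 2 * margin_cm
    step = usable / (count - 1)
    return [margin_cm + i * step for i in range(count)]


def spacing_ok(heights_cm, min_gap_cm=5.0):
    """Check that adjacent microphones are at least min_gap_cm apart."""
    ordered = sorted(heights_cm)
    return all(b - a >= min_gap_cm for a, b in zip(ordered, ordered[1:]))
```

For a hypothetical 180 cm body, `evenly_spaced_heights(180.0)` places the three microphones 80 cm apart, which `spacing_ok` accepts.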

An omnidirectional microphone has the same sensitivity in every direction, so it picks up sound from all directions rather than favoring a particular one. It accepts signals from the front, rear, left, and right, and is mainly used to capture the reverberation of a recording space. Because of its physical characteristics, it also responds equally to acoustic signals from all directions without a proximity effect, so it is used for microphones placed close to the sound source, such as a lavalier microphone (also called a pin microphone). The proximity effect is the increase in low-frequency level that occurs when a source is close to the microphone. An omnidirectional microphone can be used to record a large orchestra or to capture the reverberation of a hall or studio, and it is also used when the direction of the sound source is not fixed. The voice recognition kiosk according to the present invention can therefore receive the speaker's voice signal and output the requested information even in public places where the direction of the sound source varies.

The second input means (120) is another component through which the speaker can request information from the voice recognition kiosk. The second input means (120) detects touch and forwards the detected input: when the speaker touches the kiosk screen, the input information is sent to the control unit (150).

The noise pickup unit (130) is connected to the first input means (110) and removes ambient noise. It is provided on the rear of the kiosk body (100). When the speaker delivers a voice signal to the first input means (110), the noise pickup unit treats every sound around the speaker other than the speaker's voice as noise and blocks it. Kiosks are generally installed in public places, so ambient noise is unavoidable. The noise pickup unit (130) therefore blocks the ambient noise when a speaker talks in front of the kiosk, so that only the speaker's voice signal is received.

For example, the noise pickup unit (130) may use active noise control (ANC), noise canceling, active noise reduction (ANR), or a similar method. Specifically, when the noise pickup unit (130) detects ambient noise, a circuit generates destructive interference that cancels the noise, thereby blocking it.
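The source does not say how the cancelling signal is produced. As a minimal sketch, assuming a separate reference pickup that hears mostly ambient noise, the idea can be illustrated in software with a least-mean-squares (LMS) adaptive filter. This is a standard adaptive noise cancelling technique swapped in here for illustration, not the circuit of the invention, and the function name, tap count, and step size are hypothetical.

```python
def lms_cancel(primary, reference, taps=4, mu=0.05):
    """Subtract an adaptively filtered copy of `reference` from `primary`.

    primary:   samples from the voice microphone (voice plus noise)
    reference: samples from an assumed noise-only pickup
    Returns the residual, which approximates the voice alone.
    """
    weights = [0.0] * taps
    history = [0.0] * taps           # most recent reference samples
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate         # noise estimate removed from the input
        # LMS update: move the weights so the next estimate tracks the noise
        weights = [w + mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned
```

When the primary input contains only a scaled copy of the reference noise, the residual decays toward zero as the weights converge. With voice present, the residual retains the voice because it is uncorrelated with the reference.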

The voice recognition unit (140) recognizes the speaker's voice and determines the information the speaker is requesting. When the speaker's voice is input through the first input means (110), the voice recognition unit (140) interprets the analog voice signal, identifies the input code the speaker intends, and passes it to the control unit (150). Here, the input code is the code to be input to the central processing unit (120), and it may be the same code that is sent to the control unit (150) when the corresponding item is selected through the second input means (120). In other words, the voice recognition unit (140) determines which service item the speaker's utterance refers to and forwards the result.

For example, the voice recognition unit (140) may detect the start and end points of the voice signal input through the first input means (110) and extract features from the detected signal. This step may include amplifying the input signal and filtering out noise outside the audible frequency band. The extracted feature data are then compared with the voice data codes stored in advance in the database (190) to find the most similar voice data code. Here, feature data means the "feature parameters" commonly used in speech recognition. The input code corresponding to the matched voice data code is then identified and passed to the control unit (150).
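The two steps in this paragraph, endpoint detection and matching against stored codes, can be sketched as follows. The frame-energy threshold, the Euclidean distance measure, and the function names are assumptions for illustration; the source does not specify the detection method or the similarity measure.

```python
def detect_endpoints(samples, frame_len=160, threshold=0.01):
    """Return (start, end) sample indices of the active span, or None.

    A frame counts as active when its mean energy exceeds `threshold`.
    """
    active = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy > threshold:
            active.append(start)
    if not active:
        return None
    return active[0], active[-1] + frame_len


def closest_code(features, stored_codes):
    """Return the key whose stored feature vector is nearest to `features`."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(stored_codes,
               key=lambda code: squared_distance(features, stored_codes[code]))
```

Here `stored_codes` stands in for the voice data codes held in the database (190), mapping each hypothetical code name to a feature vector.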

The voice recognition unit (140) may also perform recognition by learning from speech. In this case, a large number of utterances paired with the text they represent are fed to a learner to generate a trained model. When voice information is then input to the model, the model returns the corresponding text. This structure may use a recurrent neural network (RNN), a deep learning architecture that can handle variable-length data.

In an RNN, the output of a hidden layer is fed back as an input to the same hidden layer. This connection lets the RNN take order, or time, into account, which makes it well suited to sequence data. A sentence is a typical example of sequence data: the meaning of a word in a sentence is interpreted not from the word alone but from its relationship to the preceding words. RNNs can also be used to process other sequential or time-series data, such as genes, handwriting, voice signals, sensor readings, and stock prices.
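The feedback connection can be made concrete with a minimal forward pass of a simple (Elman-style) RNN. This is a sketch under assumptions: the tanh activation, the weight layout, and the function name are illustrative choices, and the source does not describe the network used in the voice recognition unit (140).

```python
import math


def rnn_forward(inputs, w_xh, w_hh, bias):
    """Run a simple RNN over a sequence and return every hidden state.

    inputs: list of input vectors, one per time step (any length)
    w_xh:   input-to-hidden weights, w_xh[j][i]
    w_hh:   hidden-to-hidden weights, w_hh[j][k] (the feedback connection)
    bias:   one bias per hidden unit
    """
    hidden = [0.0] * len(bias)
    states = []
    for x in inputs:
        new_hidden = []
        for j in range(len(bias)):
            total = bias[j]
            total += sum(w_xh[j][i] * x[i] for i in range(len(x)))
            # the previous hidden state feeds back into the same layer
            total += sum(w_hh[j][k] * hidden[k] for k in range(len(hidden)))
            new_hidden.append(math.tanh(total))
        hidden = new_hidden
        states.append(hidden)
    return states
```

Because each state depends on the previous one, feeding the same inputs in a different order generally yields a different final state, which is the order sensitivity the paragraph describes.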

The control unit (150) looks up, in the database (190), the response information for the selected item according to the input code received from the voice recognition unit (140), and passes it to the output unit (160). The response information may include a voice response and a screen response. The voice response is output through the loudspeaker (180), and the screen response is output on the display (170).
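The lookup performed by the control unit can be sketched with a dictionary standing in for the database (190). The input codes, the response texts, and the fallback for an unknown code are all hypothetical; the source does not define them.

```python
# Hypothetical stand-in for the database (190): input code -> response info
RESPONSES = {
    "MENU_COFFEE": {"screen": "Coffee menu", "voice": "Here is the coffee menu."},
    "TICKET_ADULT": {"screen": "Adult ticket", "voice": "One adult ticket selected."},
}


def handle_input_code(code, database=RESPONSES):
    """Return the screen and voice responses for an input code.

    The same code may arrive from voice recognition or from touch input,
    so both paths share this lookup.
    """
    entry = database.get(code)
    if entry is None:
        # assumed fallback; the source does not describe unknown codes
        return {"screen": "Please try again", "voice": "Sorry, please try again."}
    return {"screen": entry["screen"], "voice": entry["voice"]}
```

In this sketch the `screen` value would go to the display (170) and the `voice` value to the loudspeaker (180).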

The output unit (160) receives the voice response and the screen response from the control unit (150) and outputs them. The output unit (160) may include the display (170) and the loudspeaker (180).

The display (170) is provided on the front of the kiosk body (100). It has a predetermined size and shows the user menus configured in the voice recognition kiosk and the information the speaker requests by voice or touch, so that the speaker can check them visually.

The loudspeaker (180) is provided on one side of the kiosk body (100) and gives the speaker audible guidance.

The database (190) stores the preset user menus and the data for the various kinds of response information. It may also store the graphic data and voice data that correspond to the feature data.

The use of the voice recognition kiosk according to the present invention by people of various ages is described below with reference to the drawings.

FIG. 4 is a schematic view showing voice recognition through the microphone provided at the bottom of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.

As shown in FIG. 4, the voice recognition kiosk according to the present invention can be used easily by people of various ages. People's heights generally differ with age, and a kiosk is generally a tall, large device, so its display (170) is also relatively large. Because the display (170) is large, short children and adolescents may have difficulty using a kiosk.

That is, because they are short, it is difficult for them to select a menu item located above their own height, and having to ask someone else for help each time adds to the inconvenience.

In the voice recognition kiosk according to the present invention, however, a plurality of first input means (110) are arranged in a row at predetermined intervals along the vertical length of the kiosk body (100), at the bottom, middle, and top. Short children and adolescents can therefore deliver a voice signal by speaking the information or order they want into the first input means (110) at the bottom.

The voice recognition unit (140) recognizes the speaker's voice from the voice signal received by the bottom first input means (110) and determines the information the speaker is requesting. The request is passed to the control unit (150), under whose control the output unit (160) outputs the requested information through the loudspeaker (180) and the display (170). Providing a first input means (110) at the bottom of the front of the kiosk body (100) thus increases the voice recognition rate for short children and adolescents as well.

Children and adolescents who are relatively tall can request the information or order they want through the first input means (110) at the middle of the front of the kiosk body (100). This is described with reference to FIG. 5.

FIG. 5 is a schematic view showing voice recognition through the microphone provided at the middle of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.

As shown in FIG. 5, in the voice recognition kiosk according to the present invention, a plurality of first input means (110) are arranged in a row at predetermined intervals along the vertical length of the kiosk body (100), at the bottom, middle, and top. Adolescents, who are taller than children, can therefore deliver a voice signal by speaking the information or order they want into the first input means (110) at the middle.

The voice recognition unit (140) recognizes the speaker's voice from the voice signal received by the middle first input means (110) and determines the information the speaker is requesting. The request is passed to the control unit (150), under whose control the output unit (160) outputs the requested information through the loudspeaker (180) and the display (170). Providing a first input means (110) at the middle of the front of the kiosk body (100) thus increases the voice recognition rate for adolescents, who are taller than children, as well.

Likewise, tall adults can request the information or order they want through the first input means (110) at the top of the front of the kiosk body (100). This is described with reference to FIG. 6.

FIG. 6 is a schematic view showing voice recognition through the microphone provided at the top of a voice recognition kiosk equipped with a plurality of input means according to an embodiment of the present invention.

As shown in FIG. 6, in the voice recognition kiosk according to the present invention, a plurality of first input means (110) are arranged in a row at predetermined intervals along the vertical length of the kiosk body (100), at the bottom, middle, and top. Tall adults can therefore deliver a voice signal by speaking the information or order they want into the first input means (110) at the top.

The voice recognition unit (140) recognizes the speaker's voice from the voice signal received by the top first input means (110) and determines the information the speaker is requesting. The request is passed to the control unit (150), under whose control the output unit (160) outputs the requested information through the loudspeaker (180) and the display (170). Providing a first input means (110) at the top of the front of the kiosk body (100) thus increases the voice recognition rate for tall adults as well.

Meanwhile, a sensor unit may be provided around each first input means 110 to detect the height of the talker standing in front of the kiosk body 100 and, according to the detected height, to guide the talker to the first input means 110 that will receive the voice signal.

Specifically, a sensor unit (not shown) may be provided around each of the first input means 110 provided at the bottom, middle, and top of the front of the kiosk body 100. An infrared sensor, an ultrasonic sensor, an optical sensor, or the like may be used as the sensor unit.

A single type of sensor, chosen from the infrared sensor, the ultrasonic sensor, and the optical sensor, may be used at all of the bottom, middle, and top positions, or a different type may be used at each position. For example, an ultrasonic sensor may be used at the bottom, an infrared sensor at the middle, and an optical sensor at the top. In addition, at least one each of the infrared sensor, the ultrasonic sensor, and the optical sensor may be used in combination at each of the bottom, middle, and top positions.

The sensor unit detects the height of the talker standing in front of the kiosk body 100 and can guide the talker to speak into whichever of the first input means 110 provided at the bottom, middle, and top is closest to the talker's height.

First, the sensor unit emits a detection signal toward the front of the kiosk body 100. That is, the infrared sensor emits infrared rays, the ultrasonic sensor emits ultrasonic waves, and the optical sensor emits light. If the talker's height is close to the height of the first input means 110 provided at the bottom, the infrared rays, ultrasonic waves, or light are reflected off the talker's body and received back by the sensor unit provided at the bottom.

However, because the talker's height does not reach the sensor units provided at the middle and top, the infrared rays, ultrasonic waves, or light emitted by those sensor units are not reflected back. Since only the sensor unit provided at the bottom receives a reflection, the control unit 150 can determine that the talker's height is near the first input means 110 provided at the bottom. Accordingly, the LED provided near the first input means 110 at the bottom may be lit, or the speaker 180 may output a voice prompt, to guide the talker to request the desired information through the first input means 110 provided at the bottom.

Likewise, if the talker's height is close to the height of the first input means 110 provided at the middle, the infrared rays, ultrasonic waves, or light are reflected off the talker's body and received back by the sensor units provided at the bottom and middle. However, because the talker's height does not reach the sensor unit provided at the top, the signal emitted by that sensor unit is not reflected back. Since only the sensor units provided at the bottom and middle receive a reflection, the control unit 150 can determine that the talker's height is near the first input means 110 provided at the middle. Accordingly, the LED provided near the first input means 110 at the middle may be lit, or the speaker 180 may output a voice prompt, to guide the talker to request the desired information through the first input means 110 provided at the middle.

Finally, if the talker's height is close to the height of the first input means 110 provided at the top, the infrared rays, ultrasonic waves, or light are reflected off the talker's body and received back by all of the sensor units provided at the bottom, middle, and top. Since the sensor units at the bottom, middle, and top all receive a reflection, the control unit 150 can determine that the talker's height is near the first input means 110 provided at the top. Accordingly, the LED provided near the first input means 110 at the top may be lit, or the speaker 180 may output a voice prompt, to guide the talker to request the desired information through the first input means 110 provided at the top.
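The three cases described above amount to choosing the highest position whose sensor unit received a reflection. The following Python sketch illustrates that decision only; it is not the implementation disclosed in this document. The names `POSITIONS`, `select_microphone`, and `guidance_message` are hypothetical. The handling of the case where no sensor receives a reflection, and of patterns the description does not cover (for example, only the top sensor reflecting), is an assumption.

```python
from typing import Mapping, Optional

# Sensor/microphone positions ordered from lowest to highest on the kiosk body.
# (Hypothetical labels; the description only names bottom, middle, and top.)
POSITIONS = ("bottom", "middle", "top")


def select_microphone(reflected: Mapping[str, bool]) -> Optional[str]:
    """Return the position of the first input means to guide the talker to.

    `reflected` maps each position to True if the sensor unit at that
    position received its emitted signal back (a reflection off the talker).
    The highest position with a reflection is chosen, matching the three
    cases in the description. Returns None when no sensor saw a reflection
    (assumption: nobody is standing in front of the kiosk).
    """
    selected = None
    for position in POSITIONS:  # iterate from lowest to highest
        if reflected.get(position, False):
            selected = position  # keep the highest reflecting position so far
    return selected


def guidance_message(reflected: Mapping[str, bool]) -> str:
    """Describe the guidance action (LED or voice prompt) as plain text."""
    position = select_microphone(reflected)
    if position is None:
        return "no talker detected"
    return f"light the LED near the {position} microphone"


if __name__ == "__main__":
    # Bottom and middle sensors receive a reflection, the top one does not.
    print(guidance_message({"bottom": True, "middle": True, "top": False}))
```

In a real kiosk the boolean inputs would come from the infrared, ultrasonic, or optical sensor readings, and the returned position would drive the LED near the chosen microphone or a prompt through the speaker 180.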

As described in detail above, the voice recognition kiosk according to the present invention enables voice recognition tailored to people of all ages. In the voice recognition kiosk according to the present invention, microphones are provided on the front of the kiosk body 100 as a plurality of input means. The plurality of input means are arranged in a row at predetermined intervals from the top to the bottom of the front of the kiosk body 100. That is, the input means are located at different heights. Accordingly, short children can enter the information they want into the kiosk through the bottom microphone, teenagers who are taller than children can do so through the middle microphone, and adults who are taller than teenagers can do so through the top microphone. Voice recognition is therefore possible at a height suited to people of all ages.

In addition, user requests can be handled quickly. With the voice recognition kiosk according to the present invention, the user can touch the display 170 to enter the desired information, such as an order, a purchase, or ticket issuance, but can also use voice recognition to quickly show the desired information on the display 170 or to enter information. Accordingly, the order, purchase, ticket issuance, or the like that the user wants can be processed quickly.

The preferred embodiments of the present invention described above are disclosed for purposes of illustration. Those of ordinary skill in the art will be able to make various modifications, changes, and additions within the spirit and scope of the present invention, and such modifications, changes, and additions should be regarded as falling within the scope of the claims of the present invention.

100: kiosk body
110: first input means
120: second input means
130: noise pickup unit
140: speech recognition unit
150: control unit
160: output unit
170: display
180: speaker
190: database

Claims

A voice recognition kiosk equipped with a plurality of input means, comprising:
a kiosk body provided with a display of a predetermined size;
a plurality of first input means provided on a front surface of the kiosk body to receive a voice signal of a talker;
a noise pickup unit connected to the plurality of first input means to remove ambient noise around the talker; and
an output unit for outputting information requested by the talker on the display based on the voice signal of the talker from which the ambient noise has been removed,
wherein the noise pickup unit is provided as hardware on a rear surface of the kiosk body,
the rear surface of the kiosk body is the surface opposite to the front surface on which the plurality of first input means are disposed,
the plurality of first input means are microphones,
the microphones are provided at regular intervals from the top to the bottom of the kiosk body,
a sensor unit is provided around the microphones,
the sensor unit detects the height of the talker and provides guidance to request information through the microphone, among the microphones, that is closest to the height of the talker, and
the guidance is provided by lighting an LED provided near that microphone.

delete

The voice recognition kiosk equipped with a plurality of input means according to claim 1,
wherein one microphone is provided at each of the top, middle, and bottom of the kiosk body,
the microphones are connected to a single noise pickup unit, and
the noise pickup unit removes ambient noise around the microphones.

The voice recognition kiosk equipped with a plurality of input means according to claim 1,
wherein the microphones use an omnidirectional microphone array.

The voice recognition kiosk equipped with a plurality of input means according to claim 1,
wherein, when the plurality of first input means are arranged in a row on the left or right side of the kiosk body, they are spaced at intervals of 5 cm or more within the vertical length of the kiosk body.

The voice recognition kiosk equipped with a plurality of input means according to claim 1,
further comprising a voice recognition unit that recognizes the voice of the talker to determine the information requested by the talker,
wherein the voice recognition unit recognizes the voice of the talker using a Recurrent Neural Network (RNN) model, and thereby determines and transmits the information requested by the talker.

delete

The voice recognition kiosk equipped with a plurality of input means according to claim 1,
wherein the sensor unit is at least one of an ultrasonic sensor, an infrared sensor, and an optical sensor.