KR102474804B1

KR102474804B1 - Apparatus for controlling voice recognition, system having the same and method thereof

Info

Publication number: KR102474804B1
Application number: KR1020170123613A
Authority: KR
Inventors: 김준영
Original assignee: 현대자동차주식회사; 기아 주식회사
Priority date: 2017-09-25
Filing date: 2017-09-25
Publication date: 2022-12-06
Also published as: KR20190034964A

Abstract

The present invention relates to a voice recognition control device, a system including the same, and a method thereof. A voice recognition control device according to an embodiment of the present invention may include: a voice signal input unit that receives a voice signal including a wake-up keyword and a command; a voice filtering unit that recognizes the wake-up keyword located at the head of the voice signal and wakes up, among a plurality of devices, the device corresponding to that wake-up keyword; and a command filtering unit that selects a main command based on the words and verbs of the command in the voice signal.

Description

Apparatus for controlling voice recognition, system having the same and method thereof

The present invention relates to a voice recognition control device, a system including the same, and a method thereof, and more particularly, to a technology that enables accurate voice command control to be performed on a valid device among a plurality of devices.

As devices become smarter, they are increasingly equipped with a voice recognition function that can execute device functions using a user's voice signal.

To use the voice recognition function built into a device, that function must first be woken up. This is done with a fixed wake-up keyword, so when several devices equipped with the same voice recognition function are in the same place, the voice recognition function of an unintended device may be woken up.

In addition, existing voice recognition functions process the wake-up keyword and the voice command separately. The user must therefore input the wake-up keyword, wait for the device's voice recognition function to wake up, and only then input the voice command. If the user inputs the wake-up keyword and the voice command in succession, an existing voice recognition function may fail to wake up or, even if it does wake up, may misrecognize the input voice command.

Accordingly, there is a need for a technology that wakes up a device's voice recognition function more conveniently and accurately while also recognizing voice commands more accurately.

An embodiment of the present invention aims to provide a voice recognition control device, a system including the same, and a method thereof that enable vehicle control through accurate voice recognition on a valid device among a plurality of devices.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the description below.

A voice recognition control device according to an embodiment of the present invention may include: a voice signal input unit that receives a voice signal including a wake-up keyword and a command; a voice filtering unit that recognizes the wake-up keyword located at the head of the voice signal and wakes up, among a plurality of devices, the device corresponding to that wake-up keyword; and a command filtering unit that selects a main command based on the words and verbs of the command in the voice signal.

In one embodiment, the device may further include a verification unit that verifies whether to transmit a vehicle command through the woken-up device.

In one embodiment, when at least one device has been woken up, the verification unit may check whether the command of the voice signal conforms to the command grammar of each device, select the woken-up device whose grammar match rate with the command is highest, and transmit the vehicle command.

In one embodiment, the voice filtering unit may extract text from the voice signal and, when at least one wake-up keyword is present in the voice signal, perform voice signal correlation and text correlation on each such wake-up keyword and select a wake-up keyword having a high correlation.

In one embodiment, when a plurality of voice signals are input, the command filtering unit may determine the text matching rate and signal strength of each command and select the voice signal with a high text matching rate or a high signal strength.

In one embodiment, when the voice signal contains a plurality of commands, the command filtering unit may select one of the plurality of commands based on the number of verbs and the number of words in each command.

In one embodiment, the command filtering unit may select the command containing the larger number of verbs and words, with the number of verbs taking priority over the number of words.

In one embodiment, the device may further include a biometric authentication unit that performs biometric authentication when the voice signal is input.

In one embodiment, the device may further include a feedback unit that, once verification by the verification unit is complete, notifies the user who uttered the voice signal that the wake-up keyword has been recognized.

In one embodiment, when a wake-up keyword is present in the voice signal, the voice filtering unit may determine that it is a valid wake-up keyword only if it is located at the head of the voice signal.

A voice recognition control system according to an embodiment of the present invention may include: a voice recognition control device that, for a plurality of devices having different wake-up keywords, recognizes the wake-up keyword located at the head of a user's voice signal, wakes up the device corresponding to the recognized wake-up keyword, and filters the command in the voice signal; and a voice recognition server that performs voice recognition on the voice signal received from the device and provides the voice recognition result to the device.

In one embodiment, the device may include: a voice signal input unit that receives a voice signal including a wake-up keyword and a command; a voice filtering unit that recognizes the wake-up keyword located at the head of the voice signal and wakes up, among a plurality of devices, the device corresponding to that wake-up keyword; a command filtering unit that selects a main command based on the words and verbs of the command in the voice signal; and a verification unit that verifies whether to transmit a vehicle command through the woken-up device.

A voice recognition control method according to an embodiment of the present invention may include: receiving a voice signal including a wake-up keyword and a command; recognizing the wake-up keyword located at the head of the voice signal and waking up, among a plurality of devices, the device corresponding to that wake-up keyword; selecting a main command based on the words and verbs of the command in the voice signal; verifying whether to transmit a vehicle command through the woken-up device; and transmitting the vehicle command through the verified device.

In one embodiment, the verifying step may include checking whether the command of the voice signal conforms to the command grammar of each device, selecting the woken-up device whose grammar match rate with the command is highest, and transmitting the vehicle command.

In one embodiment, the step of waking up the device may include extracting text from the voice signal and, when at least one wake-up keyword is present in the voice signal, performing voice signal correlation and text correlation on each such wake-up keyword and selecting a wake-up keyword having a high correlation.

In one embodiment, the step of selecting the main command may include, when a plurality of voice signals are input, determining the text matching rate and signal strength of each command and selecting the voice signal with a high text matching rate or a high signal strength.

According to an embodiment of the present invention, vehicle control can be performed through accurate voice recognition on a valid device among a plurality of devices.

FIG. 1 is a block diagram of a voice recognition control system according to an embodiment of the present invention.
FIG. 2 is a detailed block diagram of the device of FIG. 1.
FIG. 3 is a diagram for explaining a voice filtering method according to an embodiment of the present invention.
FIG. 4 is a diagram showing indices according to the correlation of voice command signals according to an embodiment of the present invention.
FIG. 5 is a detailed block diagram of the voice recognition server of FIG. 1.
FIG. 6 is a flowchart illustrating a multi-dialogue-platform voice recognition control method according to an embodiment of the present invention.
FIG. 7 is a flowchart illustrating in detail a voice filtering method according to an embodiment of the present invention.
FIG. 8 is a flowchart illustrating in detail a command filtering method according to an embodiment of the present invention.
FIG. 9 is a flowchart illustrating in detail a verification method according to an embodiment of the present invention.
FIG. 10 is a diagram for explaining a voice recognition control method according to an embodiment of the present invention.
FIG. 11 is a block diagram of a computer system to which a voice recognition control method according to an embodiment of the present invention is applied.

Hereinafter, some embodiments of the present invention are described in detail with reference to exemplary drawings. In assigning reference numerals to the components of each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments, a detailed description of a related known configuration or function is omitted when it is judged that it would hinder understanding of the embodiment.

In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the component. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having a meaning consistent with their meaning in the context of the related art and, unless explicitly defined in this application, should not be interpreted in an idealized or excessively formal sense.

Hereinafter, embodiments of the present invention are described in detail with reference to FIGS. 1 to 11.

FIG. 1 is a block diagram of a voice recognition control system according to an embodiment of the present invention.

Referring to FIG. 1, the voice recognition control system according to an embodiment of the present invention includes a voice recognition control device 100 to which a voice signal is input and at least one device (301, …, 300n). A command recognized from the voice signal received from the voice recognition control device 100 is delivered to the device 300, and the device 300 transmits the command to a vehicle terminal 400.

For a plurality of devices having different wake-up keywords, the voice recognition control device 100 recognizes the wake-up keyword located at the head of the user's voice signal and wakes up the device corresponding to the recognized wake-up keyword. It then filters the command in the voice signal, verifies the woken-up device, and transmits the vehicle command to the vehicle terminal 400 through the verified woken-up device.

The voice recognition server 200 performs voice recognition when it receives a voice signal from the voice recognition control device 100. That is, the voice recognition server 200 converts the voice signal into a character string (text) and, for this purpose, may include an acoustic model, a language model, and the like. The acoustic model models the signal characteristics of speech. The language model models the linguistic ordering relationships of the words, syllables, and other units that make up the recognition vocabulary.

The at least one device (301, …, 30n) recognizes different wake-up keywords and may include an in-vehicle terminal, a mobile communication terminal, and the like. The vehicle terminal may include any device mounted in the vehicle that has a voice recognition function, such as an AVN (audio, video, navigation) unit, a telematics terminal, or a navigation device. The mobile communication terminal may include at least one voice-recognition-capable device such as a smartphone, notebook, smart board, tablet PC (personal computer), handheld device, handheld computer, media player, e-book device, or PDA (personal digital assistant); however, the device 300 in the present disclosure is not limited to these.

Here, the voice recognition control device 100, the at least one device (301, …, 300n), and the voice recognition server 200 may be connected over a wired and/or wireless network, and over a short-range and/or long-range wireless network.

FIG. 2 is a detailed block diagram of the voice recognition control device 100 of FIG. 1.

The voice recognition control device 100 receives a voice signal from the user, wakes up when a wake-up keyword is recognized, and, after command filtering and verification of the voice signal, transmits it to the voice recognition server 200.

To this end, the voice recognition control device 100 includes a voice signal input unit 111, a communication unit 112, a display unit 113, a storage unit 114, a voice filtering unit 115, a command filtering unit 116, a verification unit 117, a control unit 118, a biometric authentication unit 119, and a feedback unit 120.

The voice signal input unit 111 receives a voice signal through a microphone, a single input voice device, or the like. The voice signal may include a wake-up keyword and a command.

The communication unit 112 communicates with the device 300 and the voice recognition server 200.

The display unit 113 displays the voice recognition result received from the voice recognition server 200. To this end, the display unit 113 may include a liquid crystal display, a thin film transistor-liquid crystal display, an organic light-emitting diode display, a flexible display, a 3D display, or an electrophoretic display (EPD).

The storage unit 114 stores the voice signal input by the user, the voice recognition result received from the voice recognition server 200, the wake-up keyword list for each device, command grammar information, and the like. To this end, the storage unit 114 may include a storage medium of the flash memory type, hard disk type, multimedia card micro type, or card type (for example, SD or XD memory), or RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, or an optical disk.

The voice filtering unit 115 locates the wake-up keywords in the voice signal input by the user, recognizes the wake-up keyword at the very front, and wakes up the device if that keyword matches the device's own wake-up keyword.

To this end, the voice filtering unit 115 extracts characters from the voice signal and filters for wake-up keywords that appear in the wake-up keyword list. If multiple wake-up keywords are recognized, it performs voice signal correlation and text correlation on each wake-up keyword, selects the wake-up keyword having the high-correlation indices, and filters out noise.

FIG. 3 is a diagram for explaining a voice filtering method according to an embodiment of the present invention. FIG. 4 is a diagram showing indices according to the correlation of voice command signals according to an embodiment of the present invention.

Referring to FIG. 3, when the voice signal “Ok, Google. Tell my vehicle to be ready for Alexa's party. Gigagenie” is input, voice signal correlation and text correlation are performed on each of “Ok, Google”, “Alexa”, and “Gigagenie”. Referring to FIG. 4, “Ok, Google” has the highest voice correlation and text correlation values, so “Ok, Google”, having the highest correlation, may be selected as the wake-up keyword. In FIG. 4, a lower index indicates a higher correlation.

Through this correlation, the voice filtering unit 115 selects the wake-up command input at the beginning of the voice signal and does not select wake-up keywords input in the middle or at the end of the voice signal.
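The ranking of several detected wake-up keywords can be illustrated with a minimal Python sketch. Everything in it is an assumption for illustration: the text does not specify how the voice signal correlation and text correlation are computed or combined, so the sketch uses a Pearson correlation against a hypothetical stored template, a `difflib` similarity ratio, and a plain average. The index follows the FIG. 4 convention that a lower index means a higher correlation.

```python
from difflib import SequenceMatcher
from math import sqrt


def text_correlation(recognized: str, reference: str) -> float:
    """Similarity in [0, 1] between recognized text and a reference keyword."""
    return SequenceMatcher(None, recognized.lower(), reference.lower()).ratio()


def signal_correlation(samples, template) -> float:
    """Pearson correlation of two equal-length sample sequences (0.0 if flat)."""
    n = len(samples)
    mean_s = sum(samples) / n
    mean_t = sum(template) / n
    cov = sum((s - mean_s) * (t - mean_t) for s, t in zip(samples, template))
    var_s = sum((s - mean_s) ** 2 for s in samples)
    var_t = sum((t - mean_t) ** 2 for t in template)
    if var_s == 0 or var_t == 0:
        return 0.0
    return cov / sqrt(var_s * var_t)


def rank_keywords(candidates):
    """Rank candidates; index 1 is the most highly correlated keyword.

    Each candidate is (keyword, recognized_text, samples, template_samples).
    Averaging the two correlations is an assumed combination rule.
    """
    scored = []
    for keyword, recognized, samples, template in candidates:
        score = (text_correlation(recognized, keyword)
                 + signal_correlation(samples, template)) / 2
        scored.append((score, keyword))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(index, keyword, score)
            for index, (score, keyword) in enumerate(scored, start=1)]


if __name__ == "__main__":
    ranking = rank_keywords([
        ("Ok, Google", "Ok, Google", [0, 1, 2, 3], [0, 1, 2, 3]),
        ("Alexa", "Alexa", [0, 1, 2, 3], [3, 1, 2, 0]),
        ("Gigagenie", "Gigagenie", [0, 1, 2, 3], [1, 1, 1, 1]),
    ])
    for index, keyword, score in ranking:
        print(index, keyword, round(score, 3))
```

The sample values are made up so that the keyword whose signal best matches its template ranks first; a real system would work on acoustic features rather than four-element lists.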

The command filtering unit 116 performs text matching on each of the plurality of input voice signals, selects the voice signal with a high matching rate or a high signal strength (SNR), and determines whether the selected voice signal contains a single command or multiple commands.

In the case of multiple commands, the command filtering unit 116 examines the details of each command, selects the clearest command based on the number of verbs and the number of words, and filters out unclear commands. Here, the number of verbs may take priority over the number of words.
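The verb-first selection rule can be sketched in a few lines of Python. The verb lexicon `VERBS` and the whitespace tokenization are hypothetical stand-ins; the text does not say how verbs are identified, and a real system would presumably use a part-of-speech tagger.

```python
# Hypothetical verb lexicon, for illustration only.
VERBS = {"open", "close", "start", "stop", "turn", "set", "lock", "unlock"}


def count_verbs_and_words(command: str):
    """Return (verb_count, word_count) for a command string."""
    words = [word.strip(".,!?") for word in command.lower().split()]
    verb_count = sum(1 for word in words if word in VERBS)
    return verb_count, len(words)


def select_main_command(commands):
    """Pick the command with the most verbs; word count breaks ties.

    Tuples compare element by element, so the verb count takes priority
    over the word count.
    """
    return max(commands, key=count_verbs_and_words)


if __name__ == "__main__":
    print(select_main_command(["turn heater on please now", "start and stop"]))
```

In the usage line, the second command wins with two verbs and three words even though the first has more words, which is the priority ordering the paragraph describes.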

The verification unit 117 checks the number of devices related to the command, analyzes the command pattern and control path, and determines whether the control path of the command is duplicated. If it is duplicated, the verification unit selects one of the duplicate paths; if not, it checks the sentence-form grammar of the command for the specific device. That is, when at least one device has been woken up, the verification unit 117 may check whether the command of the voice signal conforms to the command grammar of each device, select the woken-up device whose grammar match rate with the command is highest, and transmit the vehicle command.
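A minimal Python sketch of choosing the device with the highest grammar match rate follows. The per-device grammars in `DEVICE_GRAMMARS`, the device names, and the token-level `difflib` ratio used as the match rate are all assumptions; the text does not define how the grammar match rate is measured.

```python
from difflib import SequenceMatcher

# Hypothetical command grammars: each device accepts commands that begin
# with one of these phrases.
DEVICE_GRAMMARS = {
    "device_a": ["tell my vehicle to", "ask my vehicle to"],
    "device_b": ["vehicle please"],
}


def grammar_match_rate(command: str, prefixes) -> float:
    """Best token-level similarity between the command head and a grammar prefix."""
    tokens = command.lower().split()
    best = 0.0
    for prefix in prefixes:
        expected = prefix.split()
        rate = SequenceMatcher(None, expected, tokens[:len(expected)]).ratio()
        best = max(best, rate)
    return best


def select_device(command: str, woken_devices, grammars=DEVICE_GRAMMARS):
    """Return (device, rate) for the woken device whose grammar fits best."""
    if not woken_devices:
        return None
    rated = [(grammar_match_rate(command, grammars[device]), device)
             for device in woken_devices]
    rate, device = max(rated, key=lambda item: item[0])
    return device, rate


if __name__ == "__main__":
    print(select_device("tell my vehicle to start the engine",
                        ["device_a", "device_b"]))
```

The sketch covers only the grammar comparison; the control-path duplication check described above is not modeled.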

The control unit 118 controls the overall operation of each component.

The biometric authentication unit 119 may perform biometric authentication when a voice signal is input.

When verification by the verification unit 117 is complete, the feedback unit 120 notifies the user who uttered the voice signal that the wake-up keyword has been recognized.

FIG. 5 is a detailed block diagram of the voice recognition server of FIG. 1.

The voice recognition server 200 includes a communication unit 210, a storage unit 220, a voice recognition unit 230, and a control unit 240.

The communication unit 210 communicates with the voice recognition control device 100.

The storage unit 220 stores data for voice recognition.

The voice recognition unit 230 performs voice recognition on the voice signal received from the voice recognition control device 100.

The control unit 240 controls the overall operation of each component.

The present invention illustrates the case in which the voice recognition control device, comprising the voice signal filtering, the command filtering unit, and the verification unit, is included in the voice recognition control device 100; however, the voice recognition control device may also be implemented as a terminal separate from the mobile communication terminal or the vehicle terminal.

Hereinafter, a voice recognition control method according to an embodiment of the present invention is described with reference to FIG. 6. FIG. 6 is a flowchart illustrating a voice recognition control method according to an embodiment of the present invention.

Referring to FIG. 6, the voice recognition control device 100 performs voice filtering on the input voice signal (S100). That is, the voice recognition control device 100 receives a voice signal including a wake-up keyword and a command, recognizes the wake-up keyword located at the head of the voice signal, and wakes up, among a plurality of devices, the device corresponding to that wake-up keyword.

Next, the voice recognition control device 100 performs command filtering (S200). That is, the voice recognition control device 100 selects a main command based on the words and verbs of the command in the voice signal.

The voice recognition control device 100 then performs verification (S300). That is, the voice recognition control device 100 verifies whether to transmit a vehicle command through the woken-up device.

The voice recognition control device 100 then transmits the command to the vehicle through the verified device (S400).
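The four steps S100 to S400 can be laid out as a short Python sketch. The function name `control_voice_command`, its parameters, and the simple string handling are assumptions for illustration; the filtering, verification, and transmission stages are passed in as callables so that the sketch shows only the order of the steps.

```python
from typing import Callable, Dict, List, Optional, Tuple


def control_voice_command(
    utterance: str,
    keywords: Dict[str, str],
    select_command: Callable[[str], str],
    verify: Callable[[str, List[str]], Optional[str]],
    transmit: Callable[[str, str], None],
) -> Optional[Tuple[str, str]]:
    """Run S100 to S400 on already-transcribed text; None if nothing is sent."""
    text = utterance.lower()
    # S100: voice filtering. Wake only devices whose keyword leads the utterance.
    woken = [device for device, keyword in keywords.items()
             if text.startswith(keyword.lower())]
    if not woken:
        return None
    remainder = utterance[len(keywords[woken[0]]):].strip(" ,.")
    # S200: command filtering. Reduce the remainder to the main command.
    command = select_command(remainder)
    # S300: verification. Decide which woken device, if any, may forward it.
    device = verify(command, woken)
    if device is None:
        return None
    # S400: transmit the vehicle command through the verified device.
    transmit(device, command)
    return device, command


if __name__ == "__main__":
    sent = []
    result = control_voice_command(
        "Alexa, open the window",
        {"device_a": "ok google", "device_b": "alexa"},
        select_command=lambda text: text,
        verify=lambda command, woken: woken[0],
        transmit=lambda device, command: sent.append((device, command)),
    )
    print(result, sent)
```

Passing the stages as callables keeps the sketch independent of any particular filtering or verification rule.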

Hereinafter, a voice filtering method according to an embodiment of the present invention is described in detail with reference to FIG. 7. FIG. 7 is a flowchart illustrating in detail a voice filtering method according to an embodiment of the present invention.

The voice recognition control device 100 extracts characters from the input voice signal (S101) and extracts wake-up keywords from the extracted characters (S102). A wake-up keyword may appear at the head, in the middle, or at the end of the voice signal, but in the present invention only a wake-up keyword located at the head is accepted as a wake-up keyword. For example, if a user says “밥시리야호” to a device whose wake-up keyword is “시리야”, the utterance contains “시리야”, but because the keyword sits in the middle of the utterance it is not accepted as a wake-up keyword.
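The head-position rule can be sketched in Python as follows. The keyword table `WAKEUP_KEYWORDS`, the device names, and the normalization are hypothetical; the sketch also assumes the utterance has already been converted to text and that a keyword must be followed by a space or the end of the text.

```python
from typing import Optional

# Hypothetical per-device wake-up keyword list (device id -> keyword).
WAKEUP_KEYWORDS = {
    "device_a": "ok google",
    "device_b": "alexa",
    "device_c": "시리야",
}


def normalize(text: str) -> str:
    """Lower-case and replace punctuation so 'Ok, Google.' matches 'ok google'."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " "
                   for ch in text.lower())
    return " ".join(kept.split())


def find_head_keyword(utterance: str, keywords=WAKEUP_KEYWORDS) -> Optional[str]:
    """Return the device whose keyword starts the utterance, else None."""
    text = normalize(utterance)
    for device, keyword in keywords.items():
        head = normalize(keyword)
        if text == head or text.startswith(head + " "):
            return device
    return None


if __name__ == "__main__":
    print(find_head_keyword("Ok, Google. Tell my vehicle to be ready."))
    print(find_head_keyword("밥시리야호"))
```

With this rule, “밥시리야호” wakes no device, because “시리야” appears only in the middle of the utterance.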

Meanwhile, the voice recognition control device 100 determines whether multiple wakeup keywords have been recognized (S103). If only a single wakeup keyword has been recognized, the device performs noise filtering on that wakeup keyword and outputs the result (S105).

If multiple wakeup keywords have been recognized, voice signal correlation and text correlation are performed on each wakeup keyword (S105), and the wakeup command having a high correlation index is selected (S106).

Noise filtering is then performed on the selected wakeup command (S105).

Hereinafter, a command filtering method according to an embodiment of the present invention will be described in detail with reference to FIG. 8. FIG. 8 is a flowchart illustrating the command filtering method according to an embodiment of the present invention in detail. FIG. 8 describes how the command in the voice signal, excluding the wakeup keyword, is filtered.

The voice recognition control device 100 waits for multiple command inputs from the devices (S201) and determines whether the text matching rate of the multiple commands is 90% or higher (S202). The text matching rate may be calculated by comparing each command with the text stored in the command DB.

If the text matching rate is below 90%, the voice recognition control device 100 selects the command with the higher signal strength; if the text matching rate is 90% or higher, it selects the command with the higher matching rate (S203).
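The selection in steps S202 to S203 can be sketched as follows. The patent does not specify how the text matching rate is computed, so this sketch assumes a character-level similarity from Python's `difflib.SequenceMatcher`; the `Candidate` type, the `signal_strength` field, and the reading that the 90% test applies to the best-matching candidate are likewise assumptions.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List

MATCH_THRESHOLD = 0.90  # the 90% threshold of step S202


@dataclass
class Candidate:
    text: str               # command text received from one device
    signal_strength: float  # hypothetical signal-strength measure


def matching_rate(text: str, command_db: List[str]) -> float:
    """Best similarity between the text and any entry of the command DB (assumed metric)."""
    return max(
        (SequenceMatcher(None, text.lower(), ref.lower()).ratio() for ref in command_db),
        default=0.0,
    )


def select_candidate(candidates: List[Candidate], command_db: List[str]) -> Candidate:
    """Pick by matching rate when it reaches 90%, otherwise by signal strength."""
    rated = [(matching_rate(c.text, command_db), c) for c in candidates]
    best_rate, best = max(rated, key=lambda rc: rc[0])
    if best_rate >= MATCH_THRESHOLD:
        return best
    return max(candidates, key=lambda c: c.signal_strength)
```

The function expects at least one candidate; an empty list would need separate handling in a real system.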

Thereafter, the voice recognition control device 100 determines whether the command is a single command or a multi-command (S205). If it is a single command, the device passes the single command to the verification unit 117 (S206). Here, a single command contains one command, and a multi-command contains a plurality of commands. For example, “창문 열고 라디오 틀어” ("Open the window and turn on the radio") is a multi-command.

In the case of a multi-command, the voice recognition control device 100 determines whether the command is a direct command (S207). If it is a direct command, the device filters out unclear commands (S208) and then passes a single command to the verification unit 117 (S206).

If the command is not a direct command, a command is selected based on the number of verbs and the number of words in the command (S209). For example, “차량 시동을 걸어” ("Start the car") is a direct command, and “곧 떠날 준비를 해” ("Get ready to leave soon") is an indirect command. In the Korean originals, “차량 시동을 걸어” has one verb and three words, and “곧 떠날 준비를 해” has one verb and four words. In this case, the command with the larger number of verbs or words may be selected.

Thereafter, filtering is performed on the selected command to remove unclear commands (S208), and a single command is passed to the verification unit 117 (S206).

Hereinafter, a verification method according to an embodiment of the present invention will be described in detail with reference to FIG. 9. FIG. 9 is a flowchart illustrating the verification method according to an embodiment of the present invention in detail.

The voice recognition control device 100 checks the number of devices woken up for voice recognition (S301), analyzes the command pattern and the control paths (S302), and determines whether redundant control paths exist (S303). Here, a control path is a path over which a woken-up device transmits a vehicle command to the vehicle terminal, and the determination is whether two or more such control paths exist.

If no redundant control path exists, the voice recognition control device 100 selects the existing path as the optimal control path (S304). If redundant control paths exist, it checks the grammar of the device on each redundant control path (S305). A device is then selected according to its grammar matching rate, and the vehicle command is transmitted to the vehicle terminal through that device.

For example, suppose device 1 and device 2 have been woken up. If device 1 has a grammar match success rate of 90% and a grammar match failure rate of 10%, and device 2 has a grammar match success rate of 89% and a grammar match failure rate of 5%, device 2 may be selected. Here, each device has its own command grammar format, and the check determines whether the grammar of the input command matches that format.
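The example above does not state the exact scoring rule. One reading consistent with it, and with the later FIG. 10 walkthrough, which selects the device whose failure rate is lower relative to its success rate, is to rank devices by the ratio of failure rate to success rate. The sketch below assumes that ratio; the type and function names are illustrative only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DeviceGrammarStats:
    name: str
    success_rate: float  # grammar match success rate, e.g. 0.90
    failure_rate: float  # grammar match failure rate, e.g. 0.10


def failure_to_success_ratio(d: DeviceGrammarStats) -> float:
    """Assumed score: lower means fewer failures relative to successes."""
    if d.success_rate <= 0:
        return float("inf")
    return d.failure_rate / d.success_rate


def select_device(devices: List[DeviceGrammarStats]) -> DeviceGrammarStats:
    """Pick the woken-up device with the lowest failure-to-success ratio."""
    return min(devices, key=failure_to_success_ratio)


devices = [
    DeviceGrammarStats("device 1", 0.90, 0.10),
    DeviceGrammarStats("device 2", 0.89, 0.05),
]
print(select_device(devices).name)  # prints "device 2"
```

With the figures from the example, device 1 scores about 0.111 and device 2 about 0.056, so device 2 is chosen even though its success rate is one point lower.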

Hereinafter, the voice recognition control method will be described by way of example with reference to FIG. 10. Referring to FIG. 10, when the user utters “Ok, Google. Tell my vehicle to be ready for the destination of Alexa’s party.”, the voice filtering unit 115 extracts “Ok, google” and “Alexa”, both of which are in the wakeup keyword list. It recognizes “Ok, google”, which is at the head of the voice signal, as the wakeup keyword and does not recognize “Alexa” as a wakeup keyword. Assuming that device 1 and device 2 both use “Ok, google” as their wakeup keyword, device 1 and device 2 are woken up.

Thereafter, the command filtering unit 116 filters the command that follows the wakeup keyword, “Tell my vehicle to be ready for the destination of Alexa’s party.” Because this is a multi-command, it is split into two commands, “Tell my vehicle to be ready” and “for the destination of Alexa’s party”. The command filtering unit 116 determines that “Tell my vehicle to be ready” consists of one verb and five words and that “for the destination of Alexa’s party” consists of zero verbs and six words.

Accordingly, the command filtering unit 116 selects “Tell my vehicle to be ready”, which has the larger number of verbs, as the main command from among the multiple commands.
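The verb-and-word-count selection can be sketched as follows. This sketch assumes the multi-command has already been split into segments and uses a small hypothetical verb lexicon (`ACTION_VERBS`) in place of a real part-of-speech tagger; the exact counts therefore depend on the tokenizer and tagger used. Comparing `(verbs, words)` tuples gives the verb count priority over the word count, as recited in the claims.

```python
from typing import List, Tuple

# Hypothetical lexicon standing in for a part-of-speech tagger.
ACTION_VERBS = {"tell", "open", "play", "start", "prepare"}


def count_verbs_and_words(segment: str) -> Tuple[int, int]:
    """Return (verb count, word count) for one command segment."""
    words = [w.strip(".,!?\"'").lower() for w in segment.split()]
    words = [w for w in words if w]
    verbs = sum(1 for w in words if w in ACTION_VERBS)
    return verbs, len(words)


def select_main_command(segments: List[str]) -> str:
    """Pick the segment with the most verbs, breaking ties by word count."""
    return max(segments, key=count_verbs_and_words)


segments = ["Tell my vehicle to be ready", "for the destination of Alexa's party"]
print(select_main_command(segments))  # prints "Tell my vehicle to be ready"
```

When two segments tie on both counts, `max` keeps the first one, which is an arbitrary choice of this sketch rather than something the patent specifies.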

Thereafter, the verification unit 117 identifies the control path over which each of the two woken-up devices, device 1 and device 2, would transmit the command to the vehicle terminal, and determines the grammar matching rate of each device. That is, each device has its own command grammar, and the device whose grammar best matches the command “Tell my vehicle to be ready” uttered by the user is selected. For example, if device 1 has a grammar match success rate of 90% and a grammar match failure rate of 10%, and device 2 has a grammar match success rate of 89% and a grammar match failure rate of 5%, device 2, whose failure rate is lower relative to its success rate, may be selected to deliver the command to the vehicle terminal.

FIG. 11 is a configuration diagram of a computer system to which the voice recognition control method according to an embodiment of the present invention is applied.

Referring to FIG. 11, the computing system 1000 may include at least one processor 1100, a memory 1300, a user interface input device 1400, a user interface output device 1500, a storage 1600, and a network interface 1700, which are connected through a bus 1200.

The processor 1100 may be a central processing unit (CPU) or a semiconductor device that processes instructions stored in the memory 1300 and/or the storage 1600. The memory 1300 and the storage 1600 may include various types of volatile or nonvolatile storage media. For example, the memory 1300 may include a read only memory (ROM) and a random access memory (RAM).

Accordingly, the steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by the processor 1100, or in a combination of the two. The software module may reside in a storage medium (that is, the memory 1300 and/or the storage 1600) such as a RAM, a flash memory, a ROM, an EPROM, an EEPROM, a register, a hard disk, a removable disk, or a CD-ROM.

An exemplary storage medium is coupled to the processor 1100, and the processor 1100 can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral with the processor 1100. The processor and the storage medium may reside within an application specific integrated circuit (ASIC). The ASIC may reside within a user terminal. Alternatively, the processor and the storage medium may reside as separate components within the user terminal.

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains may make various modifications and variations without departing from the essential characteristics of the present invention.

Therefore, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within their equivalent scope should be construed as falling within the scope of the present invention.

100: device
200: voice recognition server
110: vehicle terminal
120: mobile communication terminal
111: voice signal input unit
112: communication unit
113: display unit
114: storage unit
115: voice filtering unit
116: command filtering unit
117: verification unit
118: control unit

Claims

A voice recognition control device comprising:
a voice signal input unit that receives a voice signal including a wakeup keyword and a command;
a voice filtering unit that recognizes a wakeup keyword located at the head of the voice signal and wakes up a device corresponding to the wakeup keyword among a plurality of devices; and
a command filtering unit that selects a main command based on words and verbs of the command of the voice signal,
wherein, when the command is a single multi-command including a plurality of commands, the command filtering unit determines the number of verbs and the number of words of each of the plurality of commands included in the multi-command, and selects the command having the largest number of verbs or words among the plurality of commands.

The voice recognition control device of claim 1, further comprising
a verification unit that verifies whether to transmit a vehicle command through the woken-up device.

The voice recognition control device of claim 2,
wherein, when at least one device has been woken up, the verification unit
checks whether the command of the voice signal matches the command grammar of each device, selects the device having the highest grammar matching rate among the at least one woken-up device, and transmits the vehicle command through the selected device.

The voice recognition control device of claim 1,
wherein the voice filtering unit
extracts text from the voice signal and, when at least one wakeup keyword is present in the voice signal, performs voice signal correlation and text correlation on the at least one wakeup keyword and selects the wakeup keyword having the highest correlation.

The voice recognition control device of claim 1,
wherein, when a plurality of voice signals are input, the command filtering unit
determines a character matching rate and a signal strength of each command, and selects the voice signal having a high character matching rate or a high signal strength.

(Deleted)

The voice recognition control device of claim 1,
wherein the command filtering unit
selects the command having the largest number of verbs and words in the command, giving priority to the number of verbs over the number of words.

The voice recognition control device of claim 1, further comprising
a biometric authentication unit that performs biometric authentication when the voice signal is input.

The voice recognition control device of claim 2, further comprising
a feedback unit that, when verification by the verification unit is completed, notifies the user who uttered the voice signal that the wakeup keyword has been recognized.

The voice recognition control device of claim 1,
wherein, when the wakeup keyword is present in the voice signal, the voice filtering unit
determines that the wakeup keyword is a valid wakeup keyword if the wakeup keyword is located at the head of the voice signal.

A voice recognition control system comprising:
a voice recognition control device that, for a plurality of devices having different wakeup keywords, recognizes a wakeup keyword located at the head of a user's voice signal, wakes up the device corresponding to the recognized wakeup keyword, and filters the command in the voice signal; and
a voice recognition server that performs voice recognition on the voice signal received from the device and provides the voice recognition result to the device,
wherein, when the command is a single multi-command including a plurality of commands, the voice recognition control device determines the number of verbs and the number of words of each of the plurality of commands included in the multi-command, and selects the command having the largest number of verbs or words among the plurality of commands.

The voice recognition control system of claim 11,
wherein the device comprises:
a voice signal input unit that receives a voice signal including a wakeup keyword and a command;
a voice filtering unit that recognizes a wakeup keyword located at the head of the voice signal and wakes up a device corresponding to the wakeup keyword among a plurality of devices;
a command filtering unit that selects a main command based on words and verbs of the command of the voice signal; and
a verification unit that verifies whether to transmit a vehicle command through the woken-up device.

The voice recognition control system of claim 12,
wherein, when at least one device has been woken up, the verification unit
checks whether the command of the voice signal matches the command grammar of each device, selects the device having the highest grammar matching rate among the at least one woken-up device, and transmits the vehicle command through the selected device.

The voice recognition control system of claim 12,
wherein the voice filtering unit
extracts text from the voice signal and, when at least one wakeup keyword is present in the voice signal, performs voice signal correlation and text correlation on the at least one wakeup keyword and selects the wakeup keyword having the highest correlation.

The voice recognition control system of claim 12,
wherein, when a plurality of voice signals are input, the command filtering unit
determines a character matching rate and a signal strength of each command, and selects the voice signal having a high character matching rate or a high signal strength.

(Deleted)

A voice recognition control method comprising:
receiving a voice signal including a wakeup keyword and a command;
recognizing a wakeup keyword located at the head of the voice signal and waking up a device corresponding to the wakeup keyword among a plurality of devices;
selecting a main command based on words and verbs of the command of the voice signal;
verifying whether to transmit a vehicle command through the woken-up device; and
transmitting the vehicle command through the verified device,
wherein the selecting of the main command comprises,
when the command is a single multi-command including a plurality of commands, determining the number of verbs and the number of words of each of the plurality of commands included in the multi-command, and selecting the command having the largest number of verbs or words among the plurality of commands.

The method of claim 17,
wherein the verifying of whether to transmit the vehicle command comprises
checking whether the command of the voice signal matches the command grammar of each device, selecting the device having the highest grammar matching rate among the at least one woken-up device, and transmitting the vehicle command through the selected device.

The method of claim 17,
wherein the waking up of the device comprises
extracting text from the voice signal and, when at least one wakeup keyword is present in the voice signal, performing voice signal correlation and text correlation on the at least one wakeup keyword and selecting the wakeup keyword having the highest correlation.

The method of claim 17,
wherein the selecting of the main command comprises,
when a plurality of voice signals are input, determining a character matching rate and a signal strength of each command, and selecting the voice signal having a high character matching rate or a high signal strength.