KR102417901B1

KR102417901B1 - Apparatus and method for recognizing voice using manual operation

Info

Publication number: KR102417901B1
Application number: KR1020170160366A
Authority: KR
Inventors: 신용진
Original assignee: 현대자동차주식회사; 기아 주식회사
Priority date: 2017-11-28
Filing date: 2017-11-28
Publication date: 2022-07-07
Also published as: KR20190061705A

Abstract

A voice recognition apparatus using a manual operation according to the present invention includes an input unit that receives a command from a user, and a control unit that, when the command received by the input unit is misrecognized, recognizes a manual operation corresponding to the command, tags the received command with the manual operation to determine the command as a candidate command, and determines the candidate command as a formal command once the candidate command has been learned. A term chosen by the user can thus be stored as a formal command and recognized by voice. By using a term chosen by the user as a voice command, an operation such as controlling an in-vehicle device can be performed even when the utterance is not one of the pre-stored voice recognition commands.

Description

Voice recognition device and method using manual operation {APPARATUS AND METHOD FOR RECOGNIZING VOICE USING MANUAL OPERATION}

The present invention relates to a voice recognition apparatus and method using a manual operation and, more particularly, to a voice recognition apparatus and method that, when voice recognition of a command fails, enable the command to be recognized based on the function subsequently being operated through a manual operation corresponding to the command.

Voice recognition is a technology that extracts phonological, i.e., linguistic, information from the acoustic signal contained in speech, i.e., acoustic information, and recognizes it mechanically. Among human-machine interface methods, voice recognition is considered the most convenient, but for a human to control a machine by voice, the human voice must be converted into information the machine can process; this conversion process is voice recognition technology.

Recently, voice recognition technology has been applied to vehicles so that convenience devices of the vehicle, for example raising and lowering the windows, starting and stopping the wipers, operating the air conditioner, turning the headlights on and off, or operating multimedia devices, are controlled according to the user's voice commands.

However, when the user's voice command is not recognized in the vehicle, the multimedia device must be operated by a manual control method, such as controlling in-vehicle devices by manipulating buttons on a conventional head unit or controlling in-vehicle devices through a touch screen.

Accordingly, voice recognition commands that perform the same operations as such manual control, such as controlling an in-vehicle device, are used for voice control of in-vehicle devices, and to control by voice a function that was previously controlled manually, the user needs to memorize the voice recognition command that corresponds to the manual command.

However, when the user cannot remember the voice recognition command, it is difficult to perform an operation such as controlling an in-vehicle device by voice. There is also a limitation in that it is difficult to use commands the user intuitively intends other than the voice recognition commands.

The present invention is intended to overcome the above limitations, and an object thereof is to provide a voice recognition apparatus and method using a manual operation that link a command uttered by the user with the manual operation performed after the utterance and add the uttered command as a formal command, thereby facilitating voice recognition of commands uttered by the user.

A voice recognition apparatus using a manual operation according to the present invention includes an input unit that receives a command from a user, and a control unit that, when the command received by the input unit is misrecognized, recognizes the user's manual operation corresponding to the command, tags the received command with the manual operation to determine the command as a candidate command, and determines the candidate command as a formal command once the candidate command has been learned.

The input unit receives natural language.

When the received command is not a pre-stored, learned formal command, the control unit determines that the received command has been misrecognized.

The control unit recognizes the manual operation that the user performs, within a predetermined time after the command is misrecognized, to execute the received command.

When the control unit determines the domain of the manual operation, it tags the received command with the manual operation within that domain.

When the domain of the manual operation cannot be determined, the control unit creates a new domain.

When the candidate command has been received again a predetermined number of times and the operation of tagging the re-received candidate command with the manual operation has been repeated a predetermined number of times, the control unit determines that the candidate command has been learned.

Once the candidate command has been learned, the control unit outputs a message asking the driver to choose whether to store the candidate command as a formal command and, when the driver so chooses, determines the candidate command as a formal command.

A voice recognition method using a manual operation according to the present invention includes: receiving a command from a user; recognizing, when the received command is misrecognized, a manual operation corresponding to the command; determining the received command as a candidate command; learning the candidate command; and determining, once the candidate command has been learned, the candidate command as a formal command.

The step of receiving a command from the user receives natural language.

In the step of recognizing the manual operation corresponding to the command when the received command is misrecognized, the received command is determined to be misrecognized when it is not a pre-stored, learned formal command.

The step of recognizing the manual operation corresponding to the command when the received command is misrecognized recognizes the manual operation that the user performs, within a predetermined time after the command is misrecognized, to execute the received command.

After the step of recognizing the manual operation corresponding to the command when the received command is misrecognized, a step of determining the domain of the manual operation is further performed.

In the step of determining the received command as a candidate command, when the domain of the manual operation is determined, the received command is tagged with the manual operation within that domain.

In the step of determining the received command as a candidate command, when the domain of the manual operation cannot be determined, a new domain is created and the received command is tagged with the manual operation within the new domain.

In the step of learning the candidate command, the candidate command is received again a predetermined number of times and the operation of tagging the candidate command with the manual operation is repeated a predetermined number of times.

In the step of determining the candidate command as a formal command once it has been learned, a message asking the driver to choose whether to store the candidate command as a formal command is output and, when the driver so chooses, the candidate command is determined as a formal command.

According to the present invention, the manual operation performed by the user after uttering a command is linked with that command, so that the uttered command can be stored as a formal command and recognized by voice. By using a command uttered by the user as a formal command, an operation such as controlling an in-vehicle device can be performed easily even when the utterance is not one of the pre-stored voice recognition commands.

FIG. 1 shows a voice recognition apparatus using a manual operation according to the present invention.
FIGS. 2 to 6 are views showing embodiments of the present invention.
FIG. 7 is a flowchart illustrating a voice recognition method using a manual operation according to the present invention.

Hereinafter, some embodiments of the present invention are described in detail with reference to the exemplary drawings. In assigning reference numerals to the components of each drawing, note that the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments of the present invention, detailed descriptions of related known configurations or functions are omitted where they would obscure understanding of the embodiments.

In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in the present application.

The present invention is based on the observation that, when a command uttered by the user is not recognized by voice, the user directly operates the function corresponding to the command by hand. The present invention links the uttered command with the manually operated function and determines the command as a formal command, so that when the user later utters the same command the voice is recognized and the desired function is performed. Here, a formal command may mean a command for which learning has been completed and which can therefore be recognized by voice.

FIG. 1 shows a voice recognition apparatus using a manual operation according to the present invention. As shown in FIG. 1, the voice recognition apparatus using a manual operation according to the present invention may include an input unit 10, a storage unit 20, a control unit 30, and an output unit 40.

The input unit 10 may receive a command uttered by the user. The command may include natural language such as 'iPhone Music', 'Turn up the screen brightness a bit', and 'Call Manager Hong Gil-dong'.

The input unit 10 may be configured to perform operations based on various noise removal algorithms for removing noise generated while receiving an external acoustic signal. The input unit 10 may be a microphone provided in a head unit (H/U).

The storage unit 20 may store a command first uttered by the user in association with a domain. For example, the storage unit 20 may add and store the command 'iPhone Music' in the media domain. The storage unit 20 may also add and store the command 'Turn up the screen brightness a bit' in the setup domain, and may add and store the command 'Call Manager Hong Gil-dong' in the Bluetooth domain.

The storage unit 20 may include a storage medium of the flash memory type, hard disk type, multimedia card micro type, or card type (for example, SD or XD memory), or RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), magnetic memory, a magnetic disk, or an optical disk.

The control unit 30 sets the voice recognition mode and, when a command for a function intended by the user is received by the input unit 10, transmits the voice signal of the command to a server. The server may transmit the dictation result for the received voice signal to the vehicle.

When the command received by the input unit 10 is a voice recognition command, that is, a preset and learned command, the control unit 30 may determine that the voice has been recognized. When the command received by the input unit 10 is not a voice recognition command, that is, not a preset command, the control unit 30 may determine that the voice has been misrecognized.
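
The misrecognition decision described above can be sketched as a simple membership test. This is only an illustrative sketch: the function name, the whitespace and case normalization, and the sample command set are assumptions and do not come from the specification.

```python
def is_misrecognized(dictation: str, formal_commands: set[str]) -> bool:
    """Return True when the dictated text is not a known formal command.

    Normalization (collapsing whitespace, lowercasing) is an assumption;
    the specification only requires comparing against preset, learned commands.
    """
    normalized = " ".join(dictation.split()).lower()
    return normalized not in formal_commands


# Hypothetical preset commands, stored already normalized.
FORMAL_COMMANDS = {"music search", "usb", "cd", "set destination"}

if __name__ == "__main__":
    print(is_misrecognized("iPhone Music", FORMAL_COMMANDS))
    print(is_misrecognized("Music Search", FORMAL_COMMANDS))
```

An utterance that fails this test would be treated as misrecognized, which triggers the switch to the manual input mode.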

When it determines that the voice has been misrecognized, the control unit 30 may end the voice recognition mode. Once the voice recognition mode has ended, the control unit 30 may switch to a manual input mode so that the user can directly operate the intended function.

After switching to the manual input mode, the control unit 30 may recognize a manual operation performed directly by the user. Here, the manual operation may mean operating a predetermined button or the display provided in the head unit, within a predetermined time, in order to execute the received command.
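
One way to picture the "predetermined time" condition is a time-window filter over manual input events. The sketch below is hypothetical: the `ManualEvent` type, the function identifiers, the 10-second default, and the choice of the earliest event in the window are all assumptions, since the specification does not give a concrete value or selection rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ManualEvent:
    timestamp: float  # seconds, on the same clock as the misrecognition time
    function: str     # hypothetical function identifier, e.g. "bluetooth_audio"


def first_manual_operation(
    misrecognized_at: float,
    events: list[ManualEvent],
    window_s: float = 10.0,  # assumed value; the source says only "predetermined time"
) -> Optional[ManualEvent]:
    """Return the earliest manual event inside the window, or None if there is none."""
    in_window = [
        e for e in events
        if misrecognized_at <= e.timestamp <= misrecognized_at + window_s
    ]
    return min(in_window, key=lambda e: e.timestamp, default=None)
```

Events before the misrecognition or after the window has elapsed are ignored, so an unrelated button press much later would not be linked to the utterance.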

The control unit 30 then determines whether a domain exists that contains the function operated by the manual operation. Here, a domain may be understood as a concept that classifies the situations in which a command uttered by the user can be used. For example, domains may be classified as media, navigation, Bluetooth, and so on.

Each domain may contain a plurality of commands suited to its situation. Here, the commands contained in a domain may be preset and learned commands. For example, the media domain may contain commands such as 'music search', 'USB', and 'CD'. The navigation domain may contain commands such as 'set destination' and 'nearby search'. The Bluetooth domain may contain commands such as 'call ..', 'make a call', and 'change Bluetooth device'.
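
The domain structure described above can be sketched as a mapping from domain names to sets of preset commands. The English command strings follow the examples in the text; the dictionary layout and the lookup helper are assumptions made for illustration only.

```python
from typing import Optional

# Domain name -> preset, learned commands (examples taken from the description).
DOMAINS: dict[str, set[str]] = {
    "media": {"music search", "USB", "CD"},
    "navigation": {"set destination", "nearby search"},
    "bluetooth": {"call ..", "make a call", "change Bluetooth device"},
}


def domain_of_command(command: str, domains: dict[str, set[str]] = DOMAINS) -> Optional[str]:
    """Return the name of the domain that contains the command, or None."""
    for name, commands in domains.items():
        if command in commands:
            return name
    return None
```

A command such as 'iPhone Music' is in none of these sets, which is why the apparatus falls back to observing the manual operation.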

When the control unit 30 determines that a domain containing the function operated by the recognized manual operation exists, it may tag the manually operated function to the dictation result of the voice signal received by the input unit 10, within that domain.

Once the received command and the manual operation have been tagged, the control unit 30 may determine the received command as a candidate command and store it temporarily in the storage unit 20. When the candidate command is repeated a predetermined number of times or more and the candidate command and the corresponding manual operation are tagged a predetermined number of times or more, the control unit 30 may determine the candidate command as a formal command. For a more detailed description, refer to FIGS. 2 to 5.
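
The repetition condition above can be sketched with a counter keyed by the (command, function) pair. The class name, method names, and the threshold of three are assumptions; the specification only refers to a "predetermined number of times".

```python
from collections import Counter


class CandidateStore:
    """Counts how often a candidate command has been tagged with a manual function."""

    def __init__(self, required_tags: int = 3):  # assumed threshold
        self.required_tags = required_tags
        self._tag_counts: Counter = Counter()

    def tag(self, command: str, function: str) -> int:
        """Record one tagging of the pair and return the running count."""
        self._tag_counts[(command, function)] += 1
        return self._tag_counts[(command, function)]

    def is_learned(self, command: str, function: str) -> bool:
        """True once the pair has been tagged the required number of times."""
        return self._tag_counts[(command, function)] >= self.required_tags
```

Keying on the pair rather than on the command alone means a command is only treated as learned when it has repeatedly been followed by the same manual function, which is one possible reading of the description.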

Depending on the command uttered by the user, the control unit 30 may add a new command within a domain as shown in FIG. 2, add a new domain as shown in FIG. 3, or add an integrated command as shown in FIGS. 4 and 5.

According to one embodiment, as shown in FIG. 2, the user may input the command 'iPhone Music'. In the present invention, the command input by the user need not be a pre-stored, learned formal command. That is, it may include natural language that the user intuitively intends.

The control unit 30 may transmit the voice signal of the uttered command 'iPhone Music' to the server and may receive the dictation result for 'iPhone Music' from the server. Because the command input by the user is not a pre-stored, learned formal command, the control unit 30 may determine that the voice signal has been misrecognized and end the voice recognition mode.

Once the voice recognition mode has ended, the control unit 30 may recognize a manual operation performed directly by the user. Here, the manual operation may be an operation performed to execute the command input by the user, that is, a manual operation for 'iPhone Music'. For example, the manual operation may include switching to Bluetooth audio in order to play music from the iPhone, and the control unit 30 may recognize that the mode has been switched to Bluetooth audio through the manual operation.

The control unit 30 may determine the domain of the manual operation, that is, of the operation of switching to Bluetooth audio. In this embodiment, it may be determined to be the media domain, and once the domain is determined, the manually operated function may be tagged to the dictation result of the voice signal received by the input unit 10 within that domain. Here, tagging is preferably understood as keyword processing applied to the dictation result, and may mean linking the dictation result of the received voice signal with the manually operated function.

When the received command 'iPhone Music' has been tagged with the function of switching to Bluetooth audio, the control unit 30 may determine 'iPhone Music' as a candidate command, as shown in FIG. 6, and store it temporarily in the storage unit 20. When 'iPhone Music' is received a predetermined number of times or more and 'iPhone Music' is tagged with the function of switching to Bluetooth audio a predetermined number of times or more, the command may be determined to have been learned. In that case, the candidate command 'iPhone Music' may be determined to satisfy the condition for being determined as a formal command.

After learning of the candidate command is complete, when 'iPhone Music' is received again, the control unit 30 may output a message as voice or as an image through the output unit 40 so that the user can choose whether to perform the tagged function. That is, when the command 'iPhone Music' is received, a message such as 'Switch to Bluetooth audio?' may be output for the user to choose.

When the user selects the function, the received command is determined as a formal command and may be stored as a formal command in the storage unit 20. When the user does not select the function, the received command may be deleted.
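
The confirmation step above can be sketched as a function that either promotes or discards a learned candidate. The dictionaries standing in for temporary and formal storage, and the function name, are hypothetical; the specification does not describe how the storage unit 20 is organized.

```python
def confirm_candidate(
    command: str,
    user_accepts: bool,
    candidates: dict[str, str],  # temporary storage: command text -> tagged function
    formal: dict[str, str],      # formal storage: command text -> function
) -> bool:
    """Promote a learned candidate to a formal command or delete it.

    Returns True when the command was promoted.
    """
    function = candidates.pop(command)  # the candidate leaves temporary storage either way
    if user_accepts:
        formal[command] = function
        return True
    return False
```

In this sketch a rejected candidate is removed outright, mirroring the statement that the received command may be deleted when the user does not select the function.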

According to another embodiment, as shown in FIG. 3, the user may input the command 'Turn up the screen brightness a bit'. In the present invention, the command input by the user need not be a pre-stored, learned formal command. That is, it may include a command that the user intuitively intends.

The control unit 30 may transmit the voice signal of the uttered command 'Turn up the screen brightness a bit' to the server and may receive the dictation result for 'Turn up the screen brightness a bit' from the server. Because the command input by the user is not a pre-stored, learned formal command, the control unit 30 may determine that the voice signal has been misrecognized and end the voice recognition mode.

Once the voice recognition mode has ended, the control unit 30 may recognize a manual operation performed directly by the user. Here, the manual operation may be an operation for the command input by the user, that is, a manual operation for 'Turn up the screen brightness a bit'. For example, the manual operation may include a predetermined operation for increasing the screen brightness, such as operating a scroll wheel or button housing provided in the vehicle or touching the touch screen, and the control unit 30 may recognize that the screen brightness is being controlled through the manual operation.

The control unit 30 may determine the domain of the operation that controls the screen brightness. When the domain cannot be determined, a new domain may be created, and within the new domain the manually operated function may be tagged to the dictation result of the voice signal received by the input unit 10. Here, tagging is preferably understood as keyword processing applied to the dictation result, and may link the dictation result of the received voice signal with the manually operated function.
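
The fallback to a new domain can be sketched as follows. The lookup table from function to domain, the caller-supplied name for the new domain, and all identifiers are assumptions for illustration; the specification does not say how a domain is determined or how a new domain is named.

```python
def tag_command(
    domains: dict[str, dict[str, str]],  # domain -> {command text: function}
    function_domain: dict[str, str],     # function -> domain (hypothetical lookup table)
    command: str,
    function: str,
    new_domain_name: str,
) -> str:
    """Tag the command with the function, creating a domain if none is known.

    Returns the name of the domain in which the tag was stored.
    """
    domain = function_domain.get(function)
    if domain is None:
        # No existing domain covers this function: create a new one.
        domain = new_domain_name
        function_domain[function] = domain
    domains.setdefault(domain, {})[command] = function
    return domain
```

With a table that knows only the media domain, tagging a screen-brightness function would create a new domain (for example one named "setup"), while tagging a Bluetooth-audio function would reuse the media domain.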

When the received command 'Turn up the screen brightness a bit' has been tagged with the function of controlling the screen brightness, the control unit 30 may determine 'Turn up the screen brightness a bit' as a candidate command and store it temporarily in the storage unit 20. When 'Turn up the screen brightness a bit' is received a predetermined number of times or more and the candidate command is tagged with the function of controlling the screen brightness a predetermined number of times or more, the command may be determined to have been learned. In that case, the candidate command 'Turn up the screen brightness a bit' may be determined to satisfy the condition for being determined as a formal command.

After learning of the candidate command is complete, when 'Turn up the screen brightness a bit' is received again, the control unit 30 may output a message as voice or as an image through the output unit 40 so that the user can choose whether to perform the tagged function. That is, when the command 'Turn up the screen brightness a bit' is received, a prompt asking whether to switch to the settings screen for controlling the screen brightness may be output for the user to choose.

또 다른 실시예에 따르면, 도 4에 도시된 바와 같이, 사용자는 '홍길동 책임에게 전화걸어줘'라는 명령어를 입력할 수 있다. 본 발명에서 사용자가 입력한 명령어는 기 저장된 수동 명령어에 해당하고 학습되어진 고정의 음성인식 명령어가 아닐 수 있다. 즉, 사용자가 직관적으로 의도하고자 하는 명령어를 포함할 수 있다. According to another embodiment, as shown in FIG. 4 , the user may input a command 'Call Gil-Dong Hong'. In the present invention, the command input by the user may correspond to a pre-stored manual command and may not be a fixed voice recognition command that has been learned. That is, it may include a command intuitively intended by the user.

The control unit 30 may transmit the voice signal of the command 'Call Manager Hong Gil-dong' uttered by the user to a server, and may receive the dictation result for 'Call Manager Hong Gil-dong' from the server. Because the command input by the user corresponds to a pre-stored manual command but is not a fixed, already-learned voice recognition command, the control unit 30 may treat the voice signal as misrecognized and end the voice recognition mode.

When the voice recognition mode has ended, the control unit 30 may recognize a manual operation performed directly by the user. Here, the manual operation may be the operation that carries out the command the user input, that is, the manual operation for 'Call Manager Hong Gil-dong'. For example, the manual operation may include a predetermined action for calling Manager Hong Gil-dong, such as operating a scroll wheel or button housing provided in the vehicle or touching the touchscreen, and the control unit 30 may recognize that the manual operation switched the system to dialing.

The control unit 30 may determine the domain of calling Manager Hong Gil-dong. According to the embodiment, the domain may be determined to be the 'Bluetooth' domain, and once the domain is determined, the function performed by the manual operation may be tagged to the dictation result of the voice signal received at the input unit 10 within that domain. Here, tagging is preferably understood as keyword processing applied to the dictation result, and it links the dictation result of the received voice signal to the manual operation.
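The tagging described above can be pictured as a small record that links the dictation text to the manually operated function inside a domain. The following is a minimal Python sketch under stated assumptions: the names `Tag` and `tag_command`, the function identifier `"phone.dial"`, and the whitespace normalization are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """Links a dictation result to the function reached by manual operation."""
    dictation: str  # text returned by the dictation server
    domain: str     # e.g. "bluetooth", "media", "navigation"
    function: str   # hypothetical identifier of the manually operated function


def tag_command(dictation: str, domain: str, function: str) -> Tag:
    # Collapse repeated whitespace so the same utterance always yields the same tag
    # (an assumption of this sketch, not something the patent specifies).
    return Tag(" ".join(dictation.split()), domain, function)


tag = tag_command("Call  Manager Hong Gil-dong", "bluetooth", "phone.dial")
```

Making the record immutable and comparable by value lets repeated taggings of the same utterance be counted later simply by equality.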

When the received command 'Call Manager Hong Gil-dong' has been tagged with the operation of calling Manager Hong Gil-dong, the control unit 30 may determine 'Call Manager Hong Gil-dong' to be a candidate command and temporarily store it in the storage unit 20. When 'Call Manager Hong Gil-dong' is then received a predetermined number of times or more, and the tagging of 'Call Manager Hong Gil-dong' with the operation of calling Manager Hong Gil-dong is performed a predetermined number of times or more, the control unit 30 may determine that the candidate command has been learned. In other words, the candidate command 'Call Manager Hong Gil-dong' is deemed to satisfy the condition for being determined as a formal command.

After learning of the candidate command is complete, when 'Call Manager Hong Gil-dong' is received again, the control unit 30 may output a message as voice or as an image through the output unit 40 so that the user can choose whether to perform the tagged function. That is, when the command 'Call Manager Hong Gil-dong' is received, the apparatus may ask whether to place the call to Manager Hong Gil-dong and let the user choose.

Meanwhile, the target of the call may vary. That is, in addition to Manager Hong Gil-dong, the user may utter commands with different call targets, such as 'Call Manager Kim Sat-gat' or 'Call Manager Kim Hyun-dai', as shown in FIG. 4.

Accordingly, when similar commands of the form 'Call ~' are input repeatedly, the control unit 30 may determine that an integrated command has been input. For example, as shown in FIG. 5, the received command 'Call Manager Hong Gil-dong' may be treated as an instance of the integrated command 'Call ~'.
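One way to picture the integrated command is as a template with a single open slot for the call target. The sketch below is only an illustration under assumptions: the patent does not say how similar commands are detected, the regular-expression approach is swapped in for clarity, and `INTEGRATED_CALL`, `match_integrated`, and the English renderings of the utterances are hypothetical.

```python
import re
from typing import Optional

# Hypothetical English rendering of the integrated command 'Call ~';
# the '~' slot becomes a named capture group.
INTEGRATED_CALL = re.compile(r"^Call (?P<callee>.+)$")


def match_integrated(utterance: str) -> Optional[str]:
    """Return the callee if the utterance fits the 'Call ~' template, else None."""
    m = INTEGRATED_CALL.match(utterance.strip())
    return m.group("callee") if m else None


callee = match_integrated("Call Manager Hong Gil-dong")
```

With a template like this, utterances that differ only in the call target map to the same integrated command, so their receptions can be counted together toward learning.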

The control unit 30 may determine the domain of the operation for 'Call ~'. According to the embodiment, the control unit 30 may determine the domain of the manual operation for 'Call ~' to be the Bluetooth domain, and may tag the dialing function to the dictation result of the voice signal for the integrated command.

When the received command 'Call ~' has been tagged with the dialing function, the control unit 30 may determine 'Call ~' to be a candidate command and temporarily store it in the storage unit 20. When 'Call ~' is then received a predetermined number of times or more, and the tagging of the candidate command with the dialing function is performed a predetermined number of times or more, the control unit 30 may determine that the candidate command has been learned. In other words, the candidate command 'Call ~' is deemed to satisfy the condition for being determined as a formal command.

After learning of the candidate command is complete, when 'Call ~' is received again, the control unit 30 may output a message as voice or as an image through the output unit 40 so that the user can choose whether to perform the tagged function. That is, when a 'Call ~' command is received, the apparatus may ask whether to place the call and let the user choose.

FIG. 7 is a flowchart illustrating the voice recognition method using manual operation according to the present invention.

First, a command uttered by the user may be received (S100). The command uttered in S100 may correspond to a pre-stored manual command and yet not be a fixed, already-learned voice recognition command. That is, it may include natural language phrased the way the user intuitively intends. According to embodiments, it may include 'iPhone Music', 'Raise the screen brightness a bit', 'Call Manager Hong Gil-dong', and the like.

A voice signal for the command is transmitted to the server (S110). The server may then transmit the dictation result for the received voice signal to the vehicle (S120).

Next, it is determined whether the received command has been recognized (S130). In S130, it is determined whether the received command is a preset command. If it is (YES), the voice is deemed recognized and the function may be operated according to the recognition result (S140). If the received command is not a preset command (NO), the voice is deemed misrecognized and the voice recognition mode may be terminated.
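The S130/S140 branch amounts to a lookup of the dictation result in a table of preset commands. A minimal sketch follows, assuming an exact-match table; the function name `recognize`, the preset entry 'Bluetooth audio', and the string results are hypothetical and serve only to show the two branches.

```python
from typing import Callable, Dict, Optional


def recognize(dictation: str, preset: Dict[str, Callable[[], str]]) -> Optional[str]:
    """S130/S140: run the preset function if the dictation is a known command.

    Returns the function's result, or None when the command is misrecognized.
    """
    action = preset.get(dictation)
    if action is None:
        return None  # NO branch: misrecognized, voice recognition mode ends
    return action()  # YES branch: operate the function (S140)


# Hypothetical preset table with a single fixed command.
preset_commands = {"Bluetooth audio": lambda: "switched to Bluetooth audio"}

result_known = recognize("Bluetooth audio", preset_commands)
result_unknown = recognize("iPhone Music", preset_commands)
```

A `None` result is the signal that hands control to the manual input mode of S150.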

After the voice recognition mode ends, the apparatus may switch to a manual input mode so that the user can directly operate the intended function, and may recognize the function that the user manually operates within a predetermined time (S150).
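The "within a predetermined time" condition of S150 can be sketched as a simple window check on timestamps. The patent does not give the length of the window; the 10-second default below, the function name `manual_operation_in_window`, and the use of seconds as the unit are assumptions of this sketch.

```python
from typing import Optional


def manual_operation_in_window(
    misrecognized_at: float,
    operated_at: Optional[float],
    window_s: float = 10.0,  # assumed value; the patent only says "predetermined time"
) -> bool:
    """Return True if a manual operation followed the misrecognition in time."""
    if operated_at is None:
        return False  # the user performed no manual operation
    elapsed = operated_at - misrecognized_at
    # Only operations after the misrecognition and inside the window count.
    return 0.0 <= elapsed <= window_s
```

Operations outside the window are not associated with the failed utterance, which keeps unrelated later actions from being tagged to it.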

According to one embodiment of S150, the manual operation for the command 'iPhone Music' received in S100 may include switching to Bluetooth audio in order to play music from the iPhone, and the switch to Bluetooth audio mode through the manual operation may be recognized.

According to another embodiment of S150, the manual operation for the command 'Raise the screen brightness a bit' received in S100 may include a predetermined action for brightening the screen, such as operating a scroll wheel or button housing provided in the vehicle or touching the touchscreen, and the control of screen brightness through the manual operation may be recognized.

According to yet another embodiment of S150, the manual operation for the command 'Call Manager Hong Gil-dong' input in S100 may include a predetermined action for calling Manager Hong Gil-dong, such as operating a scroll wheel or button housing provided in the vehicle or touching the touchscreen, and the switch to dialing through the manual operation may be recognized.

It is determined whether a domain that includes the operation performed by the manual operation exists (S160). Here, a domain may be understood as a concept that classifies the situations in which a command uttered by the user can be used. For example, domains may be classified into media, navigation, Bluetooth, and so on.

If it is determined in S160 that a domain for the manual operation exists, the command corresponding to the manual operation is determined as a candidate command and temporarily stored (S170). In S170, the function performed by the manual operation is tagged to the dictation result of the voice signal received in S100 within the determined domain, and once tagged, the received command may be determined as a candidate command. Here, tagging is preferably understood as keyword processing applied to the dictation result, and may mean linking the dictation result of the received voice signal to the function performed by the manual operation.

According to one embodiment of S170, when it is determined (YES) in S160 that a domain exists for the switch to Bluetooth audio performed in S150, the command 'iPhone Music' may be tagged with the function of switching to Bluetooth audio within the determined domain, and 'iPhone Music' may be determined as a candidate command and temporarily stored.

According to another embodiment of S170, when it is determined (YES) that a domain exists for the dialing operation performed in S150, the command 'Call Manager Hong Gil-dong' may be tagged with the dialing function within the determined domain, and 'Call Manager Hong Gil-dong' may be determined as a candidate command and temporarily stored.

On the other hand, if it is determined in S160 that no domain exists for the manual operation, a new domain is created (S180).

According to an embodiment of S180, when it is determined (NO) in S160 that no domain exists for the screen-brightness control operation recognized in S150, a 'setup' domain may be newly created. Then, within the new domain, the command 'Raise the screen brightness a bit' may be tagged with the screen-brightness control function, and 'Raise the screen brightness a bit' may be determined as a candidate command and temporarily stored.
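The S160/S180 branch can be sketched as a lookup over a domain table that falls back to creating a new domain. This is an illustration under assumptions: the patent does not describe how domains are stored or how the new domain's name is chosen, so the dictionary layout, the `resolve_domain` function, the function identifiers, and the caller-supplied `fallback_name` are all hypothetical.

```python
from typing import Dict, Set, Tuple


def resolve_domain(
    function: str, domains: Dict[str, Set[str]], fallback_name: str
) -> Tuple[str, bool]:
    """Return (domain name, created?) for the manually operated function."""
    # S160: look for an existing domain that already contains the function.
    for name, functions in domains.items():
        if function in functions:
            return name, False
    # S180: no domain found, so create one and register the function in it.
    domains.setdefault(fallback_name, set()).add(function)
    return fallback_name, True


# Hypothetical domain table; identifiers are illustrative only.
domains = {"bluetooth": {"phone.dial"}, "navigation": {"route.set"}}
```

For example, resolving `"display.brightness"` with the fallback name `"setup"` creates the 'setup' domain on the first call and finds it on later calls.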

It is determined whether the candidate command has been learned (S190). In S190, the candidate command may be determined to have been learned when it has been received a predetermined number of times or more and the tagging of the candidate command with the manual operation has been performed a predetermined number of times or more. This may be understood as determining whether the candidate command satisfies the condition for being determined as a formal command.
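The learning test of S190 combines two counters against a threshold. The sketch below assumes a single shared threshold of 3 for both counters; the patent only says "a predetermined number of times", and the `Candidate` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    command: str
    function: str
    received: int = 0  # times the utterance was received
    tagged: int = 0    # times it was tagged with the same manual operation

    def observe(self, tagged_again: bool) -> None:
        """Record one reception, and one tagging if the manual operation followed."""
        self.received += 1
        if tagged_again:
            self.tagged += 1

    def is_learned(self, threshold: int = 3) -> bool:
        # Both conditions of S190 must hold; the threshold value is assumed.
        return self.received >= threshold and self.tagged >= threshold


c = Candidate("iPhone Music", "audio.bluetooth")
for followed_by_manual_operation in (True, True, False):
    c.observe(followed_by_manual_operation)
```

After these three receptions the candidate has been received often enough but tagged only twice, so it is not yet learned; one more tagged reception would satisfy both conditions.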

If the candidate command is determined to have been learned (YES), then when the candidate command is received again, the user is asked to choose whether to perform the tagged function (S200). To this end, a message may be output as voice or as an image through the output unit 40 in S200. If the candidate command is determined not to have been learned (NO), it is determined that the candidate command cannot be determined as a formal command (S220).

According to embodiments, when the command 'iPhone Music' is received, the apparatus may ask whether to switch to Bluetooth audio and let the user choose. Likewise, when the command 'Raise the screen brightness a bit' is received, the apparatus may ask whether to switch to the settings screen for controlling screen brightness and let the user choose.

If the user selects the function (YES), the candidate command is determined as a formal command and stored (S210). If the user does not select the function (NO), it is determined that the candidate command cannot be stored as a formal command (S220). In S220, the candidate command may be deleted.
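The final decision of S200 through S220 can be sketched as moving an entry between two stores. This is a simplified illustration: the two dictionaries standing in for temporary and formal storage, the `confirm_candidate` function, and the function identifiers are hypothetical, and the sketch always deletes a rejected candidate although the patent says only that it may be deleted.

```python
from typing import Dict


def confirm_candidate(
    command: str,
    user_accepts: bool,
    candidates: Dict[str, str],
    formal: Dict[str, str],
) -> str:
    """Promote or discard a learned candidate; returns the flowchart step taken."""
    # The candidate leaves temporary storage in both branches (an assumption here).
    function = candidates.pop(command)
    if user_accepts:
        formal[command] = function  # S210: store as a formal command
        return "S210"
    return "S220"  # S220: not stored as a formal command


candidates = {
    "iPhone Music": "audio.bluetooth",
    "Raise the screen brightness a bit": "display.brightness",
}
formal: Dict[str, str] = {}
step_accept = confirm_candidate("iPhone Music", True, candidates, formal)
step_reject = confirm_candidate("Raise the screen brightness a bit", False, candidates, formal)
```

Once a command is in the formal store, a later reception of it would follow the YES branch of S130 instead of being treated as misrecognized.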

The above description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from its essential characteristics.

Accordingly, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within a scope equivalent thereto should be construed as falling within the scope of rights of the present invention.

10: input unit
20: storage unit
30: control unit
40: output unit

Claims

1. A voice recognition apparatus using manual operation, comprising:
an input unit for receiving a command from a user; and
a control unit that, when the command received at the input unit is misrecognized, recognizes the user's manual operation corresponding to the command, tags the received command and the manual operation to determine the command as a candidate command, and, when the candidate command is learned, determines the candidate command as a formal command,
wherein the control unit, when a domain including the operation controlled in response to the manual operation is determined, tags the received command and the manual operation within the domain.

2. The voice recognition apparatus using manual operation according to claim 1, wherein the input unit receives natural language.

3. The voice recognition apparatus using manual operation according to claim 1, wherein the control unit determines that the received command is misrecognized when the received command is not a pre-stored and learned formal command.

4. The voice recognition apparatus using manual operation according to claim 1, wherein the control unit recognizes the manual operation that the user performs to execute the received command within a predetermined time after the command is misrecognized.

5. (Deleted)

6. The voice recognition apparatus using manual operation according to claim 1, wherein the control unit creates a new domain when no domain including the operation controlled in response to the manual operation is determined.

7. The voice recognition apparatus using manual operation according to claim 1, wherein the control unit determines that the candidate command has been learned when the candidate command is re-received a predetermined number of times and the operation of tagging the re-received candidate command and the manual operation is repeated a predetermined number of times.

8. The voice recognition apparatus using manual operation according to claim 1, wherein the control unit, when the candidate command is learned, outputs a message asking the user to choose whether to store the candidate command as a formal command, and determines the candidate command as a formal command when the user selects it.

9. A voice recognition method using manual operation, comprising:
receiving a command from a user;
when the received command is misrecognized, recognizing a manual operation corresponding to the command;
determining the received command as a candidate command;
learning the candidate command; and
when the candidate command is learned, determining the candidate command as a formal command,
wherein the determining of the received command as a candidate command comprises, when a domain including the operation controlled in response to the manual operation is determined, tagging the received command and the manual operation within the domain.

10. The method of claim 9, wherein the receiving of a command from a user comprises receiving natural language.

11. The method of claim 9, wherein, in the recognizing of a manual operation corresponding to the command when the received command is misrecognized, the received command is determined to be misrecognized when it is not a pre-stored and learned formal command.

12. The method of claim 9, wherein the recognizing of a manual operation corresponding to the command when the received command is misrecognized comprises recognizing the manual operation that the user performs to execute the received command within a predetermined time after the command is misrecognized.

13. The method of claim 9, further comprising, after the recognizing of a manual operation corresponding to the command when the received command is misrecognized, determining a domain including the operation controlled in response to the manual operation.

14. (Deleted)

15. The method of claim 13, wherein the determining of the received command as a candidate command comprises, when no domain including the operation controlled in response to the manual operation is determined, creating a new domain and tagging the received command and the manual operation within the new domain.

16. The method of claim 9, wherein the learning of the candidate command comprises re-receiving the candidate command a predetermined number of times and repeating the operation of tagging the candidate command and the manual operation a predetermined number of times.

17. The method of claim 9, wherein the determining of the candidate command as a formal command when the candidate command is learned comprises outputting a message asking the user to choose whether to store the candidate command as a formal command, and determining the candidate command as a formal command when the user selects it.