KR20180133078A

KR20180133078A - Apparatus and method for speech recognition processing, vehicle system

Info

Publication number: KR20180133078A
Application number: KR1020170069569A
Authority: KR
Inventors: 조재민
Original assignee: 현대자동차주식회사; 기아자동차주식회사
Priority date: 2017-06-05
Filing date: 2017-06-05
Publication date: 2018-12-13
Also published as: KR102383429B1

Abstract

The present invention relates to a speech recognition processing apparatus and method, and a vehicle system, that improve the accuracy of a speech recognition result. According to the present invention, the apparatus comprises: a speech recognition processing unit that, when a text message transmission function is executed, transmits voice data uttered by a user to a vehicle terminal and to a server to request speech recognition processing, and identifies a recipient name and text message content from the speech recognition results of the speech recognition engines in the vehicle terminal and the server to compose a text message; and a correction unit that corrects the recipient name identified from a first speech recognition result of the speech recognition engine in the server, based on a second speech recognition result of the speech recognition engine in the vehicle terminal.

Description

APPARATUS AND METHOD FOR SPEECH RECOGNITION PROCESSING, VEHICLE SYSTEM

The present invention relates to a speech recognition processing apparatus and method, and a vehicle system.

To use a function for sending a text message to a specific person from an in-vehicle device, server-side speech recognition technology is required. Conventionally, to increase speech recognition accuracy for the recipient of the text message, it was necessary to access the user terminal via Bluetooth or Wi-Fi, download the phonebook data, and transmit the downloaded phonebook data to the server.

However, when the phonebook data is large, entry into speech recognition is delayed, and additional data packets must be consumed to transmit the phonebook data to the server.

In addition, when the network communication state of the server is unstable, speech recognition accuracy and/or the success rate may decrease. Furthermore, when the server has no phonebook information, a name matching the speech recognition result must be searched for in a large volume of name information, which delays the response.

Korean Patent No. 10-1684554

An object of the present invention is to provide a speech recognition processing apparatus and method, and a vehicle system, that increase the accuracy of a speech recognition result without transmitting phonebook data.

Another object of the present invention is to provide a speech recognition processing apparatus and method, and a vehicle system, that improve speech recognition accuracy for a recipient name by correcting the recipient name identified from the speech recognition result of the speech recognition engine in the server based on the speech recognition result of the speech recognition engine in the vehicle terminal.

The technical problems of the present invention are not limited to those mentioned above, and other technical problems not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, a speech recognition processing apparatus according to an embodiment of the present invention comprises: a speech recognition processing unit that, when a text message transmission function is executed, transmits voice data uttered by a user to a vehicle terminal and a server, respectively, to request speech recognition processing, and identifies a recipient name and text message content from the speech recognition results of the speech recognition engines in the vehicle terminal and the server to compose a text message; and a correction unit that corrects the recipient name identified from a first speech recognition result of the speech recognition engine in the server based on a second speech recognition result of the speech recognition engine in the vehicle terminal.

The speech recognition processing unit identifies the recipient name and the text message content from the first speech recognition result, and identifies the recipient name from the second speech recognition result.

The second speech recognition result is a result recognized through text matching between the text recognized from the voice data and the phonebook data stored in a phonebook DB in the vehicle terminal.

When the recipient names identified from the first speech recognition result and the second speech recognition result differ from each other, the correction unit corrects the recipient name identified from the first speech recognition result according to the confidence of the second speech recognition result.

When the confidence of the second speech recognition result exceeds a reference value, the correction unit corrects the recipient name identified from the first speech recognition result of the speech recognition engine in the server based on the second speech recognition result.

When the confidence of the second speech recognition result is equal to or less than the reference value, the correction unit corrects the recipient name identified from the first speech recognition result by matching it against the names in the phonebook data.

The speech recognition processing unit composes a final text message based on the corrected recipient name and the text message content identified from the first speech recognition result.

When the first speech recognition result is not received owing to the network communication state of the server, the speech recognition processing unit outputs a message requesting the user to utter the text message content again.

When the text message content cannot be identified from the first speech recognition result, the speech recognition processing unit outputs a message requesting the user to utter the text message content again.

The speech recognition processing unit composes the final text message based on the speech recognition result for the voice re-uttered by the user and the recipient name identified from the second speech recognition result.
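
The fallback described in the three preceding paragraphs can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function name `build_final_message`, the `ask_again` callback (assumed to prompt the user and return the recognized text of the re-utterance), and the dictionary layout of the message are all hypothetical.

```python
def build_final_message(first_content, corrected_name, second_name, ask_again):
    """Compose the final text message, falling back to a re-utterance.

    first_content  -- message content from the first (server) result, or None
                      when the result was not received or could not be parsed
    corrected_name -- recipient name after correction (used on the normal path)
    second_name    -- recipient name from the second (vehicle terminal) result
    ask_again      -- hypothetical callback: prompts the user and returns the
                      recognized text of the re-uttered message content
    """
    if first_content:
        # Normal path: corrected recipient name plus server-recognized content.
        return {"to": corrected_name, "body": first_content}
    # Fallback path: request the content again and rely on the recipient name
    # obtained from the second speech recognition result.
    body = ask_again("Please say the message content again.")
    return {"to": second_name, "body": body}
```

On the fallback path the recipient name still comes from the vehicle terminal, so only the message content has to be spoken again.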

The speech recognition processing unit outputs the composed text message on a display screen and transmits the composed text message according to the user's response.

In addition, the apparatus according to an embodiment of the present invention further comprises a speech recognition tree generation unit that converts gate commands related to text messages and the names in the phonebook data into pronunciation sequences for the speech recognition engine, and constructs a speech recognition tree based on the converted pronunciation sequences.

In addition, a speech recognition processing method according to an embodiment of the present invention for achieving the above objects comprises: when a text message transmission function is executed, transmitting voice data uttered by a user to a vehicle terminal and a server, respectively, to request speech recognition processing; identifying a recipient name and text message content based on a first speech recognition result of the speech recognition engine in the server and a second speech recognition result of the speech recognition engine in the vehicle terminal; correcting the recipient name identified from the first speech recognition result of the speech recognition engine in the server based on the second speech recognition result; and composing a final text message based on the corrected recipient name and the text message content identified from the first speech recognition result.

In addition, a vehicle system according to an embodiment of the present invention for achieving the above objects comprises: a vehicle terminal that has a speech recognition engine and a phonebook DB storing phonebook data, and that performs speech recognition through text matching between the speech recognition result of the speech recognition engine for input voice data and the phonebook data; a server that has a speech recognition engine and performs speech recognition on input voice data; and a speech recognition processing apparatus that identifies a recipient name and text message content based on a first speech recognition result received from the server for the voice data, identifies a recipient name based on a second speech recognition result received from the vehicle terminal, and composes a text message by correcting the recipient name identified from the first speech recognition result based on the second speech recognition result.

According to the present invention, the accuracy of a speech recognition result can be increased without transmitting phonebook data, and speech recognition accuracy for a recipient name can be improved by correcting the recipient name identified from the speech recognition result of the speech recognition engine in the server based on the speech recognition result of the speech recognition engine in the vehicle terminal.

FIG. 1 is a diagram illustrating a system to which a speech recognition processing apparatus according to an embodiment of the present invention is applied.
FIG. 2 is a diagram illustrating the configuration of a speech recognition processing apparatus according to an embodiment of the present invention.
FIGS. 3 to 5 are diagrams illustrating embodiments referred to in describing the operation of a speech recognition processing apparatus according to an embodiment of the present invention.
FIGS. 6 to 9 are diagrams illustrating the operation flow of a speech recognition processing method according to an embodiment of the present invention.
FIG. 10 is a diagram illustrating a computing system in which a method according to an embodiment of the present invention is executed.

Hereinafter, some embodiments of the present invention are described in detail with reference to the exemplary drawings. In assigning reference numerals to the components in each drawing, the same components are given the same numerals wherever possible, even when they appear in different drawings. In describing the embodiments of the present invention, detailed descriptions of related well-known configurations or functions are omitted when they are judged to obscure understanding of the embodiments.

In describing the components of the embodiments of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms serve only to distinguish one component from another and do not limit the nature, sequence, or order of the components. Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless expressly so defined in the present application.

FIG. 1 is a diagram illustrating a system to which a speech recognition processing apparatus according to an embodiment of the present invention is applied.

Referring to FIG. 1, a system according to an embodiment of the present invention may include a vehicle terminal 10, a server 50, and a speech recognition processing apparatus 100.

The vehicle terminal 10 is a control terminal provided in the vehicle; for example, it may be the head unit of the vehicle. The vehicle terminal 10 is connected to the speech recognition processing apparatus 100 through a communication interface and can perform speech recognition at the request of the speech recognition processing apparatus 100. To this end, the vehicle terminal 10 includes a speech recognition engine and performs speech recognition on the voice data input at the request of the speech recognition processing apparatus 100. The vehicle terminal 10 transmits the speech recognition result produced by the speech recognition engine to the speech recognition processing apparatus 100. Hereinafter, the speech recognition engine provided in the vehicle terminal 10 is referred to as the second engine.

In addition, the vehicle terminal 10 may include a phonebook DB 11 in which the user's phonebook data is stored. The phonebook DB 11 may store phonebook data entered manually by the user, and may also store phonebook data received from outside through a communication interface. For example, when the vehicle terminal 10 is communicatively connected to a user terminal (not shown) in the vehicle, it may receive phonebook data from the connected user terminal and store it in the phonebook DB 11.

When requested by the speech recognition processing apparatus 100, the vehicle terminal 10 may transmit some or all of the phonebook data registered in the phonebook DB 11 to the speech recognition processing apparatus 100.

The server 50 supports the text sending function. The server 50 is connected to the speech recognition processing apparatus 100 through a communication interface and can perform speech recognition at the request of the speech recognition processing apparatus 100. To this end, the server 50 includes a speech recognition engine and performs speech recognition on the voice data input at the request of the speech recognition processing apparatus 100. The server 50 transmits the speech recognition result produced by the speech recognition engine to the speech recognition processing apparatus 100. Hereinafter, the speech recognition engine provided in the server 50 is referred to as the first engine.

When the text sending function is selected by the user, the speech recognition processing apparatus 100 may compose a text message using the speech recognition result for the voice uttered by the user. In addition, the speech recognition processing apparatus 100 may transmit the composed text message through the server 50, or may pass the text message to an in-vehicle control unit that handles text message transmission.

The speech recognition processing apparatus 100 transmits the voice data to the vehicle terminal 10 and the server 50 for speech recognition processing of the voice uttered by the user. At this time, the speech recognition processing apparatus 100 may combine the speech recognition results of the first engine and the second engine to analyze the recipient name and text message content needed to compose the text message, and may compose the final text message. The detailed configuration of the speech recognition processing apparatus 100 is described in more detail with reference to FIG. 2.

The speech recognition processing apparatus 100 according to the present invention may be implemented inside a vehicle. In this case, the apparatus 100 may be formed integrally with the internal control units of the vehicle, or may be implemented as a separate device and connected to the control units of the vehicle by separate connecting means.

FIG. 2 is a diagram illustrating the configuration of a speech recognition processing apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the speech recognition processing apparatus 100 may include a control unit 110, an interface unit 120, a communication unit 130, a storage unit 140, a speech recognition tree generation unit 150, a speech recognition processing unit 160, and a correction unit 170. Here, the control unit 110 may process signals transferred between the components of the speech recognition processing apparatus 100.

The interface unit 120 may include input means for receiving control commands from the user and output means for outputting the operating state and results of the speech recognition processing apparatus 100.

Here, the input means may include key buttons, and may also include a mouse, a joystick, a jog shuttle, a stylus pen, and the like. The input means may also include soft keys implemented on a display, and may further include a microphone for receiving the voice uttered by the user.

The output means may include a display, and may also include audio output means such as a speaker. When the display is provided with a touch sensor such as a touch film, a touch sheet, or a touch pad, the display operates as a touch screen, and the input means and the output means may be implemented in an integrated form.

Here, the display may include at least one of a liquid crystal display (LCD), a thin-film-transistor liquid crystal display (TFT LCD), an organic light-emitting diode (OLED) display, a flexible display, a field emission display (FED), and a three-dimensional (3D) display.

The communication unit 130 may include a communication module that supports a communication interface with the electrical components and/or control units provided in the vehicle. For example, the communication module may be communicatively connected to the vehicle terminal 10 to transmit voice data to the vehicle terminal 10 and receive the speech recognition result from the vehicle terminal 10. The communication module may also receive the phonebook data registered in the vehicle terminal 10, and may receive the user's voice data input through a microphone provided in the vehicle.

Here, the communication module may include a module that supports in-vehicle network communication such as CAN (Controller Area Network), LIN (Local Interconnect Network), or FlexRay communication.

The communication module may also include a module for wireless Internet access or a module for short-range communication. Wireless Internet technologies may include wireless LAN (WLAN), WiBro (Wireless Broadband), Wi-Fi, WiMAX (Worldwide Interoperability for Microwave Access), and the like; short-range communication technologies may include Bluetooth, ZigBee, UWB (Ultra Wideband), RFID (Radio Frequency Identification), infrared communication (Infrared Data Association, IrDA), and the like.

For example, the communication module may be communicatively connected to the server 50 to transmit voice data to the server 50 and receive the speech recognition result from the server 50. The communication module may also be communicatively connected to a user terminal to receive the phonebook data registered in the user terminal.

The storage unit 140 may store data and/or algorithms required for the operation of the speech recognition processing apparatus 100.

The storage unit 140 may store the phonebook data received from the vehicle terminal 10 or the user terminal. The storage unit 140 may also store the commands and setting values for constructing a speech recognition tree, as well as the speech recognition trees constructed for the pronunciation sequences related to text messages and for the pronunciation sequences related to phonebook names. In addition, the storage unit 140 may store commands and/or algorithms for processing speech recognition and for composing a text message from the speech recognition results.

Here, the storage unit 140 may include storage media such as random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), programmable read-only memory (PROM), and electrically erasable programmable read-only memory (EEPROM).

The speech recognition tree generation unit 150 constructs a speech recognition tree using the pronunciation sequences for the gate commands related to text messages and the pronunciation sequences for the names in the phonebook, and stores the constructed speech recognition tree in the storage unit 140.

Here, the speech recognition tree generation unit 150 sets the gate commands related to text messages in advance and converts them into pronunciation sequences for the speech recognition engine. The speech recognition tree generation unit 150 may then use the converted pronunciation sequences to construct a speech recognition tree for the gate commands related to text messages. In addition, the speech recognition tree generation unit 150 loads the names in the phonebook data stored in the storage unit 140, converts them into pronunciation sequences for the speech recognition engine, and may use the converted pronunciation sequences to construct a speech recognition tree for the phonebook.

An embodiment of the speech recognition tree constructed by the speech recognition tree generation unit 150 is described with reference to FIG. 3.

Referring to FIG. 3, as shown in (a), the speech recognition tree generation unit 150 may set the gate commands related to text messages in advance in a form such as "send msg <Name>, text msg <Name> … >".

In addition, the speech recognition tree generation unit 150 loads names such as 'James', 'Jorge', and 'Jesus' from the phonebook data stored in the storage unit 140. Then, as shown in (b), the speech recognition tree generation unit 150 may construct the speech recognition tree by converting the gate commands related to text messages and the phonebook names into pronunciation sequences for the speech recognition engine.
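
As a rough illustration of the tree construction described above, the following Python sketch stores pronunciation sequences in a prefix tree. The source does not specify the tree format or the conversion into pronunciation sequences, so `to_pronunciation` is a placeholder (each letter stands in for one pronunciation unit), and the names `Node`, `build_tree`, and `lookup` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    children: dict = field(default_factory=dict)
    label: Optional[str] = None  # set on the node that ends a stored entry


def to_pronunciation(text):
    # Placeholder for the engine-specific conversion (an assumption):
    # each letter is treated as one pronunciation unit.
    return [ch for ch in text.lower() if ch.isalpha()]


def build_tree(entries):
    """Insert the pronunciation sequence of every entry into a prefix tree."""
    root = Node()
    for entry in entries:
        node = root
        for unit in to_pronunciation(entry):
            node = node.children.setdefault(unit, Node())
        node.label = entry
    return root


def lookup(root, text):
    """Return the stored entry whose sequence equals that of text, else None."""
    node = root
    for unit in to_pronunciation(text):
        node = node.children.get(unit)
        if node is None:
            return None
    return node.label


# Gate commands and phonebook names taken from the FIG. 3 example.
gate_tree = build_tree(["send msg", "text msg"])
name_tree = build_tree(["James", "Jorge", "Jesus"])
```

Because 'James', 'Jorge', and 'Jesus' share their first unit, they share a single branch below the root, which is the usual reason for holding recognition vocabularies in a tree.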

When the text message transmission function is executed and a voice uttered by the user is input, the speech recognition processing unit 160 transmits the input voice data to the vehicle terminal 10 and the server 50, respectively, through the communication unit 130.
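
One way to picture this dual dispatch is the sketch below, which submits the same voice data to both engines concurrently. The two engines are modeled as plain callables, and an unstable server connection is modeled as a `ConnectionError`; both are assumptions made for illustration, as is the function name `request_recognition`.

```python
from concurrent.futures import ThreadPoolExecutor


def request_recognition(voice_data, server_engine, terminal_engine):
    """Send the same voice data to both engines and return (first, second).

    server_engine and terminal_engine are hypothetical callables standing in
    for the first engine (server 50) and the second engine (vehicle terminal
    10). The first result is None when the server cannot be reached.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        first_future = pool.submit(server_engine, voice_data)
        second_future = pool.submit(terminal_engine, voice_data)
        second = second_future.result()
        try:
            first = first_future.result()
        except ConnectionError:
            first = None  # assumed failure mode for an unstable network
    return first, second
```

Returning `None` for the first result leaves it to the caller to decide how to proceed when the server result is unavailable.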

When the first speech recognition result produced by the first engine is received from the server 50, the speech recognition processing unit 160 analyzes the recipient name and the text message content from the received first speech recognition result.
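
A simple way to picture this analysis step is a pattern match on the recognized transcript. The sketch below is only an assumption about how such parsing could work: the source gives the gate command forms "send msg <Name>" and "text msg <Name>", while the regular expression here additionally accepts "message" and an optional "to", and assumes a single-word recipient name.

```python
import re

# Hypothetical gate pattern; only "send msg <Name>" and "text msg <Name>"
# come from the source, the other accepted variants are assumptions.
GATE = re.compile(
    r"^\s*(?:send|text)\s+(?:msg|message)(?:\s+to)?\s+(\S+)\s*(.*)$",
    re.IGNORECASE | re.DOTALL,
)


def parse_first_result(transcript):
    """Split a transcript into (recipient_name, content).

    Returns (None, None) when no gate command is found, and a content of
    None when nothing follows the recipient name.
    """
    match = GATE.match(transcript)
    if match is None:
        return None, None
    name = match.group(1)
    content = match.group(2).strip()
    return name, content or None
```

A content of `None` corresponds to the case in which the text message content cannot be identified from the first speech recognition result.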

또한, 음성인식 처리부(160)는 차량 단말(10)로부터 제2 엔진에 의한 제2 음성인식 결과가 수신되면, 수신된 제2 음성인식 결과로부터 수신자 이름을 확인한다.Further, when the second voice recognition result by the second engine is received from the vehicle terminal 10, the voice recognition processing unit 160 confirms the receiver name from the received second voice recognition result.

이때, 보정부(170)는 제2 음성인식 결과 또는 폰북 데이터를 이용하여 제1 음성인식 결과로부터 확인된 수신자 이름을 보정할 수 있다.At this time, the corrector 170 may correct the receiver name identified from the first speech recognition result using the second speech recognition result or the phone book data.

일 예로, 보정부(170)는 제1 음성인식 결과로부터 확인된 수신자 이름이 제2 음성인식 결과로부터 확인된 수신자 이름과 상이한 경우, 제2 음성인식 결과로부터 확인된 수신자 이름을 이용하여 제1 음성인식 결과로부터 확인된 수신자 이름을 보정할 수 있다.For example, when the recipient name identified from the first speech recognition result is different from the recipient name confirmed from the second speech recognition result, the correction unit 170 may use the recipient name identified from the second speech recognition result to generate the first speech The recipient name confirmed from the recognition result can be corrected.

다른 예로, 보정부(170)는 제1 음성인식 결과로부터 확인된 수신자 이름이 제2 음성인식 결과로부터 확인된 수신자 이름과 상이한 경우, 제2 음성인식 결과에 대한 신뢰도를 확인한다. 이때, 보정부(170)는 제2 음성인식 결과에 대해 확인된 신뢰도가 기준치를 초과하는 경우에 제2 음성인식 결과로부터 확인된 수신자 이름을 이용하여 제1 음성인식 결과로부터 확인된 수신자 이름을 보정할 수 있다. As another example, when the recipient name identified from the first speech recognition result is different from the recipient name confirmed from the second speech recognition result, the correction unit 170 confirms the reliability of the second speech recognition result. At this time, when the confirmed reliability of the second speech recognition result exceeds the reference value, the corrector 170 corrects the receiver name confirmed from the first speech recognition result using the receiver name confirmed from the second speech recognition result can do.

Conversely, when the reliability of the second speech recognition result is less than or equal to the reference value, the correction unit 170 may instead correct the recipient name identified from the first speech recognition result by matching it against the names in the phonebook data.
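The correction rule described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `RecognitionResult` type, the 0.0 to 1.0 reliability scale, the default threshold of 0.7, and the use of `difflib.get_close_matches` as the phonebook matching step are all assumptions made for the example; the specification does not say how reliability is scored or how names are matched.

```python
from dataclasses import dataclass
from difflib import get_close_matches
from typing import Optional, Sequence


@dataclass
class RecognitionResult:
    """Hypothetical container for one engine's output."""
    recipient: Optional[str]
    message: Optional[str] = None
    confidence: float = 0.0  # assumed reliability score in [0.0, 1.0]


def correct_recipient(first: RecognitionResult,
                      second: RecognitionResult,
                      phonebook: Sequence[str],
                      threshold: float = 0.7) -> Optional[str]:
    """Return the recipient name after correction.

    first  -- result of the server engine (first engine)
    second -- result of the vehicle terminal engine (second engine)
    """
    name = first.recipient
    # No correction is needed when the two engines agree.
    if second.recipient is None or second.recipient == name:
        return name
    # Reliability above the reference value: trust the second engine.
    if second.confidence > threshold:
        return second.recipient
    # Otherwise match the first engine's name against the phonebook names
    # (approximate string matching is an assumption of this sketch).
    matches = get_close_matches(name or "", phonebook, n=1)
    return matches[0] if matches else name
```

With a server result of 'Jane' and a terminal result of 'James', the function returns 'James' when the terminal's reliability exceeds the threshold, and otherwise falls back to the closest phonebook entry to 'Jane'.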

The speech recognition processing unit 160 can thus compose the final text message from the recipient name corrected by the correction unit 170 and the text message content identified from the first speech recognition result, and output it on the display screen.

An embodiment in which the speech recognition processing unit 160 transmits the text message finally composed by the above operation is described with reference to FIG. 4.

Referring to FIG. 4, (a) shows the speech uttered by the user. When the utterance "Send Message to James I'm on the way. See you soon" is input as shown in (a), the speech recognition processing unit 160 transmits the corresponding voice data to the vehicle terminal 10 and the server 50.

The first engine of the server 50 performs speech recognition on the user's voice data and, as shown in (b), may recognize the recipient name as 'Jane' (reference numeral 411) and the text message content as 'I'm on the way. See you soon' (reference numeral 415). The server 50 can then transmit the first speech recognition result for the user's voice data to the speech recognition processing apparatus 100.

Meanwhile, the second engine of the vehicle terminal 10 may recognize the recipient name as 'James', as indicated by reference numeral 421 in (c), by text-matching the text recognized from the user's voice data against the phonebook data registered in the phonebook DB 11. The vehicle terminal 10 can then transmit the second speech recognition result for the user's voice data to the speech recognition processing apparatus 100.

Because the second engine recognizes the recipient name through text matching with the phonebook data registered in the phonebook DB 11, the recipient name it recognizes may be more accurate than the one recognized by the first engine. Accordingly, the speech recognition processing unit 160 composes the final text message from the text message content 'I'm on the way. See you soon' identified from the first speech recognition result and the recipient name 'James' identified from the second speech recognition result.

As shown in (d), the speech recognition processing unit 160 then outputs, based on the finally composed text message, a prompt such as ["I'm on the way. See you soon" Would you like to send this message to James?] on the display screen so that the user can confirm the text message content and the recipient name.

Depending on the user's response to the final text message shown on the display screen, the speech recognition processing unit 160 may transmit the finally composed text message to the recipient's terminal through the communication unit 130, or may pass it to an in-vehicle control unit that handles text message transmission.

Meanwhile, while the first engine of the server 50 is performing speech recognition, a degraded communication network may cause the first engine to recognize only part of the speech or may make speech recognition impossible altogether. Likewise, while the server 50 is transmitting the first speech recognition result to the speech recognition processing apparatus 100, a degraded communication network between the speech recognition processing apparatus 100 and the server 50 may prevent the apparatus from receiving the first speech recognition result.

In such a case, where there is no first speech recognition result and only the second speech recognition result is available, the speech recognition processing unit 160 identifies the recipient name from the second speech recognition result and may ask the user to utter the text message content so that it can be determined. The speech recognition processing unit 160 may output the message requesting this utterance on the display screen.

When the user utters the text message content again, the speech recognition processing unit 160 may perform speech recognition processing again using the re-uttered speech. The speech recognition processing unit 160 may transmit the re-uttered speech to the server 50 and receive a new speech recognition result for the text message content or, depending on the state of the communication network, may transmit the re-uttered speech to the vehicle terminal 10 and receive the speech recognition result for the text message content from it.

The speech recognition processing unit 160 can thus compose the final text message from the recipient name and the text message content identified from the newly received speech recognition result, and output it on the display screen.
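The fallback path described above, in which the message content is missing and the user is asked to say it again, can be sketched as follows. The function names and the two callbacks (`ask_user`, which shows a prompt and returns the re-uttered audio, and `recognize_message`, which stands in for whichever engine is reachable) are hypothetical; only the prompt wording is taken from the example screens in the specification.

```python
from typing import Callable, Optional, Tuple

# Prompt text as shown in the specification's example screen.
REUTTER_PROMPT = "Network is not available, please say the message you want to send."


def compose_final_message(second_recipient: str,
                          first_message: Optional[str],
                          ask_user: Callable[[str], bytes],
                          recognize_message: Callable[[bytes], str]
                          ) -> Tuple[str, str]:
    """Compose (recipient, message), asking for a re-utterance if needed.

    second_recipient -- name identified from the second (terminal) result
    first_message    -- message content from the first (server) result,
                        or None when that result was not received
    """
    message = first_message
    if not message:
        # No usable first result: prompt the user and recognize again.
        audio = ask_user(REUTTER_PROMPT)
        message = recognize_message(audio)
    return second_recipient, message


def confirmation_prompt(recipient: str, message: str) -> str:
    """Build the confirmation text shown before the message is sent."""
    return f'"{message}" Would you like to send this message to {recipient}?'
```

Passing the recognizer in as a callback reflects the statement that the re-uttered speech may go either to the server 50 or to the vehicle terminal 10, depending on the state of the network.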

An embodiment in which the speech recognition processing unit 160 transmits the text message finally composed by the above operation is described with reference to FIG. 5.

Referring to FIG. 5, (a) shows the speech uttered by the user. When the utterance "Send Message to James I'm on the way. See you soon" is input as shown in (a), the speech recognition processing unit 160 transmits the corresponding voice data to the vehicle terminal 10 and the server 50. Because of the degraded state of the communication network of the server 50, however, the recipient name and text message content indicated by reference numeral 511 may not be recognized by the first engine of the server 50.

Meanwhile, the second engine of the vehicle terminal 10 may recognize the recipient name as 'James', as indicated by reference numeral 521 in (b), by text-matching the text recognized from the user's voice data against the phonebook data registered in the phonebook DB 11. The vehicle terminal 10 can then transmit the second speech recognition result for the user's voice data to the speech recognition processing apparatus 100.

Since the speech recognition processing unit 160 has not received the first speech recognition result, it may output, as shown in (c), the message [Network is not available, please say the message you want to send.] on the display screen to request that the text message content be uttered again.

Then, as shown in (d), when the user utters 'I'm on the way. See you soon' again, the speech recognition processing unit 160 composes the final text message from the recipient name 'James' identified from the second speech recognition result and the speech recognition result for the text message content re-uttered by the user.

As shown in (e), the speech recognition processing unit 160 then outputs, based on the finally composed text message, a prompt such as ["I'm on the way. See you soon" Would you like to send this message to James?] on the display screen so that the user can confirm the text message content and the recipient name.

The speech recognition processing apparatus 100 according to this embodiment, operating as described above, may be implemented as an independent hardware device, or may run as at least one processor included in another hardware device such as a microprocessor or a general-purpose computer system.

The operation flow of the speech recognition processing apparatus according to the present invention, configured as described above, is explained in more detail below.

FIG. 6 is a flowchart showing the operation of constructing a speech recognition tree according to the present invention.

Referring to FIG. 6, the speech recognition processing apparatus 100 sets gate commands related to text messages (S10) and converts them into pronunciation strings for the speech recognition engine (S20). The speech recognition processing apparatus 100 also loads the names in the phonebook data (S30) and converts the loaded names into pronunciation strings for the speech recognition engine (S40).

Thereafter, the speech recognition processing apparatus 100 constructs a speech recognition tree for the text-message gate commands and the phonebook from the pronunciation strings converted in steps S20 and S40 (S50), and may store it (S60).
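Steps S10 to S50 can be pictured with a small prefix tree. The specification does not define the pronunciation-string format or the tree layout, so everything here is an assumption for illustration: `to_pronunciation` is a placeholder that merely lowercases and splits into letters (a real engine would use its own grapheme-to-phoneme conversion), and the tree is a nested dictionary whose `"$end"` key marks complete entries.

```python
from typing import Dict, Iterable, List, Tuple

END = "$end"  # marker key; cannot collide with single-letter units


def to_pronunciation(text: str) -> List[str]:
    """Placeholder for S20/S40: turn text into a sequence of units."""
    return [ch for ch in text.lower() if ch.isalpha()]


def build_recognition_tree(gate_commands: Iterable[str],
                           phonebook_names: Iterable[str]) -> Dict:
    """S50: build one tree holding both gate commands and names."""
    tree: Dict = {}
    sources = (("command", gate_commands), ("name", phonebook_names))
    for kind, entries in sources:
        for entry in entries:
            node = tree
            for unit in to_pronunciation(entry):
                node = node.setdefault(unit, {})
            # Record which original entry ends at this node.
            node.setdefault(END, []).append((kind, entry))
    return tree


def lookup(tree: Dict, text: str) -> List[Tuple[str, str]]:
    """Return the entries whose pronunciation exactly matches text."""
    node = tree
    for unit in to_pronunciation(text):
        if unit not in node:
            return []
        node = node[unit]
    return node.get(END, [])
```

For example, after `tree = build_recognition_tree(["Send Message to"], ["James", "Jane"])`, `lookup(tree, "james")` finds the phonebook entry 'James', while a partial input such as "Jam" matches nothing. Storing the tree (S60) is omitted from the sketch.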

FIG. 7 is a flowchart showing the operation of the speech recognition processing method according to the first embodiment of the present invention.

Referring to FIG. 7, when the text message transmission function is executed (S100) and speech uttered by the user is input (S110), the speech recognition processing apparatus 100 transmits the input voice data to the speech recognition engines in the vehicle terminal 10 and the server 50, respectively (S120).

When the first speech recognition result produced by the first engine is received from the server 50 (S130), the speech recognition processing apparatus 100 analyzes the recipient name and the text message content from the received first speech recognition result (S140).

In addition, when the second speech recognition result produced by the second engine is received from the vehicle terminal 10 (S150), the speech recognition processing apparatus 100 identifies the recipient name from the received second speech recognition result (S160).

The speech recognition processing apparatus 100 then corrects the recipient name identified from the first speech recognition result by using the second speech recognition result (S170).

Thereafter, the speech recognition processing apparatus 100 composes the final text message from the recipient name corrected in step S170 and the text message content identified in step S140, and outputs it on the display screen so that the final text message can be confirmed (S180). The speech recognition processing apparatus 100 may then transmit the text message confirmed in step S180 to the recipient's terminal, or pass it to an in-vehicle control unit that handles text message transmission (S190).
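The dispatch step, in which the same voice data goes to both engines (S120) and the two results are collected (S130, S150), can be sketched with a thread pool. The specification does not state that the requests run concurrently or how long the apparatus waits for the server; the thread pool, the `server_timeout` parameter, and the convention of reporting a failed server call as `None` are assumptions of this example, and the two engine callables are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Optional, Tuple


def recognize_on_both(voice_data: bytes,
                      server_engine: Callable[[bytes], Any],
                      terminal_engine: Callable[[bytes], Any],
                      server_timeout: float = 3.0
                      ) -> Tuple[Optional[Any], Any]:
    """Send voice_data to both engines and return (first, second).

    first  -- server (first engine) result, or None if it failed
              or did not arrive within server_timeout seconds
    second -- vehicle terminal (second engine) result
    """
    pool = ThreadPoolExecutor(max_workers=2)
    try:
        first_future = pool.submit(server_engine, voice_data)
        second_future = pool.submit(terminal_engine, voice_data)
        # The terminal engine is local, so wait for it unconditionally.
        second = second_future.result()
        try:
            first = first_future.result(timeout=server_timeout)
        except Exception:
            # Timeout or network error: treat as "no first result".
            first = None
    finally:
        # Do not block on a server call that is still in flight.
        pool.shutdown(wait=False)
    return first, second
```

Returning `None` for the first result lets the caller decide between the correction path of this embodiment and the re-utterance path used when only the second result is available.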

FIG. 8 is a flowchart showing the operation of the speech recognition processing method according to the second embodiment of the present invention.

Referring to FIG. 8, when the text message transmission function is executed (S200) and speech uttered by the user is input (S210), the speech recognition processing apparatus 100 transmits the input voice data to the speech recognition engines in the vehicle terminal 10 and the server 50, respectively (S220).

When the first speech recognition result produced by the first engine is received from the server 50 (S230), the speech recognition processing apparatus 100 analyzes the recipient name and the text message content from the received first speech recognition result (S240).

In addition, when the second speech recognition result produced by the second engine is received from the vehicle terminal 10 (S250), the speech recognition processing apparatus 100 identifies the recipient name from the received second speech recognition result (S260).

When the recipient name identified from the first speech recognition result differs from the recipient name identified from the second speech recognition result, the speech recognition processing apparatus 100 checks the reliability of the second speech recognition result. If that reliability exceeds the reference value (S270), the speech recognition processing apparatus 100 corrects the recipient name identified from the first speech recognition result by using the second speech recognition result (S280).

If, on the other hand, the reliability of the second speech recognition result is less than or equal to the reference value (S270), the speech recognition processing apparatus 100 retrieves the phonebook data (S290), matches the recipient name identified from the first speech recognition result against the names in the phonebook data (S300), and corrects the recipient name identified from the first speech recognition result accordingly (S310).

Thereafter, the speech recognition processing apparatus 100 composes the final text message from the recipient name corrected in step S280 or S310 and the text message content identified in step S240, and outputs it on the display screen so that the final text message can be confirmed (S320). The speech recognition processing apparatus 100 may then transmit the text message confirmed in step S320 to the recipient's terminal, or pass it to an in-vehicle control unit that handles text message transmission (S330).

FIG. 9 is a flowchart showing the operation of the speech recognition processing method according to the third embodiment of the present invention.

Referring to FIG. 9, when the text message transmission function is executed (S400) and speech uttered by the user is input (S410), the speech recognition processing apparatus 100 transmits the input voice data to the speech recognition engines in the vehicle terminal 10 and the server 50, respectively (S420).

If the first speech recognition result produced by the first engine is not received from the server 50, for example because of a degraded communication network (S430), and only the second speech recognition result produced by the second engine is received from the vehicle terminal 10 (S440), the speech recognition processing apparatus 100 identifies the recipient name from the received second speech recognition result (S450).

Since the first speech recognition result has not been received from the server 50, the speech recognition processing apparatus 100 asks the user to utter the text message content again (S460) and identifies the text message content from the speech recognition result for the re-uttered speech (S470).

Thereafter, the speech recognition processing apparatus 100 composes the final text message from the recipient name identified in step S450 and the text message content identified in step S470, and outputs it on the display screen so that the final text message can be confirmed (S480). The speech recognition processing apparatus 100 may then transmit the text message confirmed in step S480 to the recipient's terminal, or pass it to an in-vehicle control unit that handles text message transmission (S490).

FIG. 10 is a diagram showing a computing system in which a method according to an embodiment of the present invention is executed.

Referring to FIG. 10, the computing system 1000 may include at least one processor 1100, a memory 1300, a user interface input device 1400, a user interface output device 1500, a storage 1600, and a network interface 1700, which are connected through a bus 1200.

The processor 1100 may be a central processing unit (CPU) or a semiconductor device that executes instructions stored in the memory 1300 and/or the storage 1600. The memory 1300 and the storage 1600 may include various types of volatile or non-volatile storage media. For example, the memory 1300 may include a ROM (Read Only Memory) and a RAM (Random Access Memory).

Accordingly, the steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by the processor 1100, or in a combination of the two. The software module may reside in a storage medium (that is, the memory 1300 and/or the storage 1600) such as a RAM, a flash memory, a ROM, an EPROM, an EEPROM, a register, a hard disk, a removable disk, or a CD-ROM. An exemplary storage medium is coupled to the processor 1100, which can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor 1100. The processor and the storage medium may reside within an application-specific integrated circuit (ASIC). The ASIC may reside within a user terminal. Alternatively, the processor and the storage medium may reside in a user terminal as discrete components.

The foregoing description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the present invention pertains will be able to make various modifications and variations without departing from its essential characteristics.

Accordingly, the embodiments disclosed in the present invention are intended to explain, not to limit, the technical idea of the present invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the present invention should be construed according to the claims below, and all technical ideas within their range of equivalents should be construed as falling within the scope of rights of the present invention.

10: vehicle terminal 11: phonebook DB
15: speech recognition engine 50: server
55: speech recognition engine 100: speech recognition processing apparatus
110: control unit 120: interface unit
130: communication unit 140: storage unit
150: speech recognition tree generation unit 160: speech recognition processing unit
170: correction unit

Claims

A speech recognition processing apparatus comprising:
a speech recognition processing unit which, when a text message transmission function is executed, transmits voice data uttered by a user to each of a vehicle terminal and a server to request speech recognition processing, and identifies a recipient name and text message content from speech recognition results of the speech recognition engines in the vehicle terminal and the server to compose a text message; and
a correction unit which corrects the recipient name identified from a first speech recognition result of the speech recognition engine in the server, based on a second speech recognition result of the speech recognition engine in the vehicle terminal.

The apparatus according to claim 1,
wherein the speech recognition processing unit
identifies the recipient name and the text message content from the first speech recognition result, and identifies the recipient name from the second speech recognition result.

The apparatus according to claim 1,
wherein the second speech recognition result
is a result recognized by text-matching the text recognized from the voice data against phonebook data stored in a phonebook DB in the vehicle terminal.

The apparatus according to claim 1,
wherein the correction unit,
when the recipient names identified from the first speech recognition result and the second speech recognition result differ from each other, corrects the recipient name identified from the first speech recognition result according to the reliability of the second speech recognition result.

The apparatus according to claim 4,
wherein the correction unit
corrects the recipient name identified from the first speech recognition result of the speech recognition engine in the server, based on the second speech recognition result, when the reliability of the second speech recognition result exceeds a reference value.

The apparatus according to claim 4,
wherein the correction unit,
when the reliability of the second speech recognition result is less than or equal to a reference value, corrects the recipient name identified from the first speech recognition result by matching it against the names in the phonebook data.

The apparatus according to claim 1,
wherein the speech recognition processing unit
composes a final text message based on the corrected recipient name and the text message content identified from the first speech recognition result.

The apparatus according to claim 1,
wherein the speech recognition processing unit,
when the first speech recognition result is not received owing to the network communication state of the server, outputs a message requesting the user to re-utter the text message content.

The apparatus according to claim 1,
wherein the speech recognition processing unit,
when the text message content cannot be identified from the first speech recognition result, outputs a message requesting the user to re-utter the text message content.

The apparatus according to claim 8 or 9,
wherein the speech recognition processing unit
composes a final text message based on a speech recognition result for the speech re-uttered by the user and the recipient name identified from the second speech recognition result.

The apparatus according to claim 1,
wherein the speech recognition processing unit
outputs the composed text message on a display screen, and transmits the composed text message according to the user's response.

The apparatus according to claim 1,
further comprising a speech recognition tree generation unit which converts gate commands related to text messages and the names in phonebook data into pronunciation strings for the speech recognition engine, and constructs a speech recognition tree based on the converted pronunciation strings.

A speech recognition processing method comprising the steps of:
when a text message transmission function is executed, transmitting voice data uttered by a user to each of a vehicle terminal and a server to request speech recognition processing;
identifying a recipient name and text message content based on a first speech recognition result of the speech recognition engine in the server and a second speech recognition result of the speech recognition engine in the vehicle terminal;
correcting the recipient name identified from the first speech recognition result of the speech recognition engine in the server, based on the second speech recognition result; and
composing a final text message based on the corrected recipient name and the text message content identified from the first speech recognition result.

14. The method of claim 13,
wherein the step of identifying the recipient name and the text message content comprises
identifying the recipient name and the text message content from the first speech recognition result, and identifying the recipient name from the second speech recognition result.

14. The method of claim 13,
further comprising the step of checking the reliability of the second speech recognition result when the recipient names identified from the first speech recognition result and the second speech recognition result differ from each other.

16. The method of claim 15,
wherein the step of correcting comprises:
when the reliability of the second speech recognition result exceeds a reference value, correcting the recipient name identified from the first speech recognition result of the speech recognition engine in the server based on the second speech recognition result; and otherwise correcting the recipient name identified from the first speech recognition result by matching it against the names in the phonebook data.

14. The method of claim 13,
further comprising the step of outputting a message requesting the user to re-utter the text message content when the first speech recognition result is not received owing to the network communication state of the server, or when the text message content cannot be identified from the first speech recognition result.

18. The method of claim 17,
further comprising the step of composing a final text message based on a speech recognition result for the speech re-uttered by the user and the recipient name identified from the second speech recognition result.

14. The method of claim 13,
further comprising the steps of: outputting the composed text message on a display screen; and
transmitting the composed text message according to the user's response to the output text message.

14. The method of claim 13,
further comprising the steps of: converting gate commands related to text messages and the names in phonebook data into pronunciation strings for the speech recognition engine; and
constructing a speech recognition tree based on the converted pronunciation strings.

A vehicle system comprising:
a vehicle terminal which has a speech recognition engine and a phonebook DB storing phonebook data, and performs speech recognition on input voice data through the speech recognition engine and text matching against the phonebook data;
a server which has a speech recognition engine and performs speech recognition on the input voice data; and
a speech recognition processing apparatus which identifies a recipient name and text message content based on a first speech recognition result received from the server for the voice data, identifies a recipient name based on a second speech recognition result received from the vehicle terminal, and composes a text message by correcting the recipient name identified from the first speech recognition result based on the second speech recognition result.