KR101579537B1

KR101579537B1 - Vehicle and method of controlling voice recognition of vehicle

Info

Publication number: KR101579537B1
Application number: KR1020140139713A
Authority: KR
Inventors: 허동필; 조성동; 이윤재; 임규형
Original assignee: 현대자동차주식회사
Priority date: 2014-10-16
Filing date: 2014-10-16
Publication date: 2015-12-22

Abstract

Disclosed are a vehicle and a method of controlling voice recognition of a vehicle. An object of the present invention is, when a driver produces a voice signal of a command for voice recognition control, to automatically distinguish a reserved word (command) for voice recognition control of an AVN from a reserved word (command) for voice recognition control of a mobile terminal, and to transmit it to the corresponding voice recognition processing unit. To this end, a vehicle that includes a voice recognition based multimedia device, and in which a voice recognition based external device is connected to the multimedia device, comprises: a voice recognition processing unit of the multimedia device that receives a voice signal and converts it into corresponding text data; and a relay unit that transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed when the text data is a reserved word for voice recognition control of the multimedia device, and transmits the text data to the external device so that voice recognition control of the external device is performed when the text data is not such a reserved word.

Description

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a vehicle, and more particularly to a vehicle based on voice recognition control.

Voice recognition is a series of processes that extracts phonemes, that is, linguistic information, from the acoustic information contained in speech so that a machine can recognize and respond to it. Conversing by voice is regarded as the most natural and convenient of the many media through which humans and machines exchange information, but conversing with a machine by voice carries the constraint that human speech must first be converted into a code the machine can process. Voice recognition is this conversion process.

Vehicles, too, are provided with user interfaces based on voice recognition control for the convenience of the user. In particular, voice recognition based control is applied to the audio/video/navigation device (hereinafter AVN), which is one type of multimedia device, so that the functions of the AVN can be used without pressing a button or touching a touchscreen directly.

In particular, some AVNs are synchronized with a mobile terminal through a link, so that functions performed on the mobile terminal can be used as if they were performed on the AVN. For example, by reproducing the navigation function running on the mobile terminal on the display of the AVN, the driver can use the navigation function of the mobile terminal through the AVN.

Such synchronization of the AVN and the mobile terminal makes it possible to use the voice recognition control function of the mobile terminal through the AVN of the vehicle. When the driver produces a voice signal through the microphone of the vehicle while the AVN and the mobile terminal are synchronized through the link, the voice recognition control function of the mobile terminal can be used as well as that of the AVN.

The problem that arises here is that, when the reserved word (command) system for voice recognition control of the AVN differs from the reserved word (command) system for voice recognition control of the mobile terminal, the driver who produces the voice signal of a command must deliberately indicate whether it is a reserved word (command) for voice recognition control of the AVN or a reserved word (command) for voice recognition control of the mobile terminal. For example, the driver presses the voice recognition button once 'briefly' before uttering a reserved word (command) for voice recognition control of the AVN, and presses the voice recognition button once 'long' before uttering a reserved word (command) for voice recognition control of the mobile terminal.
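The button convention described above can be pictured with a minimal sketch. The names `Target`, `target_from_press`, and the one-second threshold are assumptions made for illustration; the source only distinguishes a 'short' press from a 'long' press and gives no duration.

```python
from enum import Enum


class Target(Enum):
    AVN = "avn"
    MOBILE = "mobile"


# Hypothetical threshold: the source says only 'short' versus 'long'.
LONG_PRESS_SECONDS = 1.0


def target_from_press(duration_s: float) -> Target:
    """Prior-art style selection: the press length, not the utterance,
    decides which recognizer receives the command."""
    if duration_s >= LONG_PRESS_SECONDS:
        return Target.MOBILE
    return Target.AVN
```

In this scheme the routing decision rests entirely on the driver remembering and performing the correct press.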

Because the driver must deliberately indicate, when producing the voice signal of a command for voice recognition control, whether it is a reserved word (command) for voice recognition control of the AVN or of the mobile terminal, the driver may be unable to concentrate on driving. Moreover, if the driver does not remember the button operation convention ('short' or 'long') that distinguishes the reserved words (commands) of the AVN from those of the mobile terminal, the wrong reserved word may be uttered and the voice recognition control that the driver wants may not be performed.

According to one aspect of the present invention, an object is, when a driver produces the voice signal of a command for voice recognition control, to automatically distinguish a reserved word (command) for voice recognition control of the AVN from a reserved word (command) for voice recognition control of the mobile terminal, so that it is transmitted to the corresponding voice recognition processing unit.

A vehicle according to the present invention for the above object is a vehicle that includes a voice recognition based multimedia device and in which a voice recognition based external device is connected to the multimedia device, the vehicle comprising: a voice recognition processing unit of the multimedia device that receives a voice signal and converts it into text data corresponding to the voice signal; and relay means that, when the text data is a reserved word for voice recognition control of the multimedia device, transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed, and that, when the text data is not a reserved word for voice recognition control of the multimedia device, transmits the text data to the external device so that voice recognition control of the external device is performed.

In the above vehicle, whether the text data is a reserved word for voice recognition control of the multimedia device is determined through pattern matching of the text data.

In the above vehicle, whether the text data is a reserved word for voice recognition control of the multimedia device is determined through intent analysis of the text data.

In the above vehicle, when the text data is an imperative sentence, it is determined to be a reserved word for voice recognition control of the multimedia device.

In the above vehicle, when the text data is an interrogative sentence, it is determined not to be a reserved word for voice recognition control of the multimedia device.

In the above vehicle, the relay means is middleware for the multimedia device.

In the above vehicle, the middleware includes a gateway, and the gateway determines whether the text data is a reserved word for voice recognition control of the multimedia device.

In the above vehicle, when the text data is not a reserved word for voice recognition control of the multimedia device, the text data is transmitted through the external device to a voice recognition control unit of the external device.
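The two ways of deciding whether text data is a reserved word (pattern matching, and intent analysis by sentence type) can be sketched roughly as follows. This is an illustration under stated assumptions, not the method of the source: the command patterns, the English question-word list, and the rule that any non-question counts as an imperative are all hypothetical, since the source lists no actual reserved words and does not describe how sentence type is detected.

```python
import re

# Hypothetical AVN command patterns; the source does not enumerate reserved words.
AVN_PATTERNS = [
    re.compile(r"(play|stop|pause)\b", re.IGNORECASE),
    re.compile(r"(navigate|route) to\b", re.IGNORECASE),
    re.compile(r"tune to\b", re.IGNORECASE),
]

# Hypothetical, English-only cue words for an interrogative sentence.
QUESTION_WORDS = {"what", "where", "when", "who", "why", "how"}


def reserved_by_pattern(text: str) -> bool:
    """Pattern matching: reserved if the text starts with a known command pattern."""
    stripped = text.strip()
    return any(p.match(stripped) for p in AVN_PATTERNS)


def is_interrogative(text: str) -> bool:
    """Crude sentence-type check based on a question mark or a leading question word."""
    lowered = text.strip().lower()
    words = lowered.split()
    return lowered.endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)


def reserved_by_intent(text: str) -> bool:
    """Intent analysis by sentence type: a question is not a reserved word.
    Assumption: everything that is not a question is treated as an imperative."""
    return not is_interrogative(text)
```

Under these assumptions, "Play the radio" is treated as an AVN reserved word by both checks, while "Where is the nearest gas station?" is not and would be passed to the external device.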

A method of controlling voice recognition of a vehicle according to the present invention for the above object is a method for a vehicle that includes a voice recognition based multimedia device and in which a voice recognition based external device is connected to the multimedia device, the method comprising: receiving a voice signal and converting it into text data corresponding to the voice signal; when the text data is a reserved word for voice recognition control of the multimedia device, transmitting the text data to the multimedia device so that voice recognition control of the multimedia device is performed; and when the text data is not a reserved word for voice recognition control of the multimedia device, transmitting the text data to the external device so that voice recognition control of the external device is performed.

In the above method, whether the text data is a reserved word for voice recognition control of the multimedia device is determined through pattern matching of the text data.

In the above method, whether the text data is a reserved word for voice recognition control of the multimedia device is determined through intent analysis of the text data.

In the above method, when the text data is an imperative sentence, it is determined to be a reserved word for voice recognition control of the multimedia device.

In the above method, when the text data is an interrogative sentence, it is determined not to be a reserved word for voice recognition control of the multimedia device.

In the above method, the relay means is middleware for the multimedia device.

In the above method, the middleware includes a gateway, and the gateway determines whether the text data is a reserved word for voice recognition control of the multimedia device.

In the above method, when the text data is not a reserved word for voice recognition control of the multimedia device, the text data is transmitted through the external device to a voice recognition control unit of the external device.

Another vehicle according to the present invention for the above object is a vehicle that includes a voice recognition based multimedia device and in which a voice recognition based external device is connected to the multimedia device, the vehicle comprising: a microphone to which a voice signal is input; a voice recognition processing unit of the multimedia device that receives the voice signal through the microphone and converts it into text data corresponding to the voice signal; and relay means that is provided between the microphone and the voice recognition processing unit of the multimedia device and that, when the text data is a reserved word for voice recognition control of the multimedia device, transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed, and, when the text data is not a reserved word for voice recognition control of the multimedia device, transmits the text data to the external device so that voice recognition control of the external device is performed.

Another method of controlling a vehicle according to the present invention for the above object concerns a vehicle that includes a voice recognition based multimedia device and in which a voice recognition based external device is connected to the multimedia device, and involves: a microphone to which a voice signal is input; a voice recognition processing unit of the multimedia device that receives the voice signal through the microphone and converts it into text data corresponding to the voice signal; and relay means that is provided on the output side of the voice recognition processing unit of the multimedia device and that, when the text data is a reserved word for voice recognition control of the multimedia device, transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed, and, when the text data is not a reserved word for voice recognition control of the multimedia device, transmits the text data to the external device so that voice recognition control of the external device is performed.

According to one aspect of the present invention, when a driver produces the voice signal of a command for voice recognition control, a reserved word (command) for voice recognition control of the AVN and a reserved word (command) for voice recognition control of the mobile terminal are automatically distinguished and transmitted to the corresponding voice recognition processing unit, which improves the convenience of the driver and allows the voice recognition control that the driver wants to be performed.

FIG. 1 is a view showing the interior of a vehicle according to an embodiment of the present invention.
FIG. 2 is a view showing the connection between the AVN of the vehicle shown in FIG. 1 and a mobile terminal.
FIG. 3 is a view showing the configuration of the AVN of a vehicle according to an embodiment of the present invention.
FIG. 4 is a view showing the configuration of the middleware shown in FIG. 3.
FIG. 5 is a view showing the configuration of the voice recognition processing unit shown in FIG. 3.
FIG. 6 is a view showing a method of controlling voice recognition of a vehicle according to an embodiment of the present invention.
FIG. 7 is a view showing the configuration of the AVN of a vehicle according to another embodiment of the present invention.
FIG. 8 is a view showing a method of controlling voice recognition of a vehicle according to another embodiment of the present invention.

FIG. 1 is a view showing the interior of a vehicle according to an embodiment of the present invention. As shown in FIG. 1, an AVN 100 and a steering wheel 102 are mounted at the front of the driver's seat. The AVN 100 integrates audio, video, and navigation functions and includes a display (see 214 in FIG. 2). The AVN 100 is based on voice recognition control. For this purpose, a voice recognition button 104 is mounted on the steering wheel 102, and a microphone 106 is mounted above the driver's seat. Speakers 116 are mounted on the left door at the driver's seat and on the right door at the front passenger seat. The AVN 100 of the vehicle according to the embodiment of the present invention is based on voice recognition control, and the voice recognition button 104, the microphone 106, the speakers 116, and the like may be used as auxiliary tools for voice recognition control of the AVN 100.

FIG. 2 is a view showing the connection between the AVN of the vehicle shown in FIG. 1 and a mobile terminal; that is, FIG. 2 is an enlarged view of the circled portion indicated by A in FIG. 1. As shown in FIG. 2, the AVN 100 is synchronized through a link with a mobile terminal 252, which is an external device, so that functions performed on the mobile terminal 252 can be used as if they were performed on the AVN 100. For example, by reproducing the navigation function running on the mobile terminal 252 on the display of the AVN 100, the driver can use the navigation function of the mobile terminal 252 through the AVN 100. Other functions of the mobile terminal 252, not only the navigation function, can also be used through the AVN 100.

Such synchronization of the AVN 100 and the mobile terminal 252 makes it possible to use the voice recognition control function of the mobile terminal 252 through the AVN 100 of the vehicle. When the driver produces a voice signal through the microphone 106 of the vehicle while the AVN 100 and the mobile terminal 252 are synchronized through the link, the voice recognition control function of the mobile terminal 252 can be used as well as that of the AVN 100.

FIG. 3 is a view showing the configuration of the AVN of a vehicle according to an embodiment of the present invention. The AVN 100 shown in FIG. 3 is based on voice recognition control. As shown in FIG. 3, the configuration of the AVN 100 can be broadly divided into elements for the voice recognition function, elements for the general input function, elements for the broadcast/communication function, elements for the navigation function, elements for the audio/video function, and elements that can be used in common by multiple functions.

The elements for the voice recognition function include the voice recognition button 104, the microphone 106, middleware 322, a voice recognition processing unit 308, and a command output interface 318. Although not a component of the AVN 100, a mobile voice recognition processing unit 324 provided on a remote server may be communicably connected to the middleware 322 and a control unit 312 through the mobile terminal 252 serving as an external device. The elements for the broadcast/communication function include an antenna 352, a tuner unit 354, a broadcast signal processing unit 356, and a communication signal processing unit 358. The elements for the navigation function include a navigation database 362 and a navigation drive unit 364. The elements for the audio/video function include an audio/video input unit 372 and an audio/video playback unit 374. The elements for the general input function include an input unit 372. The elements that can be used in common by multiple functions include a memory 310, the control unit 312, a display 314, and the speakers 116. This functional division is not limited to what is described above, and an element for any one function may also be used for another function.

The voice recognition button 104 allows the driver to execute and use the combined functions of the AVN 100, such as the audio, video, navigation, and information communication functions. For this purpose, the voice recognition button 104 supports one-key operation of the push-to-talk (PTT) type. The voice recognition button 104 may be installed on the steering wheel 102 so that the driver can operate it conveniently even while driving. The steering wheel 102 is a steering device used to change the direction of travel of the vehicle by turning its wheels left and right. Because the driver always grips the steering wheel 102 while driving, installing the voice recognition button 104 on the steering wheel 102 lets the driver operate it conveniently while driving. Besides the steering wheel 102, the voice recognition button 104 may be installed at any position in the vehicle where the driver can easily operate it while driving.

The microphone 106 receives the voice signal uttered by the driver while the voice recognition control function is running, and converts the received voice signal into an electrical signal. The microphone 106 may be a microphone provided specifically for voice recognition control, or it may share the hands-free microphone of the vehicle. The microphone 106 may also be the microphone of a mobile terminal carried by the driver. When the microphone of the mobile terminal is used, the mobile terminal and the AVN 100 must be connected to each other through short-range communication such as Bluetooth.

The voice recognition processing unit 308 of the AVN 100 receives, via the middleware 322, the electrical signal converted by the microphone 106, performs voice recognition on that signal, and extracts text data as voice command information as the result of the voice recognition. The text data extracted by the voice recognition processing unit 308 is passed to the middleware 322 before it is passed to the control unit 312.

The middleware 322 is the relay means. It determines whether the text data received from the voice recognition processing unit 308 of the AVN 100 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252. When the text data is a reserved word for voice recognition control of the AVN 100, the middleware 322 transmits the text data to the control unit 312 of the AVN 100 so that voice recognition control of the AVN 100 is performed. When the text data is not a reserved word for voice recognition control of the AVN 100, the middleware 322 transmits the text data to the mobile terminal 252 so that voice recognition control of the mobile terminal 252 is performed. In other words, the middleware 322 automatically determines whether the voice signal produced by the driver's utterance is a reserved word for voice recognition control of the AVN 100 or of the mobile terminal 252, and relays it accordingly. In this process, no deliberate intervention by the driver is needed to distinguish reserved words for voice recognition control of the AVN 100 from reserved words for voice recognition control of the mobile terminal 252.
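The relay behavior described above can be sketched as a small class. This is a minimal sketch under assumptions: the class and method names are invented for illustration, the recognizer and the reserved-word test are passed in as plain callables, and two lists stand in for the control unit 312 and the mobile terminal 252.

```python
from typing import Callable, List


class Middleware:
    """Illustrative stand-in for the middleware 322 (names are hypothetical)."""

    def __init__(
        self,
        recognize: Callable[[bytes], str],
        is_avn_reserved: Callable[[str], bool],
    ) -> None:
        self.recognize = recognize              # stand-in for processing unit 308
        self.is_avn_reserved = is_avn_reserved  # reserved-word decision
        self.to_control_unit: List[str] = []    # stand-in for control unit 312
        self.to_mobile_terminal: List[str] = [] # stand-in for mobile terminal 252

    def on_voice_signal(self, signal: bytes) -> str:
        # The signal passes through the middleware to the recognizer,
        # and the resulting text comes back to the middleware for routing.
        text = self.recognize(signal)
        if self.is_avn_reserved(text):
            self.to_control_unit.append(text)
            return "avn"
        self.to_mobile_terminal.append(text)
        return "mobile"
```

The caller supplies only the voice signal; nothing in the interface asks which device the utterance is meant for, which mirrors the point that no deliberate intervention by the driver is needed.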

The command output interface 318 transfers, from the control unit 312 to the device to be controlled, the control command signal corresponding to the voice command information extracted as the result of voice recognition.

The antenna 352 is a device that receives radio waves from the air or sends radio waves into the air, for the purpose of receiving broadcast signals or transmitting and receiving communication signals. The antenna 352 is communicably connected to the tuner unit 354, so the radio waves received by the antenna 352 are passed to the tuner unit 354. The antenna 352 may consist of several types of antenna to handle several different types of broadcast/communication signal.

The tuner unit 354 receives the radio waves picked up by the antenna 352 and converts them into an intermediate frequency signal or the like. The tuner unit 354 also converts a data signal to be transmitted into a form that can be propagated through the air and sends it into the air through the antenna 352. That is, the tuner unit 354 performs tasks such as extracting only the signal of a specific band or combining a data signal with a carrier signal. The tuner unit 354 receives broadcast signals and transmits and receives communication signals. The broadcast signals may include radio broadcast signals and DMB (Digital Multimedia Broadcasting) signals. The communication signals may include satellite communication signals exchanged with Global Positioning System satellites (hereinafter GPS satellites), and may also include communication signals for telematics. Which signal the tuner unit 354 receives and processes is determined by a control signal sent from the control unit 312 to the tuner unit 354. For example, when the control unit 312 issues a control signal to the tuner unit 354 to receive the radio broadcast signal of a specific channel, the tuner unit 354 receives the radio broadcast signal of that channel in response to the control signal. If the control unit 312 passes a control signal for transmitting a telematics signal, together with the transmission data, to the tuner unit 354, the tuner unit 354 converts the transmission data into a form that can be sent into the air in response to the control signal and sends the converted signal into the air through the antenna 352. The tuner unit 354 also obtains the broadcast channel information contained in a broadcast signal. A broadcast signal input to the tuner unit 354 contains the name of the broadcast channel, a service ID (identification), and broadcast data. The tuner unit 354 extracts the broadcast channel name, the service ID, and the broadcast data contained in the broadcast signal and passes them to the broadcast signal processing unit 356 at the following stage and to the control unit 312.
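The way a control signal selects what the tuner unit does can be illustrated with a small dispatch sketch. Everything here is an assumption made for illustration: the action names, the `ControlSignal` and `BroadcastSignal` types, and the `b"CARRIER:"` prefix, which merely stands in for combining data with a carrier signal and does not model real modulation.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class BroadcastSignal:
    channel_name: str
    service_id: int
    data: bytes


@dataclass
class ControlSignal:
    action: str                      # hypothetical: "receive_radio" or "send_telematics"
    channel: Optional[str] = None
    payload: Optional[bytes] = None


class Tuner:
    """Illustrative stand-in for the tuner unit 354."""

    def __init__(self, on_air: Dict[str, BroadcastSignal]) -> None:
        self.on_air = on_air                 # stand-in for what the antenna receives
        self.transmitted: List[bytes] = []   # stand-in for what the antenna sends

    def handle(self, ctrl: ControlSignal) -> Optional[Tuple[str, int, bytes]]:
        if ctrl.action == "receive_radio" and ctrl.channel is not None:
            sig = self.on_air[ctrl.channel]
            # Extract channel name, service ID, and broadcast data for later stages.
            return (sig.channel_name, sig.service_id, sig.data)
        if ctrl.action == "send_telematics" and ctrl.payload is not None:
            # Placeholder for converting data into a transmittable form.
            self.transmitted.append(b"CARRIER:" + ctrl.payload)
            return None
        raise ValueError(f"unsupported control signal: {ctrl!r}")
```

A receive request returns the three fields that the description says are forwarded to the broadcast signal processing unit and the control unit, while a transmit request only records the outgoing data.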

The broadcast signal processing unit 356 separates the broadcast signal that has passed through the tuner unit 354 into a video broadcast signal and an audio broadcast signal and performs a series of signal processing operations on them. This signal processing may include analog-to-digital conversion, digital-to-analog conversion, conversion of video data into a signal capable of driving the display 314, and the like.

The communication signal processing unit 358 processes communication signals exchanged with GPS satellites and telematics communication signals. That is, the communication signal processing unit 358 converts a received communication signal into a data format suitable for delivery to the control unit 312, or receives from the control unit 312 the data to be transmitted through the tuner unit 354 and the antenna 352 and converts that data into a signal in a form that can be communicated.

The navigation database 362 contains the data needed to implement navigation. The navigation database 362 may take the form of a memory card or a DVD (Digital Versatile Disc). Navigation data provided by a mobile terminal connected through a wired or wireless link (for example, CarPlay or Android Auto) may also be used as the navigation database.

The navigation drive unit 364 composes a navigation screen on the display 314 using the data provided from the navigation database 362. For this purpose, it receives from the control unit 312 navigation setting information such as the destination, waypoints, and route type set by the driver. To implement navigation, it also receives from the control unit 312 the current position of the vehicle obtained through communication with GPS satellites.

The audio/video input unit 372 may be an optical disc drive. Alternatively, the audio/video input unit 372 may be a universal serial bus (USB) input/output device or an auxiliary input/output terminal (commonly called AUX). The audio/video input unit 372 may also be a Bluetooth device for wireless connection with a mobile terminal. The mobile terminal connected to the audio/video input unit 372 through Bluetooth may be a mobile phone or a portable digital music player.

The audio/video playback unit 374 enables the audio/video data input through the audio/video input unit 372 to be output to the speaker 116 or the display 314. For example, when the audio/video input unit 372 is an optical disc drive, the optical disc drive reads the audio/video data recorded on an optical disc (CD, DVD, BD, and so on), and the audio/video playback unit 374 converts the data retrieved by the audio/video input unit 372 into a signal capable of driving the speaker 116 or the display 314 and delivers it to the speaker 116 or the display 314 so that the audio/video is played back. Audio/video data provided from media other than an optical disc can likewise be converted, by passing through the audio/video playback unit 374, into a signal capable of driving the speaker 116 or the display 314.

The input unit 382 may be at least one button provided on the AVN 100 or a touch screen implemented on the display 314. By operating the input unit 382, the driver can select one of the multiple functions of the AVN 100 and apply various settings so that the selected function performs the expected task. The voice recognition button 104 on the steering wheel 102 described above may also be one of the buttons constituting the input unit 382.

The control unit 312 is involved in the overall operation of the AVN 100 and performs the necessary control. For example, in response to operation of the voice recognition button 104, it runs the voice recognition application stored in the memory 310 so that an initial entry screen is displayed and the related voice guidance message is output. The control unit 312 also receives voice command information from the voice recognition processing unit 308 and generates a control command corresponding to that information so that the control indicated by the voice command is carried out. The control unit 312 may also process broadcast/communication signals. If audio/video data produced by processing a broadcast/communication signal must be output to the speaker 116 or the display 314, the control unit 312 directs that data to the speaker 116 or the display 314 so that the required output takes place. When the driver selects the navigation function, the control unit 312 controls the navigation database 362, the navigation drive unit 364, the display 314, and the speaker 116 so that navigation is implemented. The control unit 312 also controls playback so that audio/video data input through the audio/video input unit 372 is played back by the audio/video playback unit 374 and delivered to the speaker 116 or the display 314. In addition, the control unit 312 converts the broadcast channel name extracted from the broadcast signal by the tuner unit 354 into text and passes it to the voice recognition processing unit 308.

The memory 310 stores the various applications executed to perform the voice recognition function, broadcast/communication function, navigation function, and audio/video function of the AVN 100, together with the screen display data, voice data, sound effect data, and the like needed to run those applications.

The display 314 outputs the video that accompanies the multiple functions of the AVN 100, such as the voice recognition function, broadcast/communication function, navigation function, and audio/video function. For example, guide screens, messages, and video material for each function are output through the display 314.

The speaker 116 outputs the audio that accompanies the multiple functions of the AVN 100, such as the voice recognition function, broadcast/communication function, navigation function, and audio/video function. For example, spoken guidance, sound effects, and audio material for each function are output through the speaker 116.

FIG. 4 is a diagram showing the configuration of the middleware shown in FIG. 3. As shown in FIG. 4, the middleware 322 for the AVN 100 according to an embodiment of the present invention includes a voice input/output unit 402 and a voice recognition gateway 404.

When the voice recognition button 104 is operated, the voice input/output unit (Voice Handler) 402 responds by activating the microphone 106 so that it can accept voice input. When the driver speaks while the microphone 106 is active and a voice signal is input through the microphone 106, the voice input/output unit 402 transmits the input voice signal to the voice recognition processing unit 308 of the AVN 100.

The voice signal delivered in this way is converted into text data by the voice recognition processing unit 308 of the AVN 100 and then passed back to the voice recognition gateway 404 of the middleware 322.

The voice recognition gateway 404 determines whether the text data received from the voice recognition processing unit 308 of the AVN 100 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252. When the text data is a reserved word for voice recognition control of the AVN 100, the middleware 322 transmits the text data to the control unit 312 of the AVN 100 so that voice recognition control of the AVN 100 is performed. When the text data is not a reserved word for voice recognition control of the AVN 100, the middleware 322 instead transmits the text data to the mobile terminal 252 so that voice recognition control of the mobile terminal 252 is performed.

The voice recognition gateway 404 determines whether the text data is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252 by referring to comparison reference information held in the voice recognition gateway 404. For example, the voice recognition gateway 404 is provided with text pattern matching logic and intent analysis logic.

The text pattern matching logic determines, by pattern matching on the text data, whether the text data input from the voice recognition processing unit 308 of the AVN 100 to the voice recognition gateway 404 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252.
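
The text pattern matching described above can be sketched as a lookup against a table of AVN reserved words. This is only a minimal illustration: the reserved words listed, the `normalize` helper, and the exact-match policy are assumptions, since the description does not specify the comparison reference information.

```python
# Hypothetical reserved-word table; the description does not list actual commands.
AVN_RESERVED_WORDS = {"radio", "navigation", "next track", "volume up"}


def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace so trivial differences are ignored."""
    return " ".join(text.lower().split())


def is_avn_reserved_word(text: str) -> bool:
    """Return True when the recognized text exactly matches an AVN reserved word."""
    return normalize(text) in AVN_RESERVED_WORDS
```

Text that does not match any entry would be treated as intended for the mobile terminal, in line with the relay rule stated for the gateway.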

The intent analysis logic distinguishes the reserved words of the AVN 100 from those of the mobile terminal 252 by referring to the characteristic features of reserved words for voice recognition control of the AVN 100 and of reserved words for voice recognition control of the mobile terminal 252. For example, it may use the signal waveform to distinguish interrogative utterances from imperative ones, and judge a short, imperative utterance to be a reserved word for voice recognition control of the AVN 100 and a long, interrogative utterance to be a reserved word for voice recognition control of the mobile terminal 252.

This is because reserved words for voice recognition control of the AVN 100 are mostly word-oriented commands, whereas reserved words for voice recognition control of the mobile terminal 252 are mostly sentence-oriented questions and answers. An interrogative utterance can be identified by the pitch at the end of the sentence rising above that of the rest of the sentence, and an ordinary imperative utterance can be identified by the pitch at the end not rising.
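
The pitch-based heuristic above might be sketched as follows. The per-frame pitch contour input, the `tail_fraction` and `rise_ratio` thresholds, the word-count limit, and the way the two cues are combined are all assumptions made for illustration; the description states only the general principle.

```python
from statistics import mean
from typing import Sequence


def is_interrogative(pitch_hz: Sequence[float],
                     tail_fraction: float = 0.2,
                     rise_ratio: float = 1.1) -> bool:
    """Treat the utterance as a question when its final part is pitched higher than the rest."""
    n = len(pitch_hz)
    if n < 2:
        return False
    # Keep at least one frame in the tail and at least one in the body.
    tail_len = min(max(1, int(n * tail_fraction)), n - 1)
    body, tail = pitch_hz[:-tail_len], pitch_hz[-tail_len:]
    return mean(tail) > rise_ratio * mean(body)


def classify_target(text: str, pitch_hz: Sequence[float],
                    max_command_words: int = 3) -> str:
    """Route short, non-rising utterances to the AVN and everything else to the mobile terminal."""
    is_long = len(text.split()) > max_command_words
    if is_long or is_interrogative(pitch_hz):
        return "mobile"
    return "avn"
```

Combining the length and pitch cues with a logical OR is one possible design; a real system would likely weight the cues and tune the thresholds on recorded speech.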

In addition to such pattern matching and intent analysis, other criteria capable of distinguishing a reserved word for voice recognition control of the AVN 100 from a reserved word for voice recognition control of the mobile terminal 252 may also be applied.

FIG. 5 is a diagram showing the configuration of the voice recognition processing unit shown in FIG. 3. As shown in FIG. 5, the voice recognition processing unit 308 includes an analog-to-digital conversion unit 502, a voice analysis unit 504, a voice detection unit 506, a word recognition unit 508, and a word standard pattern database 510.

The voice signal output from the microphone 106 as an electrical signal is in analog form. For voice recognition, this analog electrical signal must be converted into a digital electrical signal. The analog-to-digital conversion unit 502 converts the analog electrical signal input from the microphone 106 into a digital electrical signal.

The voice analysis unit 504 analyzes the voice signal that has been converted into a digital electrical signal and extracts feature patterns from it. The voice analysis unit 504 divides the voice signal into frames of a preset size (for example, 10 ms or 30 ms) and extracts a feature pattern for each frame. The feature patterns are extracted by analyzing information such as the frequency or amplitude of the voice signal.
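
The framing step can be illustrated with a short sketch. The choice of RMS amplitude and zero-crossing rate as the per-frame features is an assumption standing in for the unspecified "frequency or amplitude" analysis, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def split_into_frames(samples: Sequence[float], sample_rate_hz: int,
                      frame_ms: int = 10) -> List[Sequence[float]]:
    """Cut the signal into consecutive, non-overlapping frames; a trailing partial frame is dropped."""
    frame_len = sample_rate_hz * frame_ms // 1000
    if frame_len < 1:
        raise ValueError("frame length must be at least one sample")
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_features(frame: Sequence[float]) -> Tuple[float, float]:
    """Return (RMS amplitude, zero-crossing rate) as a crude amplitude/frequency feature pair."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    if len(frame) < 2:
        return rms, 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return rms, crossings / (len(frame) - 1)
```

At a sample rate of 8000 Hz, a 10 ms frame holds 80 samples, so a 200-sample signal yields two full frames.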

The voice detection unit 506 detects the interval in which actual speech is present in the voice signal that has been converted into a digital electrical signal. While the vehicle is being driven, various kinds of noise arise inside the cabin. For example, engine sound, exhaust sound, and wind noise may enter the cabin and act as noise. Likewise, while music is being played using the audio function of the AVN 100, the music itself acts as noise from the standpoint of voice recognition. For the AVN 100 used inside a vehicle to provide accurate, high voice recognition rates in such a noisy environment, it is very important to detect the start and end of the interval in which speech is present in the voice signal.
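
One simple way to locate the speech interval is an energy threshold derived from the leading noise-only frames. The sketch below assumes the first few frames contain only background noise and uses a fixed multiple of their average energy; neither assumption comes from the description, which does not specify a detection method.

```python
from statistics import mean
from typing import Optional, Sequence, Tuple


def estimate_threshold(frame_energies: Sequence[float],
                       noise_frames: int = 5, factor: float = 3.0) -> float:
    """Set the threshold to a multiple of the average energy of the leading noise-only frames."""
    return factor * mean(frame_energies[:noise_frames])


def detect_speech_interval(frame_energies: Sequence[float],
                           threshold: float) -> Optional[Tuple[int, int]]:
    """Return (first, last) indices of frames above the threshold, or None if there are none."""
    active = [i for i, energy in enumerate(frame_energies) if energy > threshold]
    if not active:
        return None
    return active[0], active[-1]
```

Time-varying cabin noise such as music would in practice call for an adaptive noise estimate rather than a single fixed threshold.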

The word recognition unit 508 selects the word most similar to the voice signal input through the microphone 106. The word standard pattern database 510 stores a plurality of reserved words (words whose meaning and usage are defined in advance) together with the feature patterns of those reserved words as standard patterns. The word recognition unit 508 compares the feature pattern obtained by analyzing the voice signal with the standard patterns prepared in advance and selects the standard pattern that most closely resembles the feature pattern. The word recognition unit 508 provides the control unit 312 with information on the reserved word corresponding to the selected standard pattern as the recognition result.

In voice recognition, the accuracy with which the feature pattern of the voice signal is compared with the standard patterns prepared in advance has a large effect on the recognition rate. There are many ways of comparing a feature pattern with a standard pattern; the following two methods are representative examples.

The Dynamic Programming Matching (DPM) method, which was the mainstream approach in the 1980s, compares the voice signal with the standard pattern elastically, based on the fact that even the same word varies locally in length depending on the speaking situation and speaking speed.
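
Elastic comparison of this kind is commonly implemented as dynamic time warping. The following is a minimal sketch of that technique, offered as an illustration of dynamic programming matching in general and not as the specific method used in the AVN 100; the one-dimensional feature sequences and the example patterns are assumptions.

```python
from typing import Dict, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Dynamic-programming alignment cost between two 1-D feature sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning the first i items of a with the first j items of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch b
                                     cost[i][j - 1],      # stretch a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def closest_word(features: Sequence[float],
                 standard_patterns: Dict[str, Sequence[float]]) -> str:
    """Pick the reserved word whose standard pattern has the lowest alignment cost."""
    return min(standard_patterns,
               key=lambda word: dtw_distance(features, standard_patterns[word]))
```

Because the alignment may repeat elements, a slowly spoken word such as `[1, 1, 2, 3, 3]` still matches the pattern `[1, 2, 3]` at zero cost.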

The Hidden Markov Model (HMM) method, which has become the mainstream approach more recently, uses statistical techniques to cope not only with variation along the time axis but also with spectral variation caused by differences between individuals. An HMM prepares patterns for each basic unit of speech in advance and statistically evaluates which of the prepared patterns the feature pattern of the voice signal is closest to. Advances in the HMM method have greatly improved the recognition rate of speaker-independent recognition. Speaker-independent recognition is a technique for recognizing the speech of unspecified, that is, arbitrary, speakers: pattern information is extracted in advance from the voice signals of many unspecified speakers and built into a database, which makes voice recognition for unspecified speakers possible.
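
The statistical comparison performed with an HMM can be illustrated by the forward algorithm, which computes how likely an observation sequence is under a given model; the model with the highest likelihood is then chosen. The sketch below assumes a discrete HMM with integer-coded observations, which is a simplification chosen for illustration and is not taken from the description.

```python
from typing import Sequence


def forward_likelihood(observations: Sequence[int],
                       start_p: Sequence[float],
                       trans_p: Sequence[Sequence[float]],
                       emit_p: Sequence[Sequence[float]]) -> float:
    """Return P(observations | model) for a discrete HMM using the forward algorithm."""
    if not observations:
        return 1.0
    states = range(len(start_p))
    # alpha[s] is the probability of the observations so far, ending in state s.
    alpha = [start_p[s] * emit_p[s][observations[0]] for s in states]
    for obs in observations[1:]:
        alpha = [sum(alpha[prev] * trans_p[prev][s] for prev in states) * emit_p[s][obs]
                 for s in states]
    return sum(alpha)
```

Recognition would evaluate the same observation sequence against one model per reserved word (or per speech unit) and select the model with the largest likelihood.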

FIG. 6 is a diagram showing a voice recognition control method for a vehicle according to an embodiment of the present invention. The method shown in FIG. 6 is based on the device configuration of FIGS. 1 to 5 and is presented mainly from the viewpoint of the middleware 322.

As shown in FIG. 6, while the AVN 100 is powered on and in standby (602), when the driver operates the voice recognition button 104 to use the voice recognition control system ('Yes' at 604), an electrical signal is generated through the voice recognition button 104 and delivered to the middleware 322, and the voice input/output unit 402 of the middleware 322 responds to that signal by activating the microphone 106 so that it can accept voice input (606). The voice input/output unit 402 of the middleware 322 then receives the voice signal and transmits it to the voice recognition processing unit 308 of the AVN 100 (608). Noise removal, filtering, and the like may be applied to the voice signal while it is being transmitted through the voice input/output unit 402.

When the middleware 322 receives the voice signal input through the microphone 106 and transmits it to the voice recognition processing unit 308 of the AVN 100, the voice recognition processing unit 308 performs voice recognition processing on the transmitted voice signal and generates text data corresponding to the voice signal. The middleware 322 receives the text data from the voice recognition processing unit 308 of the AVN 100 (610).

The middleware 322 receives the text data from the voice recognition processing unit 308 of the AVN 100 in order to determine whether the text data generated by the voice recognition processing unit 308 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252. The voice recognition gateway 404 of the middleware 322 makes this determination using at least one decision logic. For example, the voice recognition gateway 404 identifies the reserved word using pattern matching of the text data (612) and intent analysis of the text data (614).

Pattern matching of the text data (612) determines, by pattern matching on the text data, whether the text data input from the voice recognition processing unit 308 of the AVN 100 to the voice recognition gateway 404 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252.

Intent analysis of the text data (614) distinguishes the reserved words of the AVN 100 from those of the mobile terminal 252 by referring to the characteristic features of reserved words for voice recognition control of the AVN 100 and of reserved words for voice recognition control of the mobile terminal 252. For example, the signal waveform may be used to distinguish interrogative utterances from imperative ones, with a short, imperative utterance judged to be a reserved word for voice recognition control of the AVN 100 and a long, interrogative utterance judged to be a reserved word for voice recognition control of the mobile terminal 252.

Once the reserved word has been classified as one for voice recognition control of the AVN 100 or one for voice recognition control of the mobile terminal 252, the voice recognition gateway 404 of the middleware 322 either causes the text data to be output to the control unit 312 of the AVN 100 or transmits the text data to the voice recognition processing unit 324 of the mobile terminal 252.

That is, when the text data is determined to be a reserved word for voice recognition control of the AVN 100 ('Yes' at 616), the voice recognition gateway 404 of the middleware 322 causes the text data to be output to the control unit 312 through the voice recognition processing unit 308 of the AVN 100 (618). The control unit 312 performs control in accordance with the text data received from the middleware 322.

Conversely, when the text data is determined not to be a reserved word for voice recognition control of the AVN 100 ('No' at 616), the voice recognition gateway 404 of the middleware 322 transmits the corresponding voice signal to the voice recognition processing unit 324 of the mobile terminal 252 so that voice recognition control of that voice signal is performed in the mobile terminal 252 (620).

In this way, the middleware 322 automatically determines whether the voice signal produced by the driver's utterance is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252 and relays it accordingly. No deliberate intervention by the driver is needed in this process to distinguish reserved words for voice recognition control of the AVN 100 from reserved words for voice recognition control of the mobile terminal 252.
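
The relay flow of steps 608 to 620 can be summarized in a short sketch. The function name and the callback parameters are hypothetical stand-ins for the voice recognition processing unit, the gateway decision logic, the control unit 312, and the mobile terminal 252. Following the description of step 620, the sketch forwards the original voice signal to the mobile terminal; forwarding the text data instead, as described for the gateway 404, would be an equally small change.

```python
from typing import Callable


def relay_utterance(voice_signal: bytes,
                    recognize: Callable[[bytes], str],
                    is_avn_reserved_word: Callable[[str], bool],
                    send_to_avn: Callable[[str], None],
                    send_to_mobile: Callable[[bytes], None]) -> str:
    """Route one utterance to the AVN or the mobile terminal and report which was chosen."""
    text = recognize(voice_signal)        # steps 608-610: speech to text
    if is_avn_reserved_word(text):        # steps 612-616: gateway decision
        send_to_avn(text)                 # step 618: AVN control unit handles it
        return "avn"
    send_to_mobile(voice_signal)          # step 620: mobile terminal handles it
    return "mobile"
```

Because the decision is made inside the relay, the caller (and therefore the driver) never has to state which device a command is meant for.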

FIG. 7 is a diagram showing the configuration of the AVN of a vehicle according to another embodiment of the present invention. The AVN 100 shown in FIG. 7 is based on voice recognition control. As shown in FIG. 7, the components of the AVN 100 can be broadly divided into elements for the voice recognition function, elements for the general input function, elements for the broadcast/communication function, elements for the navigation function, elements for the audio/video function, and elements that can be used in common by multiple functions.

The elements for the voice recognition function include the voice recognition button 104, the microphone 106, the middleware 722, the voice recognition processing unit 708, the voice recognition gateway 704, and the command output interface 318. Although not a component of the AVN 100, a mobile voice recognition processing unit 724 provided on a remote server may be communicably connected to the middleware 722 and the control unit 312 through the mobile terminal 252 acting as an external device. The elements for the broadcast/communication function include the antenna 352, the tuner unit 354, the broadcast signal processing unit 356, and the communication signal processing unit 358. The elements for the navigation function include the navigation database 362 and the navigation drive unit 364. The elements for the audio/video function include the audio/video input unit 372 and the audio/video playback unit 374. The elements for the general input function include the input unit 382. The elements that can be used in common by multiple functions include the memory 310, the control unit 312, the display 314, and the speaker 116. This functional grouping is not limited to what is described above, and an element listed for one function may also be used for another.

The voice recognition processing unit 708 of the AVN 100 receives, via the middleware 722, the electrical signal produced by the microphone 106, performs voice recognition on that signal, and extracts text data as the voice command information resulting from the recognition. The text data extracted by the voice recognition processing unit 708 is passed to the voice recognition gateway 704 before it is delivered to the control unit 312.

The voice recognition gateway 704 is provided between the voice recognition processing unit 708 of the AVN 100 and the control unit 312, and determines whether the text data received from the voice recognition processing unit 708 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252. When the text data is a reserved word for voice recognition control of the AVN 100, the middleware 722 transmits the text data to the control unit 312 of the AVN 100 so that voice recognition control of the AVN 100 is performed. When the text data is not a reserved word for voice recognition control of the AVN 100, the middleware 722 instead transmits the text data to the mobile terminal 252 so that voice recognition control of the mobile terminal 252 is performed. That is, the middleware 722 automatically determines whether the voice signal produced by the driver's utterance is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252 and relays it accordingly. No deliberate intervention by the driver is needed in this process to distinguish the two kinds of reserved words.

The control unit 312 is involved in the overall operation of the AVN 100 and performs the necessary control. For example, in response to operation of the voice recognition button 104, it runs the voice recognition application stored in the memory 310 so that an initial entry screen is displayed and the associated voice guidance message is output. The control unit 312 also receives the voice command information provided by the voice recognition processing unit 708 and generates a corresponding control command so that the control indicated by the voice command information is carried out. The control unit 312 may also process broadcast/communication signals. When audio/video data produced by processing a broadcast/communication signal must be output to the speaker 116 or the display 314, the control unit 312 routes that data to the speaker 116 or the display 314 so that the required output takes place. When the driver selects the navigation function, the control unit 312 controls the navigation database 362, the navigation driving unit 364, the display 314, and the speaker 116 so that navigation is provided. The control unit 312 also controls the audio/video data input through the audio/video input unit 372 so that it is played back by the audio/video playback unit 374 and delivered to the speaker 116 or the display 314. In addition, the control unit 312 converts the names of broadcast channels extracted from the broadcast signal by the tuner unit 354 into text and passes the text to the voice recognition processing unit 708.

FIG. 8 is a diagram illustrating a voice recognition control method of a vehicle according to still another embodiment of the present invention. The method shown in FIG. 8 is based on the device configurations of FIGS. 1, 2, 4, 5, and 7, and is presented mainly from the viewpoint of the voice recognition gateway 704.

As shown in FIG. 8, while the AVN 100 is powered on and standing by (802), when the voice recognition processing unit 708 of the AVN 100 generates text data corresponding to a voice signal, the voice recognition gateway 704 receives the text data from the voice recognition processing unit 308 of the AVN 100 (810).

The voice recognition gateway 704 receives the text data from the voice recognition processing unit 308 of the AVN 100 in order to determine whether the text data generated by the voice recognition processing unit 308 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252. The voice recognition gateway 704 makes this determination using at least one decision logic. For example, it identifies the reserved word using pattern matching of the text data (812) and intention analysis of the text data (814).

Pattern matching of the text data (812) determines, by comparing patterns in the text data, whether the text data input from the voice recognition processing unit 308 of the AVN 100 to the voice recognition gateway 704 is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252.
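
As an illustration only, the pattern matching step (812) could be sketched as follows. The pattern list, the normalization step, and the function names are assumptions made for this example; the specification does not enumerate the reserved words or prescribe a matching algorithm.

```python
import re

# Hypothetical reserved-word patterns for the AVN; not taken from the specification.
AVN_RESERVED_PATTERNS = [
    re.compile(r"(play|stop|pause) (radio|music|dmb)"),
    re.compile(r"navigate to .+"),
    re.compile(r"volume (up|down)"),
]


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def is_avn_reserved_word(text: str) -> bool:
    """Return True if the whole recognized text matches an AVN pattern."""
    normalized = normalize(text)
    return any(p.fullmatch(normalized) for p in AVN_RESERVED_PATTERNS)
```

Text that matches none of the patterns would be treated as not being a reserved word for the AVN and would therefore be relayed to the mobile terminal.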

Intention analysis of the text data (814) distinguishes reserved words of the AVN 100 from reserved words of the mobile terminal 252 by referring to the characteristic features of each. For example, interrogative and imperative forms may be distinguished from the signal waveform; a short or imperative utterance may be judged to be a reserved word for voice recognition control of the AVN 100, and a long or interrogative utterance may be judged to be a reserved word for voice recognition control of the mobile terminal 252.
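
A minimal sketch of the short/imperative versus long/interrogative rule is given below. It is not the method of the specification: the text refers to the signal waveform, whereas this example substitutes simple text cues. The question-word list, the four-word threshold, and the function name are all hypothetical.

```python
# Hypothetical cues; the specification gives no word list or length threshold.
QUESTION_WORDS = {"what", "when", "where", "who", "why", "how", "which"}
MAX_SHORT_COMMAND_WORDS = 4


def classify_intent(text: str) -> str:
    """Return "avn" for short imperative text, "mobile" for long or interrogative text."""
    stripped = text.strip()
    words = stripped.lower().rstrip("?").split()
    # Interrogative if it ends with "?" or starts with a question word.
    is_question = stripped.endswith("?") or (bool(words) and words[0] in QUESTION_WORDS)
    # "Long" is approximated by a word count above the assumed threshold.
    is_long = len(words) > MAX_SHORT_COMMAND_WORDS
    return "mobile" if (is_question or is_long) else "avn"
```

A text-only heuristic like this is easy to test but ignores prosody, so a waveform-based classifier of the kind the specification mentions could reach different decisions on the same utterance.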

Once the reserved words for voice recognition control of the AVN 100 have been distinguished from the reserved words for voice recognition control of the mobile terminal 252, the voice recognition gateway 704 either causes the text data to be output to the control unit 312 of the AVN 100 or transmits the text data to the voice recognition processing unit 724 of the mobile terminal 252.

That is, when the text data is determined to be a reserved word for voice recognition control of the AVN 100 ("Yes" in 816), the voice recognition gateway 704 causes the text data to be output to the control unit 312 via the voice recognition processing unit 308 of the AVN 100 (818). The control unit 312 performs control according to the text data received from the middleware 722.

Conversely, when the text data is determined not to be a reserved word for voice recognition control of the AVN 100 ("No" in 816), the voice recognition gateway 704 transmits the corresponding voice signal to the voice recognition processing unit 724 of the mobile terminal 252 so that the mobile terminal 252 performs voice recognition control on that voice signal (820).
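
The branch at step 816 can be summarized in a short sketch. The class, its fields, and the injected predicate are hypothetical names introduced for illustration; the two lists merely stand in for the control unit 312 and the voice recognition processing unit 724. For simplicity the sketch forwards the text in both branches, whereas the description of step 820 forwards the voice signal.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VoiceRecognitionGateway:
    # Hypothetical predicate standing in for steps 812/814.
    is_avn_reserved_word: Callable[[str], bool]
    avn_inbox: List[str] = field(default_factory=list)     # stands in for control unit 312
    mobile_inbox: List[str] = field(default_factory=list)  # stands in for processing unit 724

    def relay(self, text: str) -> str:
        """Receive text (810), judge it (816), and forward it to one destination."""
        if self.is_avn_reserved_word(text):   # step 816, "Yes"
            self.avn_inbox.append(text)       # step 818
            return "avn"
        self.mobile_inbox.append(text)        # step 816, "No", then step 820
        return "mobile"


# Usage with a toy predicate: two utterances, one per branch.
gateway = VoiceRecognitionGateway(lambda t: t.lower() in {"volume up", "radio on"})
gateway.relay("Volume up")
gateway.relay("What is the weather tomorrow")
```

Passing the decision logic in as a callable keeps the relay independent of whether pattern matching, intention analysis, or both are used to make the determination.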

In this way, the voice recognition gateway 704 automatically determines whether the voice signal generated by the driver's utterance is a reserved word for voice recognition control of the AVN 100 or a reserved word for voice recognition control of the mobile terminal 252, and relays it accordingly. No deliberate intervention by the driver is needed in this process to distinguish the reserved words for the AVN 100 from those for the mobile terminal 252.

100: AVN (Audio/Video/Navigation)
102: Steering wheel
104: Voice recognition button
106: Microphone
116: Speaker
308: Voice recognition processing unit
310: Memory
312: Control unit
314: Display
318: Command output interface
322, 722: Middleware
324, 724: Voice recognition processing unit of the mobile terminal
352: Antenna
356: Broadcast signal processing unit
358: Communication signal processing unit
362: Navigation database
364: Navigation driving unit
372: Audio/video input unit
374: Audio/video playback unit
382: Input unit
402: Voice input/output unit
404, 704: Voice recognition gateway
502: Analog-to-digital conversion unit
504: Voice recognition unit
506: Voice detection unit
508: Word recognition unit
510: Word standard pattern database

Claims

1. A vehicle including a speech-recognition-based multimedia device to which a speech-recognition-based external device is connected, the vehicle comprising:
a voice recognition processing unit of the multimedia device that receives a voice signal and converts the voice signal into corresponding text data; and
relay means that transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed when the text data is a reserved word for voice recognition control of the multimedia device, and transmits the text data to the external device so that voice recognition control of the external device is performed when the text data is not a reserved word for voice recognition control of the multimedia device.

2. The vehicle according to claim 1,
wherein whether the text data is a reserved word for voice recognition control of the multimedia device is determined through pattern matching of the text data.

3. The vehicle according to claim 1,
wherein whether the text data is a reserved word for voice recognition control of the multimedia device is determined through intention analysis of the text data.

4. The vehicle according to claim 3, wherein, when the text data is a statement, the text data is determined to be a reserved word for voice recognition control of the multimedia device.

5. The vehicle according to claim 3, wherein, when the text data is a question, the text data is determined not to be a reserved word for voice recognition control of the multimedia device.

6. The vehicle according to claim 1,
wherein the relay means is middleware for the multimedia device.

7. The vehicle according to claim 6,
wherein the middleware comprises a gateway; and
wherein the gateway determines whether the text data is a reserved word for voice recognition control of the multimedia device.

8. The vehicle according to claim 1,
wherein, when the text data is not a reserved word for voice recognition control of the multimedia device, the text data is transmitted to the voice recognition control unit of the external device through the external device.

9. A voice recognition control method for a vehicle including a speech-recognition-based multimedia device to which a speech-recognition-based external device is connected, the method comprising:
receiving a voice signal and converting the voice signal into corresponding text data;
transmitting the text data to the multimedia device when the text data is a reserved word for voice recognition control of the multimedia device, so that voice recognition control of the multimedia device is performed; and
transmitting the text data to the external device when the text data is not a reserved word for voice recognition control of the multimedia device, so that voice recognition control of the external device is performed.

10. The method according to claim 9,
wherein whether the text data is a reserved word for voice recognition control of the multimedia device is determined through pattern matching of the text data.

11. The method according to claim 9,
wherein whether the text data is a reserved word for voice recognition control of the multimedia device is determined through intention analysis of the text data.

12. The method according to claim 11, wherein, when the text data is a statement, the text data is determined to be a reserved word for voice recognition control of the multimedia device.

13. The method according to claim 11, wherein, when the text data is a question, the text data is determined not to be a reserved word for voice recognition control of the multimedia device.

14. The method according to claim 9,
wherein the text data is transmitted to the multimedia device by relay means; and
wherein the relay means is middleware for the multimedia device.

15. The method according to claim 14,
wherein the middleware comprises a gateway; and
wherein the gateway determines whether the text data is a reserved word for voice recognition control of the multimedia device.

16. The method according to claim 14,
wherein, when the text data is not a reserved word for voice recognition control of the multimedia device, the text data is transmitted to the voice recognition control unit of the external device through the external device so that voice recognition control of the external device is performed.

17. A vehicle including a speech-recognition-based multimedia device to which a speech-recognition-based external device is connected, the vehicle comprising:
a microphone for inputting a voice signal; and
a voice recognition processing unit of the multimedia device that receives the voice signal through the microphone and converts the voice signal into corresponding text data,
wherein the text data is transmitted to the multimedia device so that voice recognition control of the multimedia device is performed when the text data is a reserved word for voice recognition control of the multimedia device, and the text data is transmitted to the external device so that voice recognition control of the external device is performed when the text data is not a reserved word for voice recognition control of the multimedia device.

18. A vehicle including a speech-recognition-based multimedia device to which a speech-recognition-based external device is connected, the vehicle comprising:
a microphone for inputting a voice signal;
a voice recognition processing unit of the multimedia device that receives the voice signal through the microphone and converts the voice signal into corresponding text data; and
relay means that transmits the text data to the multimedia device so that voice recognition control of the multimedia device is performed when the text data is a reserved word for voice recognition control of the multimedia device, and transmits the text data to the external device so that voice recognition control of the external device is performed when the text data is not a reserved word for voice recognition control of the multimedia device.