KR100738414B1

KR100738414B1 - Method for improving performance of speech recognition in telematics environment and device for executing the method

Info

Publication number: KR100738414B1
Application number: KR1020060011076A
Authority: KR
Inventors: 최인정; 한익상; 정상배; 김남훈
Original assignee: 삼성전자주식회사
Priority date: 2006-02-06
Filing date: 2006-02-06
Publication date: 2007-07-11

Abstract

A method for improving speech recognition performance in a telematics environment, and an apparatus for performing the method, are provided. A recognition result that receives a negative response to a system confirmation of the user's utterance is excluded from the re-recognition process, which minimizes the error of the same result being recognized again. If recognition of an input detailed POI (Point Of Interest) fails, a wide-area POI is received as input and a POI network associated with the wide-area POI is activated. The POI with the highest similarity to the re-entered detailed POI is then searched for in the POI network and provided as the recognition result. When the POI network associated with the wide-area POI is activated, an N-best POI result is generated for the input wide-area POI and the POI network corresponding to that N-best POI result is activated.

Description

METHOD FOR IMPROVING PERFORMANCE OF SPEECH RECOGNITION IN TELEMATICS ENVIRONMENT AND DEVICE FOR EXECUTING THE METHOD

FIG. 1 is a diagram illustrating an example of a step-by-step speech recognition method according to the prior art.

FIG. 2 is a diagram illustrating an example of a speech recognition method using a system confirmation according to the prior art.

FIG. 3 is a flowchart illustrating an example of a telematics speech recognition method.

FIG. 4 is a flowchart illustrating a method for using N-best POI results between steps according to an embodiment of the present invention.

FIG. 5 is a flowchart illustrating a method for using information associated with a system confirmation of a user utterance according to another embodiment of the present invention.

FIG. 6 is a block diagram illustrating the internal configuration of an apparatus that uses N-best POI results between steps according to an embodiment of the present invention.

FIG. 7 is a block diagram illustrating the internal configuration of an apparatus that uses information associated with a system confirmation of a user utterance according to another embodiment of the present invention.

FIG. 8 is a diagram illustrating an example of speech recognition using N-best POI results between steps according to an embodiment of the present invention.

FIG. 9 is a diagram illustrating an example of speech recognition using information associated with a system confirmation of a user utterance according to another embodiment of the present invention.

FIG. 10 is a diagram illustrating an example of using additional classification information of a POI according to still another embodiment of the present invention.

<Explanation of symbols for main parts of the drawings>

620: activation unit

621: N-best POI result generation unit

622: POI network activation unit

720: exclusion/recognition unit

721: negative recognition result storage unit

722: negative recognition result exclusion unit

723: recognition unit

The present invention relates to a method for improving the performance of speech recognition in a telematics environment and to an apparatus for performing the method.

In speech recognition in a vehicle environment such as telematics, relying purely on the performance of the speech recognizer results in a considerably lower perceived recognition rate because of noise. This is because recognition performance often degrades when similar POI (Point Of Interest) names are confused during speech recognition. In a vehicle in particular, noise such as vehicle noise and ambient noise reduces the discriminability between POIs, so degraded recognition of similar POI names occurs even more frequently.

To address this problem, the prior art asks the user to confirm the content of every utterance. This step-by-step recognition of user utterances and confirmations narrows the vocabulary to be recognized, and recognition at each step uses only the search network of that step's vocabulary, which partially mitigates the performance degradation caused by similar POI names.

FIG. 1 is a diagram illustrating an example of a step-by-step speech recognition method according to the prior art. According to the prior art shown in FIG. 1, the final target result (104) is recognized in the next step using only the POI network (103) associated with the single POI (102) most similar to the user utterance (101) in the current step, so recognition fails (105), as in the example of FIG. 1. In other words, an error in one step propagates to the next step, and the speech recognition process must ultimately be restarted from the beginning.

FIG. 2 is a diagram illustrating an example of a speech recognition method using a system confirmation according to the prior art. According to the prior art shown in FIG. 2, when a negative response is input to the system confirmation (202) for a user utterance (201), the negated recognition result (203) appears again in the step-by-step recognition process, so the same error occurs repeatedly. That is, a recognition result that the user has already confirmed to be unwanted is recognized again during re-recognition, and the same error recurs.

In addition, among POIs classified by stage, a POI name that covers a wide range has too many detailed POIs connected to a single wide-area POI, so memory is managed inefficiently.

To solve the above problems of the prior art, the present invention proposes a new technique for a method of improving the performance of speech recognition in a telematics environment and for an apparatus that performs the method.

An object of the present invention is to use information from the recognition result of the previous step in a step-by-step recognition process, and thereby to minimize step-by-step recognition errors that arise when an error caused by a similar POI (Point Of Interest) name in the previous step carries over into the current step.

Another object of the present invention is to exclude, from the re-recognition process, a recognition result that received a negative response to the system confirmation of a user utterance, thereby minimizing the error of that result being recognized again.

Still another object of the present invention is to use additional classification information for large wide-area place names in order to reduce the size of the search network and resolve inefficient memory use.

To achieve the above objects and solve the above problems of the prior art, a telematics speech recognition method according to an embodiment of the present invention includes: receiving a wide-area POI when recognition of an input detailed POI (Point of Interest) fails; activating a POI network associated with the wide-area POI; and searching the POI network for the POI with the highest similarity to the re-entered detailed POI and providing it as the recognition result.

According to one aspect of the present invention, activating the POI network associated with the wide-area POI may include generating an N-best POI result for the received wide-area POI and activating the POI network corresponding to the N-best POI result.

According to another aspect of the present invention, the N-best POI result may be a list of N POIs retrieved from the activated POI network in order of similarity to the input POI and arranged in the retrieved order.
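The N-best list described above can be sketched as a simple ranking. This is a minimal illustration only: the function names `similarity` and `n_best` are hypothetical, and the text-based score from Python's `difflib` is an assumed stand-in for the acoustic similarity a real recognizer would compute.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; a real recognizer would use acoustic scores."""
    return SequenceMatcher(None, a, b).ratio()


def n_best(utterance: str, candidates: list[str], n: int) -> list[str]:
    """Return the n candidates most similar to the utterance, most similar first."""
    ranked = sorted(candidates, key=lambda c: similarity(utterance, c), reverse=True)
    return ranked[:n]
```

Keeping more than one candidate is the point of the list: a later step can still reach the intended POI when the top candidate is wrong.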

According to still another aspect of the present invention, the wide-area POI is a POI for limiting the range of the detailed POI, and it may include additional classification information when it contains a predetermined number or more of detailed POIs.

A telematics speech recognition method according to another embodiment of the present invention may include providing a predetermined system confirmation associated with the recognition result for an input POI, and performing recognition with that recognition result excluded according to the response to the system confirmation.

A telematics speech recognition apparatus according to still another embodiment of the present invention may include: a wide-area POI input unit that receives a wide-area POI when recognition of an input detailed POI fails; an activation unit that activates a POI network associated with the wide-area POI; and a recognition result providing unit that searches the POI network for the POI with the highest similarity to the re-entered detailed POI and provides it as the recognition result.

Hereinafter, various embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 3 is a flowchart illustrating an example of a telematics speech recognition method.

As shown in FIG. 3, the present invention classifies POIs as wide-area, intermediate, or detailed POIs. When detailed POI recognition fails (301), the range of POIs is narrowed through step-by-step POI recognition, namely city/county name recognition (302) as the wide-area POI and district/business-type recognition (303) as the intermediate POI, in order to recognize the final target POI (304). Here, the wide-area POI is a POI for limiting the range of the detailed POI, and when the wide-area POI contains a predetermined number or more of detailed POIs, it may include an intermediate POI as additional classification information.

In this way, using the N-best POI result, which is the candidate list of recognition results from the previous step, in the current step minimizes recognition errors in the current step caused by errors in the previous step.

In addition, a system confirmation is provided for the user utterance, and a POI for which a negative response is recognized is stored in the recently registered POI list (305). If that POI appears again among the recognition result candidates in a later step, it is excluded from the candidates, which minimizes the error of the POI being recognized again as the recognition result.

In step S401, the apparatus for telematics speech recognition recognizes the POI of the user utterance as a detailed POI.

In step S402, the apparatus performs step S409 if recognition of the detailed POI succeeded, and performs step S403 if it did not.

In step S403, the apparatus asks the user for an utterance associated with the detailed POI and recognizes that utterance as a wide-area POI. At this point, an N-best POI result is generated for the received wide-area POI. The wide-area POI is a POI for limiting the range of the detailed POI, and when it contains a predetermined number or more of detailed POIs, it may include an intermediate POI as additional classification information.

In step S404, the apparatus activates the POI network corresponding to the N-best POI generated in step S403. The N-best POI result may be a list of N POIs retrieved from the activated POI network in order of similarity to the input wide-area POI and arranged in the retrieved order.

In step S405, the apparatus asks the user for an utterance of the detailed POI and re-recognizes that utterance as the detailed POI.

In step S406, the apparatus performs step S409 if recognition of the detailed POI succeeded, and performs step S408 if it did not.

In step S407, the apparatus informs the user that speech recognition of the user utterance failed.

In step S408, the apparatus provides the user with the recognition result for the user utterance.

In this way, when an error caused by a similar POI name occurs while recognizing the wide-area POI in the step-by-step recognition process, the N-best POI result prevents the POI network associated with the intended wide-area POI from being excluded from the detailed POI recognition step. This minimizes the problem of an error caused by a similar POI name in the previous step carrying over into the current step.
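The effect described above can be sketched as follows. This is a hedged illustration, not the patented implementation: `POI_NETWORKS`, `activate_networks`, and `recognize_detailed` are hypothetical names, the place names and scores are made-up sample data, and the text similarity from `difflib` stands in for the recognizer's acoustic matching.

```python
from difflib import SequenceMatcher

# Hypothetical sample data: wide-area POI -> detailed POIs in its network.
POI_NETWORKS = {
    "Seongbuk": ["Hansung University", "Gireum Market"],
    "Seongnam": ["Seongnam City Hall", "Moran Market"],
}


def activate_networks(wide_hypotheses, n):
    """Merge the detailed-POI networks of the n best wide-area hypotheses."""
    ranked = sorted(wide_hypotheses, key=lambda h: h[1], reverse=True)
    active = []
    for name, _score in ranked[:n]:
        active.extend(POI_NETWORKS.get(name, []))
    return active


def recognize_detailed(utterance, active_network):
    """Pick the detailed POI most similar to the utterance (text stand-in)."""
    if not active_network:
        return None
    return max(
        active_network,
        key=lambda poi: SequenceMatcher(None, utterance, poi).ratio(),
    )


# Assumed scenario: noise makes the recognizer rank "Seongbuk" above the
# intended "Seongnam" in the wide-area step.
wide_hypotheses = [("Seongbuk", 0.61), ("Seongnam", 0.58)]
one_best = recognize_detailed("Seongnam City Hall", activate_networks(wide_hypotheses, 1))
two_best = recognize_detailed("Seongnam City Hall", activate_networks(wide_hypotheses, 2))
```

With only the top wide-area hypothesis, the intended detailed POI is not even a candidate; with the two best hypotheses, its network is active and it can be found.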

FIG. 5 is a flowchart illustrating a method for using information associated with a system confirmation of a user utterance according to another embodiment of the present invention. As shown in FIG. 5, each step of FIG. 5 may be associated with the steps of FIG. 4, and steps S504 to S507 of FIG. 5 may be performed as part of step S405 of FIG. 4.

In step S501, when detailed POI recognition has succeeded through steps S401 and S402, the apparatus provides a system confirmation for the user utterance. The system confirmation may include a question for checking whether the recognition result for the user utterance is the POI the user intended.

In step S502, if the apparatus recognizes a positive response to the system confirmation, it performs step S407 to provide the recognition result; if it receives a negative response, it performs step S503.

In step S503, the apparatus stores the negated recognition result in the recently registered POI list (305) and recognizes the wide-area POI through step S403. Here, step S403 generates a first N-best POI result through wide-area POI recognition, and step S404 activates the POI network associated with the first N-best POI result.

In step S504, to re-recognize the detailed POI, the apparatus searches the POI network for a second N-best POI of the detailed POI.

In step S505, if the search shows that the second N-best POI result includes the negative recognition result stored in the recently registered POI list (305) in step S503, the apparatus performs step S506; otherwise it performs step S507.

In step S506, the apparatus excludes the negative recognition result from the second N-best POI result and then performs step S507.

In step S507, the apparatus recognizes the detailed POI and, depending on whether recognition succeeded (S406), performs the step of providing information about the recognition failure (S407) or the step of providing the recognition result (S408).

In this way, the negative recognition result, that is, the result that received a negative response to the system confirmation of the user utterance, is stored. If the stored negative recognition result appears again in the second N-best POI result during re-recognition, it is excluded from the second N-best POI result, which minimizes the error of the negated result being recognized again.
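The exclusion step can be sketched in a few lines. This is an assumed, simplified model: `exclude_rejected` and `re_recognize` are hypothetical helper names, the set `rejected` plays the role of the recently registered POI list (305), and the candidate names are invented sample data.

```python
def exclude_rejected(n_best, rejected):
    """Drop candidates the user has already rejected, preserving rank order."""
    return [poi for poi in n_best if poi not in rejected]


def re_recognize(n_best, rejected):
    """Return the best remaining candidate, or None if every candidate was rejected."""
    remaining = exclude_rejected(n_best, rejected)
    return remaining[0] if remaining else None


rejected = set()  # stands in for the recently registered POI list

# First pass: the top candidate is offered in a system confirmation
# and the user answers "no", so it is stored as a negative result.
first_pass = ["Gangnam Station", "Gangnam District Office", "Gangbyeon Station"]
rejected.add(first_pass[0])

# Second pass: the rejected name ranks first again, but it is filtered out.
second_pass = ["Gangnam Station", "Gangnam District Office", "Gangbyeon Station"]
result = re_recognize(second_pass, rejected)
```

Filtering the ranked list rather than re-scoring it keeps the remaining candidates in their original similarity order.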

FIG. 6 is a block diagram illustrating the internal configuration of an apparatus that uses N-best POI results between steps according to an embodiment of the present invention. The telematics speech recognition apparatus may include a wide-area POI input unit (610), an activation unit (620), and a recognition result providing unit (630).

The wide-area POI input unit (610) receives a wide-area POI when recognition of the input detailed POI fails. The wide-area POI is a POI for limiting the range of the detailed POI, and when it contains a predetermined number or more of detailed POIs, it may include an intermediate POI as additional classification information.

The activation unit (620) activates the POI network associated with the wide-area POI. The activation unit (620) may include an N-best POI generation unit (621) and a POI network activation unit (622).

The N-best POI generation unit (621) generates an N-best POI result for the received wide-area POI. The N-best POI result may be a list of N POIs retrieved from the activated POI network in order of similarity to the input POI and arranged in the retrieved order.

The POI network activation unit (622) activates the POI network corresponding to the N-best POI. As a result, even if the wide-area POI associated with the detailed POI is not selected because of an error caused by a similar POI name, the network of that wide-area POI is still activated through the N-best POI, so detailed POI recognition can succeed.

The recognition result providing unit (630) searches the POI network for the POI with the highest similarity to the re-entered detailed POI and provides it as the recognition result.

FIG. 7 is a block diagram illustrating the internal configuration of an apparatus that uses information associated with a system confirmation of a user utterance according to another embodiment of the present invention. The telematics speech recognition apparatus may include a system confirmation providing unit (710) and an exclusion/recognition unit (720).

The system confirmation providing unit (710) provides a predetermined system confirmation associated with the recognition result for the input POI. The system confirmation may include a question for checking whether the recognition result is the POI the user intended.

The exclusion/recognition unit (720) performs recognition with the recognition result excluded according to the response to the system confirmation. The exclusion/recognition unit (720) may include a negative recognition result storage unit (721), a negative recognition result exclusion unit (722), and a recognition unit (723).

The negative recognition result storage unit (721) stores the recognition result as a negative recognition result when a negative response to the system confirmation is recognized.

The negative recognition result exclusion unit (722) excludes the negative recognition result from the N-best POI result when it appears in the N-best POI result again during re-recognition.

The recognition unit (723) performs recognition using the N-best POI from which the negative recognition result has been excluded.

FIG. 8 is a diagram illustrating an example of speech recognition using N-best POI results between steps according to an embodiment of the present invention.

As shown in FIG. 8, in the step of recognizing the wide-area POI, an N-best POI result (802) is generated for the wide-area POI (801), and the POI network (803) associated with the N-best POI result (802) is activated. Because the step of recognizing the detailed POI (804) searches for the POI in the activated POI network (803), the desired detailed POI (804) can be recognized even if an error caused by a similar POI occurred in the wide-area POI recognition step.

As shown in FIG. 9, when a POI name similar to the desired POI name (901) is recognized, a system confirmation (902) is provided, and the user gives a negative response, the similar POI name corresponding to the system confirmation (902) is stored. On re-recognition, the similar POI name is excluded (904) from the N-best POI (903), so the desired POI name (901) can be recognized.

FIG. 10 is a diagram illustrating an example of using additional classification information of a POI according to still another embodiment of the present invention. As shown in FIG. 10, when the wide-area POI (1002) for limiting the range of the detailed POIs (1001) contains a predetermined number or more of detailed POIs (1001), so that its range is too large, it may include additional classification information (1003).

In this way, additional classification information is used for large wide-area place names to reduce the size of the search network and resolve inefficient memory use.
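One way to picture the additional classification is sketched below. It rests on assumptions not stated in the text: the function name `build_network`, the `max_size` threshold, and the use of a category label as the classification key are all hypothetical choices for illustration.

```python
from collections import defaultdict


def build_network(detailed_pois, max_size):
    """Group a wide-area POI's detailed POIs into searchable sub-networks.

    detailed_pois is a list of (name, category) pairs. A small wide-area POI
    keeps one flat network under the key None; a large one is split by
    category so that only one sub-network needs to be searched at a time.
    """
    if len(detailed_pois) <= max_size:
        return {None: [name for name, _category in detailed_pois]}
    groups = defaultdict(list)
    for name, category in detailed_pois:
        groups[category].append(name)
    return dict(groups)


# Invented sample data for one wide-area POI.
pois = [("A Bank", "bank"), ("B Bank", "bank"), ("C Hospital", "hospital")]
flat = build_network(pois, max_size=5)
split = build_network(pois, max_size=2)
```

Splitting only above a threshold keeps small regions to a single recognition step while bounding the size of any network that has to be held in memory.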

The method for improving the performance of speech recognition in a telematics environment according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. The medium may also be a transmission medium, such as an optical or metal wire or a waveguide, including a carrier wave that transmits a signal specifying program instructions, data structures, and the like. Examples of program instructions include not only machine code produced by a compiler but also high-level language code that a computer can execute using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

이상과 같이 본 발명에서는 구체적인 구성 요소 등과 같은 특정 사항들과 한정된 실시예 및 도면에 의해 설명되었으나 이는 본 발명의 보다 전반적인 이해를 돕기 위해서 제공된 것일 뿐, 본 발명은 상기의 실시예에 한정되는 것은 아니며, 본 발명이 속하는 분야에서 통상적인 지식을 가진 자라면 이러한 기재로부터 다양한 수정 및 변형이 가능하다.In the present invention as described above has been described by the specific embodiments, such as specific components and limited embodiments and drawings, but this is provided to help a more general understanding of the present invention, the present invention is not limited to the above embodiments. For those skilled in the art, various modifications and variations are possible from these descriptions.

따라서, 본 발명의 사상은 설명된 실시예에 국한되어 정해져서는 아니되며, 후술하는 특허청구범위뿐 아니라 이 특허청구범위와 균등하거나 등가적 변형이 있는 모든 것들은 본 발명 사상의 범주에 속한다고 할 것이다.Therefore, the spirit of the present invention should not be limited to the described embodiments, and all the things that are equivalent to or equivalent to the claims as well as the following claims will belong to the scope of the present invention. .

본 발명에 따르면, 단순히 음성 인식기기의 성능에만 의존하지 않고, 사용자 인터페이스(Interface)의 효과적인 설계를 통하여 체감 인식률을 극대화할 수 있다.According to the present invention, the sensory recognition rate can be maximized through the effective design of the user interface, not merely depending on the performance of the voice recognition device.

According to the present invention, the stage-by-stage recognition process of telematics speech recognition uses information from the recognition result of the previous stage, so that an error caused by similar POI (Point Of Interest) names in the previous stage does not carry over into the current stage. This minimizes stage-by-stage recognition errors and improves speech recognition performance.
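The carry-over of N-best results between stages can be illustrated with a minimal sketch. It is not the patented implementation: string similarity from Python's `difflib` stands in for an acoustic match score, and the names `NETWORKS`, `n_best`, and `recognize_detailed`, as well as the sample place names, are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical data: each wide-area POI maps to its network of detailed POIs.
NETWORKS = {
    "Gangnam-gu": ["City Hall Annex", "Samsung Station"],
    "Gangdong-gu": ["Cheonho Station", "Amsa Market"],
    "Jung-gu": ["Seoul Station", "Myeongdong Cathedral"],
}


def similarity(a: str, b: str) -> float:
    """String similarity, used here only as a stand-in for an acoustic score."""
    return SequenceMatcher(None, a, b).ratio()


def n_best(query: str, candidates: list[str], n: int) -> list[str]:
    """Return the n candidates most similar to the query, best first."""
    ranked = sorted(candidates, key=lambda c: similarity(query, c), reverse=True)
    return ranked[:n]


def recognize_detailed(wide_query: str, detail_query: str,
                       networks: dict[str, list[str]], n: int = 2) -> str:
    """Search the POI networks of the N-best wide-area POIs for the detailed POI."""
    # Stage 1: keep the N-best wide-area POIs rather than only the top one.
    wide_n_best = n_best(wide_query, list(networks), n)
    # Stage 2: activate the POI network of every wide-area POI in the N-best list.
    active = [poi for wide in wide_n_best for poi in networks[wide]]
    # Return the detailed POI with the highest similarity in the active networks.
    return max(active, key=lambda poi: similarity(detail_query, poi))
```

If the wide-area utterance is taken as "Gangnam-gu" although the user meant the similar-sounding "Gangdong-gu", calling `recognize_detailed("Gangnam-gu", "Cheonho Station", NETWORKS, n=2)` still finds "Cheonho Station", because the second-best wide-area network is also active. With `n=1` the correct detailed POI is unreachable.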

According to the present invention, a recognition result that received a negative response to the system confirmation (System Confirm) of a user utterance is stored, and if that result appears again during re-recognition, it is excluded. This minimizes the error of the same result being recognized again and improves speech recognition performance.

According to the present invention, for large wide-area place names, additional classification information is used to reduce the size of the search network and to resolve the problem of inefficient memory use.
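One way to picture the use of additional classification information is the sketch below. It assumes that each detailed POI carries a category label and that a wide-area POI is split by category once it holds a threshold number of detailed POIs. The functions `build_networks` and `activate`, the threshold value, and the sample data are hypothetical and are not taken from the patent.

```python
from collections import defaultdict

# Hypothetical records: (wide-area POI, category, detailed POI name).
POIS = [
    ("Seoul", "bank", "A Bank"),
    ("Seoul", "hospital", "B Hospital"),
    ("Seoul", "bank", "C Bank"),
    ("Ulleung", "bank", "D Bank"),
    ("Ulleung", "hospital", "E Clinic"),
]


def build_networks(pois, threshold):
    """Group detailed POIs into search networks.

    A wide area holding `threshold` or more detailed POIs is split by category,
    so only one category's sub-network has to be loaded at a time.
    """
    by_area = defaultdict(list)
    for area, category, name in pois:
        by_area[area].append((category, name))

    networks = defaultdict(list)
    for area, entries in by_area.items():
        split = len(entries) >= threshold
        for category, name in entries:
            # Small areas share one network under the key (area, None).
            key = (area, category if split else None)
            networks[key].append(name)
    return dict(networks)


def activate(networks, area, category=None):
    """Return the search network for an area, narrowed by category if it was split."""
    if (area, None) in networks:
        return networks[(area, None)]
    return networks.get((area, category), [])
```

With `threshold=3`, "Seoul" is split into per-category networks while "Ulleung" stays as a single network, so a search for a bank in Seoul touches only the bank entries.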

Claims

A telematics speech recognition method comprising:

receiving a wide-area POI when recognition of an input detailed point of interest (POI) fails;

activating a POI network associated with the wide-area POI; and

searching the POI network for the POI having the highest similarity to the re-input detailed POI, and providing the found POI as a recognition result.

The method of claim 1,

wherein activating the POI network associated with the wide-area POI comprises:

generating an N-best POI result for the received wide-area POI; and

activating a POI network corresponding to the N-best POI result.

The method of claim 2,

wherein the N-best POI result is a list of N POIs retrieved from the activated POI network in descending order of similarity to the input POI and arranged in the order of retrieval.

The method of claim 1,

wherein the wide-area POI is a POI that limits the range of the detailed POI and, when the wide-area POI contains a predetermined number or more of detailed POIs, includes additional classification information.

A telematics speech recognition method comprising:

providing a predetermined system confirmation (System Confirm) associated with a recognition result for an input POI; and

performing recognition with the recognition result excluded, depending on the response to the system confirmation,

wherein performing recognition with the recognition result excluded comprises:

storing the recognition result as a negative recognition result when a negative response to the system confirmation is recognized;

excluding the negative recognition result from the N-best POI result when the negative recognition result is included again in the N-best POI result during re-recognition; and

performing recognition using the N-best POI result from which the negative recognition result has been excluded.

(deleted)

The method of claim 5,

wherein the system confirmation comprises a question for confirming whether the recognition result is the POI intended by the user.

A computer-readable recording medium in which a program for executing the method of any one of claims 1 to 5 or 7 is recorded.

A telematics speech recognition apparatus comprising:

a wide-area POI input unit configured to receive a wide-area POI when recognition of an input detailed POI fails;

an activation unit configured to activate a POI network associated with the wide-area POI; and

a recognition result providing unit configured to search the POI network for the POI having the highest similarity to the re-input detailed POI and to provide the found POI as a recognition result.

The apparatus of claim 9,

wherein the activation unit comprises:

an N-best POI result generation unit configured to generate an N-best POI result for the received wide-area POI; and

a POI network activation unit configured to activate a POI network corresponding to the N-best POI result.

The apparatus of claim 10,

wherein the N-best POI result is a list of N POIs retrieved from the activated POI network in descending order of similarity to the input POI and arranged in the order of retrieval.

A telematics speech recognition apparatus comprising:

a system confirmation providing unit configured to provide a predetermined system confirmation associated with a recognition result for an input POI; and

an exclusion/recognition unit configured to perform recognition with the recognition result excluded, depending on the response to the system confirmation,

wherein the exclusion/recognition unit comprises:

a negative recognition result storage unit configured to store the recognition result as a negative recognition result when a negative response to the system confirmation is recognized;

a negative recognition result exclusion unit configured to exclude the negative recognition result from the N-best POI result when the negative recognition result is included again in the N-best POI result during re-recognition; and

a recognition unit configured to perform recognition using the N-best POI result from which the negative recognition result has been excluded.

(deleted)