KR102527346B1

KR102527346B1 - Voice recognition device for vehicle, method for providing response in consideration of driving status of vehicle using the same, and computer program

Info

Publication number: KR102527346B1
Application number: KR1020210080556A
Authority: KR
Inventors: 양태영
Original assignee: 주식회사 인텔로이드
Priority date: 2021-06-22
Filing date: 2021-06-22
Publication date: 2023-05-02
Also published as: KR20220170035A

Abstract

A method by which a voice recognition device provides a response in consideration of the driving state of a vehicle may include: receiving a voice uttered by a speaker in the vehicle; converting the received voice into text; determining the current driving state of the vehicle; analyzing the speaker's utterance intention from the text and determining, based on the determined driving state, whether the utterance is a general request that calls for an ordinary vehicle action unrelated to the speaker's safety or an emergency request that urgently calls for a vehicle action related to the speaker's safety; and providing guidance for handling the vehicle's problem situation according to the general request or the emergency request.

Description

[0001] Voice recognition device for vehicle, method for providing response in consideration of driving status of vehicle using the same, and computer program

The present invention relates to a vehicle voice recognition device, a method, and a computer program that respond to the voice of a speaker in a vehicle in consideration of the driving state of the vehicle.

As voice recognition technology has advanced in recent years, its range of applications has gradually expanded. Vehicles are a representative example.

Voice recognition technology applied to a vehicle recognizes the voice of the driver or a passenger and performs the function corresponding to that voice. Even a driver who is unfamiliar with operating the vehicle's various equipment can then select vehicle functions easily by voice and, with no need to use the hands, can concentrate more on driving.

However, such conventional vehicle voice recognition technology has been limited to operating vehicle functions or providing driving environment information, such as navigation control, radio control, air conditioner control, placing phone calls, and weather information.

While a driver uses a vehicle, various problem situations can arise in which the vehicle's equipment does not operate normally (for example, the steering wheel or the wipers do not work), but existing vehicle voice recognition technology has not provided appropriate measures for them.

In addition, the appropriate measure for the same problem situation can differ depending on the driving state of the vehicle (for example, driving or stopped), but existing vehicle voice recognition technology has not provided appropriate measures for this either.

Therefore, a solution to these problems is needed in order to further advance in-vehicle voice recognition technology.

The present invention has been devised to solve the problems described above. An object of the present invention is to provide a vehicle voice recognition device, a method, and a computer program that provide a response by distinguishing, based on the current driving state of the vehicle, whether an utterance of a user in the vehicle requests an ordinary vehicle action unrelated to the speaker's safety or urgently requests a vehicle action related to the speaker's safety.

Another object of the present invention is to provide a vehicle voice recognition device, a method, and a computer program that, when a user in the vehicle makes multiple utterances, provide a response corresponding to each utterance according to its degree of urgency.

To achieve the above objects, a method for providing a response in consideration of the driving state of a vehicle according to an embodiment of the present invention may include: receiving a voice uttered by a speaker in the vehicle; converting the received voice into text; determining the current driving state of the vehicle; analyzing the speaker's utterance intention from the text and determining, based on the determined driving state, whether the speaker's utterance is a general request that calls for an ordinary vehicle action unrelated to the speaker's safety or an emergency request that urgently calls for a vehicle action related to the speaker's safety; and providing guidance for handling the vehicle's problem situation according to the general request or the emergency request.
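The steps of the method can be sketched as a small pipeline. This is only an illustration under stated assumptions: the names `speech_to_text`, `classify_request`, `provide_guidance`, and `respond` are hypothetical, the recognizer is a stub that decodes bytes, the classification rule is a toy keyword check, and the guidance strings are placeholders rather than real advice.

```python
from dataclasses import dataclass
from enum import Enum


class DrivingState(Enum):
    DRIVING = "driving"
    STOPPED = "stopped"
    PARKED = "parked"


class RequestType(Enum):
    GENERAL = "general"
    EMERGENCY = "emergency"


@dataclass
class Response:
    request_type: RequestType
    guidance: str


def speech_to_text(audio: bytes) -> str:
    # Stub (assumption): a real device would run a speech recognizer here.
    return audio.decode("utf-8")


def classify_request(text: str, state: DrivingState) -> RequestType:
    # Toy rule (assumption): a steering problem is urgent only while driving.
    if "steering wheel" in text and state is DrivingState.DRIVING:
        return RequestType.EMERGENCY
    return RequestType.GENERAL


def provide_guidance(request_type: RequestType) -> str:
    # Placeholder strings; the actual guidance content is not specified here.
    if request_type is RequestType.EMERGENCY:
        return "<emergency guidance placeholder>"
    return "<general guidance placeholder>"


def respond(audio: bytes, state: DrivingState) -> Response:
    text = speech_to_text(audio)                   # voice -> text
    request_type = classify_request(text, state)   # intent + driving state
    return Response(request_type, provide_guidance(request_type))
```

The same utterance yields a different request type depending on the driving state passed in, which is the behavior the method describes.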

The driving state may include a state in which the vehicle is driving, a state in which it is stopped, and a state in which it is parked.

In the determining step, the speaker's intention may be analyzed using a speaker-dependent natural language understanding model, and the speaker's utterance may be classified as the general request or the emergency request according to the driving state of the vehicle.

The method may further include determining the current location of the vehicle, and the determining step may further include calculating a degree of urgency according to the correlation among the speaker's utterance, the determined current location, and the driving state, and classifying the speaker's utterance as the general request or the emergency request by comparing the calculated degree of urgency with a preset reference value.
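A minimal sketch of the urgency comparison follows. The text does not specify how the correlation is computed, so the multiplicative score, the factor tables, the intent labels, and the threshold value of 0.4 are all assumptions chosen for illustration.

```python
# Hypothetical factor tables; every value below is an illustrative assumption.
INTENT_SEVERITY = {"steering_failure": 0.9, "wiper_failure": 0.5, "radio_on": 0.0}
STATE_FACTOR = {"driving": 1.0, "stopped": 0.5, "parked": 0.2}
LOCATION_FACTOR = {"highway": 1.0, "city_road": 0.8, "parking_lot": 0.3}

# Stand-in for the "preset reference value" mentioned in the text.
URGENCY_THRESHOLD = 0.4


def urgency(intent: str, state: str, location: str) -> float:
    """Combine utterance intent, driving state, and location into one score."""
    return INTENT_SEVERITY[intent] * STATE_FACTOR[state] * LOCATION_FACTOR[location]


def classify(intent: str, state: str, location: str,
             threshold: float = URGENCY_THRESHOLD) -> str:
    """Return 'emergency' when the score reaches the threshold, else 'general'."""
    score = urgency(intent, state, location)
    return "emergency" if score >= threshold else "general"
```

With these assumed values, a steering failure on a highway while driving is classified as an emergency, while the same failure in a parking lot while parked is classified as general.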

In the step of providing the guidance, when the speaker has made a plurality of utterances, guidance corresponding to each utterance may be provided according to the degree of urgency among the plurality of utterances.
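One way to read this step is that the utterances are handled in descending order of urgency. The sketch below assumes each utterance already carries a numeric urgency score; the function name `order_by_urgency` and the scores in the usage example are hypothetical.

```python
from typing import List, Tuple


def order_by_urgency(utterances: List[Tuple[str, float]]) -> List[str]:
    """Return utterance texts sorted from most to least urgent.

    Each item is (text, urgency score). Python's sort is stable, so
    utterances with equal scores keep their original spoken order.
    """
    ranked = sorted(utterances, key=lambda item: item[1], reverse=True)
    return [text for text, _ in ranked]


pending = [
    ("turn on the radio", 0.0),
    ("the steering wheel does not move", 0.9),
    ("the wiper is slow", 0.4),
]
for text in order_by_urgency(pending):
    print(text)
```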

In the step of providing the guidance, when the speaker's utterance is determined to be an emergency request, at least one of an emergency action method, an ARS connection service, and a rescue request service may be provided according to the degree of urgency.
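The selection among the three services could look like the following sketch. The text only says the choice depends on the degree of urgency; the escalation order, the cutoffs 0.6 and 0.85, and the service identifiers are assumptions made for illustration.

```python
from typing import List

# Hypothetical cutoffs; the text does not give concrete values.
ARS_CUTOFF = 0.6
RESCUE_CUTOFF = 0.85


def select_services(urgency: float) -> List[str]:
    """Pick the services to offer for an utterance already judged an emergency."""
    services = ["emergency_action_method"]   # always offered in this sketch
    if urgency >= ARS_CUTOFF:
        services.append("ars_connection")    # connect to an ARS line
    if urgency >= RESCUE_CUTOFF:
        services.append("rescue_request")    # request rescue
    return services
```

Under these assumptions at least one service is always returned, matching the "at least one of" wording.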

Meanwhile, a voice recognition device may include: a voice input unit that receives a voice uttered by a speaker in a vehicle; a voice-to-text conversion unit that converts the received voice into text; a driving state determination unit that determines the current driving state of the vehicle; a natural language understanding unit that analyzes the speaker's utterance intention from the text and determines, based on the determined driving state, whether the speaker's utterance is a general request that calls for an ordinary vehicle action unrelated to the speaker's safety or an emergency request that urgently calls for a vehicle action related to the speaker's safety; and a control unit that provides guidance for handling the vehicle's problem situation according to the general request or the emergency request.

The voice recognition device may further include a location determination unit that determines the current location of the vehicle. The control unit may calculate a degree of urgency according to the correlation among the speaker's utterance, the determined current location, and the driving state, and the natural language understanding unit may classify the speaker's utterance as the general request or the emergency request by comparing the calculated degree of urgency with a preset reference value.

When the speaker has made a plurality of utterances, the control unit may provide guidance corresponding to each utterance according to the degree of urgency among the plurality of utterances.

When the speaker's utterance is determined to be an emergency request, the control unit may provide at least one of an emergency action method, an ARS connection service, and a rescue request service according to the degree of urgency.

Meanwhile, a program stored in a recording medium according to an embodiment of the present invention for achieving the above objects may include program code for executing the above-described method for providing a response in consideration of the driving state of the vehicle.

According to the present invention, an appropriate way of handling a problem situation can be provided based on the utterance of a speaker in the vehicle.

In addition, according to the present invention, the driving state of the vehicle is taken into account to distinguish whether the speaker's utterance is an emergency request that requires urgent action related to the speaker's safety or a general request that does not, and a response that matches the speaker's utterance intention can be provided.

In addition, according to the present invention, when the speaker makes a plurality of utterances, responses are provided according to the degree of urgency of each utterance, so that the speaker can resolve the most urgent problem situation first.

FIG. 1 is a diagram showing an example use of a vehicle voice recognition device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing a vehicle voice recognition device according to an embodiment of the present invention.
FIG. 3 is a block diagram showing a voice processing unit according to an embodiment of the present invention in more detail.
FIGS. 4 and 5 are flowcharts showing a method for providing a response according to an embodiment of the present invention.

The following merely illustrates the principles of the present invention. Those skilled in the art will therefore be able to devise various devices that, although not explicitly described or shown in this specification, embody the principles of the present invention and fall within its concept and scope. In addition, all conditional terms and embodiments listed in this specification are, in principle, expressly intended only to help the concept of the present invention be understood, and should not be understood as limiting the invention to the specifically listed embodiments and conditions.

It should also be understood that every detailed description reciting the principles, aspects, and embodiments of the present invention, as well as specific embodiments, is intended to include their structural and functional equivalents. Such equivalents should be understood to include not only currently known equivalents but also equivalents developed in the future, that is, all elements invented to perform the same function regardless of structure.

Thus, for example, the block diagrams in this specification should be understood as conceptual views of exemplary circuits that embody the principles of the present invention. Similarly, all flowcharts, state transition diagrams, pseudocode, and the like should be understood as representing various processes that can be substantially represented on a computer-readable medium and performed by a computer or processor, whether or not the computer or processor is explicitly shown.

The functions of the various elements shown in the drawings, including functional blocks labeled as processors or similar concepts, may be provided by dedicated hardware as well as by hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, a single shared processor, or a plurality of individual processors, some of which may be shared.

Furthermore, the explicit use of terms such as processor, control, or similar concepts should not be construed as referring exclusively to hardware capable of executing software, and should be understood to implicitly include, without limitation, digital signal processor (DSP) hardware, and ROM, RAM, and non-volatile memory for storing software. Other well-known, commonly used hardware may also be included.

In the claims of this specification, a component expressed as a means for performing a function described in the detailed description is intended to include every way of performing that function, including, for example, a combination of circuit elements that performs the function, or software in any form, including firmware and microcode, combined with appropriate circuitry for executing that software so as to perform the function. Because the invention defined by such claims combines the functions provided by the variously recited means in the manner the claims require, any means capable of providing those functions should be understood as equivalent to those understood from this specification.

The above objects, features, and advantages will become clearer through the following detailed description taken together with the accompanying drawings, so that a person having ordinary skill in the art to which the present invention belongs will be able to practice its technical idea easily. In describing the present invention, a detailed description of known technology related to the invention is omitted where it is judged that it could unnecessarily obscure the gist of the invention.

Hereinafter, various embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram showing an example use of a vehicle voice recognition device according to an embodiment of the present invention. Referring to FIG. 1, a speaker 10 inside a vehicle may input a voice to the voice recognition device (not shown) provided in the vehicle by speaking, in order to use its voice recognition function. Here, the speaker 10 may be any person located inside the vehicle, such as the driver or a passenger.

The speaker's voice may be input to the voice recognition device provided in the vehicle in various driving situations, such as while the vehicle is parked, stopped, or driving.

The utterances that the speaker 10 makes to use the vehicle's voice recognition function can be broadly divided into two types.

The first type consists of utterances that are not affected by the driving state of the vehicle, such as "Turn on the radio (11)", "Cabin temperature 22 degrees (12)", "Give me directions to my home (13)", and "Tell me today's weather (14)"; these may be utterances for simple operations.

If the speaker 10 says something like "Turn on the radio (11)", the voice recognition device recognizes the voice of the speaker 10 and sends a request corresponding to the recognized voice command to the vehicle's ECU (Electronic Control Unit), and the ECU may control the vehicle's radio module so that its power is turned on.

The second type consists of utterances for which the response can differ depending on the driving state of the vehicle, such as "The engine won't start (21)", "The steering wheel won't move (22)", and "The lights won't turn on (23)"; these may be utterances that report that the vehicle is in a problem situation and ask for guidance on a solution.

If the speaker 10 says something like "The steering wheel won't move (22)", the voice recognition device recognizes the utterance, retrieves from a pre-stored database a solution for the problem situation based on the driving state of the vehicle (that is, whether the vehicle is driving, stopped, or parked), and may present the retrieved solution to the speaker 10 by voice or on a display.
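The lookup described above can be sketched as a table keyed by problem and driving state. The schema, the key names, and the placeholder solution strings are assumptions; the text does not describe how the pre-stored database is organized.

```python
from typing import Dict, Optional, Tuple

# Hypothetical pre-stored database: (problem, driving state) -> solution text.
# The solution strings are placeholders, not real guidance.
SOLUTIONS: Dict[Tuple[str, str], str] = {
    ("steering_not_moving", "driving"): "<solution for steering failure while driving>",
    ("steering_not_moving", "stopped"): "<solution for steering failure while stopped>",
    ("steering_not_moving", "parked"): "<solution for steering failure while parked>",
}


def find_solution(problem: str, driving_state: str) -> Optional[str]:
    """Return the stored solution for this problem in this driving state, if any."""
    return SOLUTIONS.get((problem, driving_state))
```

Keying on the pair rather than on the problem alone is what lets the same utterance produce different guidance while driving and while parked.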

For example, the appropriate response when the steering wheel will not move while the vehicle is driving differs from the response when it will not move while the vehicle is stopped or parked, and if the voice recognition device fails to give an accurate response suited to the situation, the safety of the speaker 10 may be put at risk.

Accordingly, the voice recognition device according to the present invention can solve this problem by determining, based on the driving state, whether the utterance received from the speaker in the vehicle is a general request that calls for an ordinary vehicle action unrelated to the speaker's safety or an emergency request that urgently calls for a vehicle action related to the speaker's safety.

The voice recognition device according to the present invention is described in more detail below with reference to the subsequent drawings.

FIG. 2 is a block diagram showing a voice recognition device according to an embodiment of the present invention. Referring to FIG. 2, the voice recognition device 100 may include all or some of a voice processing unit 110, a driving state determination unit 120, a location determination unit 130, and a control unit 140.

The voice processing unit 110 may receive a voice uttered by a speaker in the vehicle and determine, based on the received voice and the driving state of the vehicle, whether the speaker's utterance is a general request that calls for an ordinary vehicle action unrelated to the speaker's safety or an emergency request that urgently calls for a vehicle action related to the speaker's safety.

The operation of the voice processing unit 110 is described with reference to FIG. 3.

FIG. 3 is a block diagram showing a voice processing unit according to an embodiment of the present invention in more detail. Referring to FIG. 3, the voice processing unit 110 may include all or some of a voice input unit 111, a voice-to-text conversion unit 112, a word correction unit 113, a natural language understanding unit 114, a natural language generation unit 115, and a voice synthesis unit 116.

The voice input unit 111 may receive the voice uttered by a speaker in the vehicle. To this end, the voice input unit 111 may include one or more microphones (not shown). The voice input unit 111 may also use various noise removal algorithms to remove noise that arises while the user's uttered voice is being received. Specifically, the voice input unit 111 may include a filter (not shown) that removes noise from the uttered voice, an amplifier (not shown) that amplifies and outputs the signal output from the filter, and the like.

The voice-to-text conversion unit 112 may convert the series of uttered voice signals input through the voice input unit 111 into text. The voice-to-text conversion unit 112 may include an acoustic model, a language model, and a pronunciation model.

The acoustic model may include a statistical model of speech built by training on utterance data from many speakers to learn how phonemes are pronounced. The acoustic model may generate phoneme text based on the speaker's voice input through the voice input unit 111. The acoustic model may be implemented in various ways, such as an HMM (Hidden Markov Model), a GMM (Gaussian Mixture Model)-HMM, or a DNN (Deep Neural Network)-HMM in which a deep neural network is applied to an HMM. As one example, the acoustic model may use MFCCs (mel-frequency cepstral coefficients), the most representative frequency-domain energy estimation method, as features.
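MFCC features rest on warping frequency onto the mel scale before filter-bank energies are taken. The sketch below shows only that warping step and how filter center frequencies are spaced; it uses one widely used form of the mel formula, and the choice of that form, the function names, and the filter count in the test are assumptions rather than details from this document.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mels (a common 2595 * log10 variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(low_hz: float, high_hz: float, n_filters: int) -> List[float]:
    """Center frequencies (Hz) of n_filters filters spaced evenly on the mel scale."""
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(n_filters)]
```

Because the spacing is uniform in mels, the centers are packed densely at low frequencies and spread out at high frequencies.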

The language model may include an algorithm that finds regularities in grammar, syntax, words, and so on within natural language and uses those regularities to increase the accuracy of the target being searched for. A commonly used approach is statistical modeling that computes probability values; it may include expressing the language rules in uttered voice signals input through a large corpus as probabilities and limiting the search space by means of those probabilities. As one example, the language model may use an N-gram model.
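As a concrete instance of the N-gram idea, the sketch below estimates bigram (N = 2) probabilities from a tiny corpus by maximum likelihood and uses them to pick the most likely next word among candidates, which is one simple way to narrow a search. The corpus, the function names, and the absence of smoothing are assumptions for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def train_bigram(corpus: Iterable[str]) -> Tuple[Counter, Counter]:
    """Count context words and word pairs; '<s>' marks the sentence start."""
    contexts: Counter = Counter()
    pairs: Counter = Counter()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split()
        contexts.update(tokens[:-1])            # every token that has a successor
        pairs.update(zip(tokens, tokens[1:]))   # adjacent (previous, word) pairs
    return contexts, pairs


def bigram_prob(contexts: Counter, pairs: Counter, prev: str, word: str) -> float:
    """Maximum-likelihood estimate of P(word | prev); 0.0 for unseen contexts."""
    if contexts[prev] == 0:
        return 0.0
    return pairs[(prev, word)] / contexts[prev]


def pick_best(contexts: Counter, pairs: Counter, prev: str, candidates: List[str]) -> str:
    """Choose the candidate the model finds most probable after `prev`."""
    return max(candidates, key=lambda w: bigram_prob(contexts, pairs, prev, w))
```

A production recognizer would add smoothing for unseen pairs; that is left out here to keep the counting logic visible.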

The pronunciation model is a pronunciation dictionary model that performs grapheme-to-phoneme (G2P) conversion, converting text into the way it is pronounced; a pronunciation dictionary can be built not only for standard pronunciation but also for dialects and the like.

The voice-to-text conversion unit 112 may use a speaker-independent speech recognition model that is not tied to a specific speaker.

Once the voice has been converted into text by the voice-to-text conversion unit 112, the converted text may be input to the word correction unit 113.

In this case, the word correction unit 113 may correct vehicle-related words contained in the text to preset vehicle-related words.

Specifically, the word correction unit 113 may split the text into word units and detect vehicle-related words among the separated words. In doing so, the word correction unit 113 may separate the words by taking the minimal free form, in which at least one word is combined, as the word unit.

The word correction unit 113 may then look up the detected vehicle-related word in a vehicle-related word database to determine whether it is a standard word or a non-standard word.

If the detected vehicle-related word is a standard word, the word correction unit 113 may leave it uncorrected.

If, however, the detected vehicle-related word is a non-standard word, the word correction unit 113 may use the vehicle-related word database to correct it to the standard word.

Here, a standard word is a word commonly used in the vehicle industry, and a non-standard word is a word not commonly used in the vehicle industry.

The vehicle-related word database stores non-standard words matched with their standard words, and the word correction unit 113 may train the vehicle-related word database on a plurality of vehicle-related terms and situation-describing words used inside vehicles.

As one example, when the vehicle's instrument panel is not working properly, a driver who does not know the term "instrument panel" may describe the situation by saying something like "The screen in front of the steering wheel isn't working". The input voice may be converted into text by the voice-to-text conversion unit 112 and input to the word correction unit 113. The word correction unit 113 may then look up the non-standard phrase "screen in front of the steering wheel" in the vehicle-related word database, find the corresponding standard word "instrument panel", and replace the phrase so that the text becomes "The instrument panel isn't working".

As another example, when the wipers of the vehicle are not operating properly, a driver who does not know the word "wiper" may describe the situation by saying "the front glass cleaner is slow." In this case, the input voice may be converted into text by the voice-to-text conversion unit 112 and input to the word correction unit 113. The word correction unit 113 may then look up the non-standard word "front glass cleaner" in the vehicle-related word database, detect the corresponding standard word "wiper," and correct the text to "the wiper is slow."
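The two examples above can be sketched as a phrase substitution. This is a minimal illustration, not the patented implementation: the specification describes a trained database, whereas the sketch below assumes a plain dictionary. The names `NONSTANDARD_TO_STANDARD` and `correct_words` are hypothetical, and the English phrases are illustrative renderings of the Korean examples.

```python
# Hypothetical stand-in for the vehicle-related word database.
# The two entries mirror the examples in the text; the structure is assumed.
NONSTANDARD_TO_STANDARD = {
    "screen in front of the steering wheel": "instrument panel",
    "front glass cleaner": "wiper",
}


def correct_words(text: str, table: dict = NONSTANDARD_TO_STANDARD) -> str:
    """Replace each known non-standard phrase in `text` with its standard word."""
    corrected = text
    # Try longer phrases first so a short phrase never rewrites part of a longer one.
    for phrase in sorted(table, key=len, reverse=True):
        corrected = corrected.replace(phrase, table[phrase])
    return corrected
```

Text that contains no known phrase passes through unchanged, which matches the description that correction applies only when a non-standard word is detected.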

According to the present invention, the word correction unit 113 uses the trained database to perform preprocessing that converts non-standard words into standard words, which can help make the natural language understanding unit 114, described later, more lightweight. This in turn can improve the performance of the natural language understanding unit 114 so that the utterance intention of a speaker in the vehicle is grasped more accurately.

Meanwhile, the text corrected by the word correction unit 113 may be input to the natural language understanding unit 114.

The natural language understanding unit 114 may perform processing to understand the meaning of the text received from the word correction unit 113. Specifically, the natural language understanding unit 114 may perform syntactic analysis or semantic analysis on the text generated as a result of speech recognition to analyze the intention behind the user's utterance.

Here, syntactic analysis may divide the query text into grammatical units (e.g., words, phrases, morphemes) and identify which grammatical element each unit has.

Semantic analysis may be performed using semantic matching, rule matching, formula matching, and the like.

Accordingly, the natural language understanding unit 114 may analyze what intent is expressed by the text generated as a result of recognizing the voice of the speaker in the vehicle. In particular, the natural language understanding unit 114 may analyze the speaker's utterance intention from the text based on the driving state of the vehicle, and determine whether the utterance is a general request, which asks for an ordinary vehicle action unrelated to the speaker's safety, or an urgent request, which asks for an urgent vehicle action related to the speaker's safety. Here, the driving state may be one of a state in which the vehicle is currently driving, a stopped state, and a parked state.

For example, when the text generated as a result of recognizing the speaker's voice is "the steering wheel isn't moving" and the vehicle is driving, the natural language understanding unit 114 may determine that the utterance is an urgent request that asks for an urgent vehicle action related to the speaker's safety. However, when the text is "the steering wheel isn't moving" and the vehicle is parked, the natural language understanding unit 114 may determine that the utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety.

As another example, when the text generated as a result of recognizing the speaker's voice is "the accelerator pedal isn't working" and the vehicle is driving, the natural language understanding unit 114 may determine that the utterance is an urgent request that asks for an urgent vehicle action related to the speaker's safety. However, when the text is "the accelerator pedal isn't working" and the vehicle is parked, the natural language understanding unit 114 may determine that the utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety.
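The steering wheel and accelerator pedal examples can be read as a simple rule: the same utterance is urgent while driving and general while parked. The following is a minimal rule-based sketch under stated assumptions. The keyword tuple `DRIVING_CRITICAL` and the function name are hypothetical, and treating a stopped vehicle like a parked one is an assumption, since the text only covers the driving and parked cases.

```python
from enum import Enum


class DrivingState(Enum):
    DRIVING = "driving"
    STOPPED = "stopped"
    PARKED = "parked"


class RequestType(Enum):
    GENERAL = "general"
    URGENT = "urgent"


# Hypothetical keywords for components that matter to driving safety.
DRIVING_CRITICAL = ("steering wheel", "accelerator pedal")


def classify_by_driving_state(text: str, state: DrivingState) -> RequestType:
    """Urgent only when a driving-critical component is mentioned while driving."""
    mentions_critical = any(keyword in text for keyword in DRIVING_CRITICAL)
    if mentions_critical and state is DrivingState.DRIVING:
        return RequestType.URGENT
    # Assumption: stopped is handled like parked; the text does not specify this.
    return RequestType.GENERAL
```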

In addition, the natural language understanding unit 114 may further consider at least one of the current location of the vehicle and surrounding environment information to determine whether the speaker's utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety or an urgent request that asks for an urgent vehicle action related to the speaker's safety. Here, the surrounding environment information may be received from an external server or stored in advance, and may include the weather around the vehicle, the current time, and the like.

For example, when the text generated as a result of recognizing the speaker's voice is "the lights won't turn on" and the vehicle is currently inside a tunnel, the natural language understanding unit 114 may determine that the utterance is an urgent request that asks for an urgent vehicle action related to the speaker's safety. However, when the text is "the lights won't turn on" and the vehicle is currently outdoors in broad daylight, the natural language understanding unit 114 may determine that the utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety.

As another example, when the text generated as a result of recognizing the speaker's voice is "the heater isn't working" and the temperature around the vehicle is 20 degrees below zero, the natural language understanding unit 114 may determine that the utterance is an urgent request that asks for an urgent vehicle action related to the speaker's safety. However, when the text is "the heater isn't working" and the temperature around the vehicle is 20 degrees above zero, the natural language understanding unit 114 may determine that the utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety.
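The tunnel and temperature examples can be sketched in the same rule-based style. The sketch below is only an illustration: the `Surroundings` fields, the keyword checks, and the `cold_limit` threshold are all assumptions. The text gives only the two end points (20 degrees below zero is urgent, 20 degrees above zero is general) and does not state where the boundary lies or which temperature unit is used.

```python
from dataclasses import dataclass


@dataclass
class Surroundings:
    """Hypothetical container for location and environment information."""
    in_tunnel: bool = False
    outside_temp: float = 20.0  # degrees; the unit is an assumption


def is_urgent_given_surroundings(text: str, env: Surroundings,
                                 cold_limit: float = -10.0) -> bool:
    """Return True when the surroundings make the reported problem safety-related."""
    # Lights failing inside a tunnel: urgent in the text's example.
    if "light" in text and env.in_tunnel:
        return True
    # Heater failing in severe cold: urgent in the text's example.
    # cold_limit is a hypothetical boundary between the two stated temperatures.
    if "heater" in text and env.outside_temp <= cold_limit:
        return True
    return False
```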

In addition, the natural language understanding unit 114 may further consider the type of vehicle function module selected by the speaker to determine whether the speaker's utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety or an urgent request that asks for an urgent vehicle action related to the speaker's safety. Here, the vehicle function module may be a concept that includes every unit module making up a function of the vehicle, such as the air conditioner, transmission, steering wheel, infotainment, audio, cluster, turn signals, windows, and seats.

For example, when the texts generated as a result of recognizing the speaker's voice are "the steering wheel is acting strange" and "the air conditioner is acting strange," the natural language understanding unit 114 may determine that the function module "steering wheel" relates to driving the vehicle and is therefore highly relevant to safety, and that "air conditioner" serves the cabin environment and therefore has low relevance to safety. That is, the natural language understanding unit 114 may determine that "the steering wheel is acting strange" is an urgent request that asks for an urgent vehicle action related to the speaker's safety, and that "the air conditioner is acting strange" is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety.

Specifically, the driving state determination unit 120 and the position determination unit 130 of the voice recognition device 100 may determine the driving state and the current location of the vehicle, respectively, at the time the speaker speaks, and the control unit 140 of the voice recognition device 100 may calculate a degree of urgency according to the correlation among the driving state of the vehicle, the current location, the surrounding environment information, and the type of vehicle function module mentioned by the speaker. For this calculation, the control unit 140 may use a neural network trained to output the degree of urgency, with speaker utterance texts labeled with the driving state, current location, surrounding environment information, and type of vehicle function module as training data. Here, the degree of urgency is information indicating how urgent the speaker's utterance is, and a higher value may indicate greater urgency.

The neural network may be trained through supervised learning, unsupervised learning, and reinforcement learning.
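To make the idea of a context-dependent urgency score concrete, the sketch below maps labeled context features to a value between 0 and 1. It is deliberately not a trained neural network: it is a single logistic unit with hand-set weights, used only as a stand-in. The feature names, `WEIGHTS`, and `BIAS` are hypothetical and are not taken from the specification.

```python
import math

# Hypothetical hand-set weights standing in for a trained model.
WEIGHTS = {
    ("state", "driving"): 2.0,
    ("state", "parked"): -2.0,
    ("location", "tunnel"): 1.5,
    ("module", "steering wheel"): 1.5,
    ("module", "air conditioner"): -1.0,
}
BIAS = -1.0


def urgency_score(features: dict) -> float:
    """Combine context features into an urgency value in the open interval (0, 1)."""
    # Unknown feature values contribute nothing to the weighted sum.
    z = BIAS + sum(WEIGHTS.get((name, value), 0.0) for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))  # logistic squashing
```

With these illustrative weights, a steering wheel problem scores higher while driving than while parked, which is the qualitative behavior the specification describes.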

The natural language understanding unit 114 may then compare the calculated urgency value with a preset reference value to classify the speaker's utterance as a general request or an urgent request. Here, the reference value is the threshold that separates general requests from urgent requests, and may be a variable value determined according to the vehicle type, current location, surrounding environment information, driving state, and the speaker's age and gender.

For example, when the urgency calculated for the speaker's utterance is greater than or equal to the reference value, the natural language understanding unit 114 may determine that the utterance is an urgent request.
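The comparison against a variable reference value can be sketched as follows. The per-state values in `REFERENCE_BY_STATE` and the default are hypothetical; the text says only that the reference value may vary with factors such as the driving state. The `>=` comparison follows the "greater than or equal to" wording of the example above.

```python
# Hypothetical reference values; a lower value makes urgent classification easier.
REFERENCE_BY_STATE = {"driving": 0.4, "stopped": 0.6, "parked": 0.7}


def classify_request(urgency: float, driving_state: str,
                     default_reference: float = 0.5) -> str:
    """Return "urgent" when urgency meets or exceeds the state's reference value."""
    reference = REFERENCE_BY_STATE.get(driving_state, default_reference)
    return "urgent" if urgency >= reference else "general"
```

Under these assumed values, the same urgency of 0.5 is classified as urgent while driving and as general while parked.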

That is, the natural language understanding unit 114 may determine the same utterance to be a general request or an urgent request depending on the driving state of the vehicle, the current location, and the surrounding environment information.

Meanwhile, the natural language understanding unit 114 may also classify the speaker's utterance by using a database in which, for each utterance, a general request or an urgent request is matched according to the driving state of the vehicle, the current location, and the surrounding environment information.
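The database alternative can be pictured as a table keyed by the utterance and its context. The sketch below is an assumption about one possible shape of such a table: the rows in `REQUEST_TABLE`, the key layout, and the choice to return `None` for unknown combinations are all hypothetical.

```python
from typing import Optional

# Hypothetical rows: (utterance, driving state, location) -> request type.
REQUEST_TABLE = {
    ("the steering wheel isn't moving", "driving", "road"): "urgent",
    ("the steering wheel isn't moving", "parked", "road"): "general",
    ("the lights won't turn on", "driving", "tunnel"): "urgent",
}


def lookup_request(utterance: str, driving_state: str, location: str) -> Optional[str]:
    """Return the stored request type, or None when no row matches."""
    return REQUEST_TABLE.get((utterance, driving_state, location))
```

Returning `None` for an unmatched combination leaves the caller free to fall back to another classification method.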

Meanwhile, the natural language generation unit 115 may generate a response text using a knowledge base, based on the utterance intention analyzed by the natural language understanding unit 114.

The text-to-speech (TTS) unit 116 may generate a spoken response for the natural-language response text produced by the natural language generation unit 115. The speech generated by the text-to-speech unit 116 may be played to the speaker in the vehicle.

Meanwhile, the natural language understanding unit 114 described above may use a natural language understanding model to analyze the speaker's utterance intention and determine whether the utterance is a general request or an urgent request.

Here, the natural language understanding model of the natural language understanding unit 114 according to the present invention may be a speaker-dependent natural language understanding model classified according to the identity of the speaker and whether the speaker is driving.

For example, when two people, "A" and "B," use the vehicle, the natural language understanding unit 114 according to the present invention may distinguish four cases, "A is the driver," "A is a passenger," "B is the driver," and "B is a passenger," and build a speaker-dependent natural language understanding model for each case.

That is, depending on whether the speaker is driving, voice characteristics such as speed, clarity, tone, pitch, loudness, and nuance may differ, and the words, vocabulary, and grammatical constructions the speaker uses may differ as well. These characteristics may also depend on who the speaker is.

Accordingly, the voice recognition device 100 according to the present invention may determine whether the speaker is driving based on the position from which the utterance originated, and may identify the speaker based on the voice, pitch, tone, and the like of the utterance. The natural language understanding unit 114 may then analyze the utterance intention of the speaker in the vehicle using the speaker-dependent natural language understanding model classified according to the identity of the speaker and whether the speaker is driving. As a result, the natural language understanding unit 114 according to the present invention can grasp the utterance intention of the speaker in the vehicle more accurately.
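One way to picture the speaker-dependent models is a registry keyed by speaker identity and driving role, from which a model is selected once the speaker and role are known. The sketch below assumes stub models that merely tag the text. `make_stub_model`, `build_registry`, and `select_model` are hypothetical names, and real models, speaker identification, and seat-position detection are out of scope here.

```python
from typing import Callable, Dict, Iterable, Tuple

Model = Callable[[str], str]


def make_stub_model(label: str) -> Model:
    """Create a placeholder model that tags the text with its label."""
    def model(text: str) -> str:
        return f"{label}:{text}"
    return model


def build_registry(speakers: Iterable[str]) -> Dict[Tuple[str, bool], Model]:
    """Build one model per (speaker, is_driver) combination."""
    registry: Dict[Tuple[str, bool], Model] = {}
    for speaker in speakers:
        for is_driver in (True, False):
            role = "driver" if is_driver else "passenger"
            registry[(speaker, is_driver)] = make_stub_model(f"{speaker}-{role}")
    return registry


def select_model(registry: Dict[Tuple[str, bool], Model],
                 speaker_id: str, is_driver: bool) -> Model:
    """Pick the model for the identified speaker and driving role."""
    return registry[(speaker_id, is_driver)]
```

For two speakers this yields four models, matching the A and B example in the text.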

In addition, because the number of people who use a given vehicle is limited, the present invention makes it possible to build a model that is optimized for the in-vehicle usage environment while also improving accuracy.

Meanwhile, the control unit 140 of the voice recognition device 100 may provide guidance corresponding to the speaker's utterance as classified by the natural language understanding unit 114.

Specifically, when the speaker's utterance is determined to be a general request, the control unit 140 may provide the speaker with guidance for handling the general request, and when the speaker's utterance is determined to be an urgent request, the control unit 140 may provide the speaker with guidance for handling the urgent request.

For example, when the speaker's utterance "the engine won't start" is determined to be a general request, the control unit 140 may explain how to start the engine, and when the speaker's utterance is determined to be an urgent request, the control unit 140 may, depending on the degree of urgency, offer at least one of an emergency action method, an ARS connection service, and a rescue request service.

For example, in the case of an urgent request, the control unit 140 may classify the request into one of a plurality of urgency levels according to the degree of urgency. For the first urgency level, which is the lowest, the control unit 140 may provide an emergency action method. For the second urgency level, which is higher than the first, it may automatically connect the speaker to an emergency response agent via ARS. For the third urgency level, which is the highest, it may automatically place a call to an emergency rescue service such as 119.
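The three-level handling of urgent requests can be sketched as a simple dispatch on the urgency value. The cut-off values `level2_at` and `level3_at` and the returned action labels are hypothetical; the text defines the ordering of the three levels but not their numeric boundaries.

```python
def emergency_action(urgency: float, level2_at: float = 0.6,
                     level3_at: float = 0.85) -> str:
    """Map an urgent request's urgency value to one of three escalating actions."""
    if urgency >= level3_at:
        # Third (highest) level: call an emergency rescue service.
        return "call_rescue_service"
    if urgency >= level2_at:
        # Second level: connect to an emergency response agent via ARS.
        return "connect_ars_agent"
    # First (lowest) level: present an emergency action method.
    return "provide_emergency_instructions"
```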

Meanwhile, when the speaker makes a plurality of utterances, the voice recognition device 100 according to the present invention may provide the guidance corresponding to each utterance according to its degree of urgency.

Specifically, the control unit 140 of the voice recognition device 100 may calculate the degree of urgency for each of the speaker's utterances and provide the guidance corresponding to each utterance in descending order of the calculated urgency.

For example, when the text generated as a result of recognizing the speaker's voice is "the engine won't start and the heater isn't working," and the urgency value of "the engine won't start" is higher than the urgency value of "the heater isn't working," the control unit 140 may first guide the speaker through the handling method for "the engine won't start" and then guide the speaker through the handling method for "the heater isn't working."
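Ordering several utterances by urgency amounts to a descending sort on the calculated values. A minimal sketch, assuming each utterance has already been split out and scored; the function name and the scores used in the usage example are hypothetical.

```python
from typing import List, Tuple


def order_by_urgency(scored: List[Tuple[str, float]]) -> List[str]:
    """Return utterances from most to least urgent.

    `scored` holds (utterance, urgency) pairs. Python's sort is stable,
    so utterances with equal urgency keep their spoken order.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [utterance for utterance, _ in ranked]


# Usage with made-up scores: the engine problem is handled first.
queue = order_by_urgency([
    ("the heater isn't working", 0.3),
    ("the engine won't start", 0.7),
])
```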

When several situations occur at the same time in this way, an appropriate response can be provided to the speaker by presenting the handling methods in order of the urgency of the corresponding utterances.

Hereinafter, a response providing method according to an embodiment of the present invention will be described with reference to the flowcharts in the following drawings. The order described here is only an example of the present invention, and the invention is not limited to that order.

FIG. 4 is a flowchart schematically illustrating a response providing method according to an embodiment of the present invention. Referring to FIG. 4, a voice uttered by a speaker in the vehicle may first be received (S110).

The received voice may then be converted into text (S120). Specifically, in step S120, the sequence of input speech signals may be converted into text using an acoustic model, a language model, and a pronunciation model.

Next, the speaker's utterance intention may be analyzed from the text, and, based on the driving state of the vehicle, it may be determined whether the utterance is a general request that asks for an ordinary vehicle action unrelated to the speaker's safety or an urgent request that asks for an urgent vehicle action related to the speaker's safety (S130). In the determining step S130, the speaker's intention may be analyzed using a natural language understanding model, which may be a speaker-dependent natural language understanding model classified according to the identity of the speaker and whether the speaker is driving.

If the speaker's utterance is determined to be a general request (S140: Y), guidance for handling the general request may be provided to the speaker (S150). For example, a solution to the vehicle's problem may be provided to the speaker by voice or on a display.

However, if the speaker's utterance is determined to be an urgent request (S140: N), guidance for handling the urgent request may be provided to the speaker (S160). For example, in the case of an urgent request, the request may be classified into one of a plurality of urgency levels according to the degree of urgency, and, depending on the level, an emergency action method may be provided, the speaker may be automatically connected to an emergency response agent via ARS, or a call may be automatically placed to an emergency rescue service such as 119.
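The flow of steps S120 through S160 can be sketched as a small pipeline in which each stage is passed in as a function. This is only an outline under stated assumptions: the parameter names are hypothetical, the stages are supplied by the caller, and the `"general"` label is an assumed return value of the classifier.

```python
from typing import Callable


def respond(audio: bytes,
            speech_to_text: Callable[[bytes], str],
            get_driving_state: Callable[[], str],
            classify: Callable[[str, str], str],
            guide_general: Callable[[str], str],
            guide_urgent: Callable[[str], str]) -> str:
    """Run the received audio (S110) through steps S120 to S160."""
    text = speech_to_text(audio)                 # S120: convert voice to text
    kind = classify(text, get_driving_state())   # S130: general or urgent
    if kind == "general":                        # S140: Y
        return guide_general(text)               # S150: general guidance
    return guide_urgent(text)                    # S160: urgent guidance
```

Passing the stages in as callables keeps the outline independent of any particular speech recognizer or classifier.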

Meanwhile, although not shown in FIG. 5, the method according to an embodiment of the present invention may further include, between steps S120 and S130, detecting vehicle-related words included in the text and correcting the detected vehicle-related words to preset standard words.

FIG. 5 is a flowchart illustrating a response providing method according to an embodiment of the present invention. Referring to FIG. 5, a degree of urgency may be calculated according to the correlation among the driving state of the vehicle, the current location, and the surrounding environment information (S210). Specifically, the degree of urgency may be calculated according to the correlation among the driving state of the vehicle, the current location, the surrounding environment information, and the type of vehicle function module mentioned by the speaker. For this calculation, a neural network may be used that has been trained to output the degree of urgency, with speaker utterance texts labeled with the driving state, current location, surrounding environment information, and type of vehicle function module as training data.

If the calculated urgency value is greater than the preset reference value (S220: Y), the speaker's utterance may be determined to be an urgent request (S230).

However, if the calculated urgency value is less than the preset reference value (S220: N), the speaker's utterance may be determined to be a general request (S240). The method may then proceed from step S140 described above.

According to the present invention, an appropriate handling method for a problem situation can be provided based on the utterance of a speaker in the vehicle.

In addition, according to the present invention, when the speaker makes a plurality of utterances, responses are provided according to the degree of urgency of each utterance, so that the speaker can resolve the most urgent problem first.

Meanwhile, terms such as "first," "second," "third," and "fourth" in the specification and claims, if any, are used to distinguish between similar elements and, though not necessarily, to describe a particular sequence or order of occurrence. Terms so used are to be understood in light of the embodiments of the invention described herein. Likewise, where a method is described herein as comprising a series of steps, the order in which those steps are presented is not necessarily the only order in which they may be performed, any described step may be omitted, and any other step not described herein may be added to the method. For example, a first element may be termed a second element, and similarly a second element may be termed a first element, without departing from the scope of the present invention.

In addition, terms such as "left," "right," "front," "rear," "top," "bottom," "above," and "below" in the specification and claims are used for descriptive purposes and not necessarily to describe fixed relative positions. It will be understood that terms so used are interchangeable under appropriate circumstances, such that the embodiments of the invention described herein can, for example, operate in orientations other than those illustrated or described herein. The term "connected" as used herein is defined as directly or indirectly connected in an electrical or non-electrical manner. Objects described herein as "adjacent" to each other may be in physical contact with each other, in close proximity to each other, or in the same general range or area, as appropriate for the context in which the phrase is used. An occurrence of the phrase "in one embodiment" herein refers to the same embodiment, though not necessarily so.

In addition, in the specification and claims, expressions such as "connected," "connecting," "fastened," "fastening," "coupled," and "coupling," and their various variations, are used to include being directly connected to another element as well as being indirectly connected through another element.

In contrast, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that no other element is present between them.

In addition, the suffixes "module" and "unit" for elements used in this specification are assigned or used interchangeably only for ease of drafting the specification, and do not in themselves have distinct meanings or roles.

In addition, the terms used in this specification are for describing the embodiments and are not intended to limit the present invention. Singular expressions used herein include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "consists of" or "comprises" should not be construed as necessarily including all of the components or steps described in the specification; some of those components or steps may be omitted, or additional components or steps may be included.

The present invention has so far been described with a focus on its preferred embodiments. All embodiments and conditional examples disclosed in this specification are intended to help those of ordinary skill in the technical field of the present invention understand its principles and concepts, and those skilled in the art will understand that the present invention may be implemented in modified forms without departing from its essential characteristics.

Therefore, the disclosed embodiments should be considered in an illustrative rather than a limiting sense. The scope of the present invention is defined by the claims rather than by the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

Meanwhile, the method for providing a response in consideration of the driving state of a vehicle according to the various embodiments of the present invention described above may be implemented as a program and provided to servers or devices. Each device may then access the server or device on which the program is stored and download the program.

In addition, the method according to the various embodiments of the present invention described above may be implemented as a program and provided stored in various non-transitory computer readable media. A non-transitory readable medium is not a medium that stores data for a short time, such as a register, cache, or memory, but a medium that stores data semi-permanently and can be read by a device. Specifically, the various applications or programs described above may be provided stored in a non-transitory readable medium such as a CD, DVD, hard disk, Blu-ray disc, USB, memory card, or ROM.

In addition, although preferred embodiments of the present invention have been shown and described above, the present invention is not limited to the specific embodiments described above. Various modifications may of course be made by those of ordinary skill in the art to which the invention pertains without departing from the gist of the invention as claimed in the claims, and such modifications should not be understood separately from the technical spirit or outlook of the present invention.

100: voice recognition device
110: voice processing unit
120: driving state determination unit
130: position determination unit
140: control unit

Claims

A method for providing a response in consideration of the driving state of a vehicle, the method comprising:
receiving a voice according to an utterance of a speaker in the vehicle;
converting the received voice into text;
determining a current driving state of the vehicle;
determining a current location of the vehicle;
analyzing the speaker's utterance intention from the text and determining, based on the determined driving state of the vehicle, whether the speaker's utterance is a general request requiring a general vehicle action unrelated to the speaker's safety or an emergency request urgently requiring a vehicle action related to the speaker's safety; and
providing guidance for handling a problem situation of the vehicle according to the general request or the emergency request,
wherein the determining of whether the utterance is the general request or the emergency request further considers the current location of the vehicle and surrounding environment information.
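As an illustration only, the steps of the method above can be sketched in Python. The sketch starts from already-transcribed text, so voice capture and speech-to-text are out of scope. The keyword list, the location and surroundings labels, and the rule that combines them are all assumptions made for this example; the claim does not specify how intent analysis or the emergency determination is implemented.

```python
from dataclasses import dataclass
from enum import Enum


class DrivingState(Enum):
    DRIVING = "driving"
    STOPPED = "stopped"
    PARKED = "parked"


class RequestType(Enum):
    GENERAL = "general"
    EMERGENCY = "emergency"


@dataclass
class VehicleContext:
    driving_state: DrivingState
    location: str       # hypothetical label such as "highway"
    surroundings: str   # hypothetical label such as "heavy_rain"


# Hypothetical stand-ins for intent analysis and risk lookup.
SAFETY_KEYWORDS = ("smoke", "fire", "brake", "steering")
HAZARDOUS_LOCATIONS = {"highway", "tunnel"}
HAZARDOUS_SURROUNDINGS = {"heavy_rain", "snow"}


def classify_request(text: str, ctx: VehicleContext) -> RequestType:
    """Classify already-transcribed text as a general or emergency request."""
    safety_related = any(word in text.lower() for word in SAFETY_KEYWORDS)
    # Assumed rule: a safety-related utterance is urgent unless the vehicle
    # is parked in a non-hazardous place under non-hazardous conditions.
    risky_context = (
        ctx.driving_state is not DrivingState.PARKED
        or ctx.location in HAZARDOUS_LOCATIONS
        or ctx.surroundings in HAZARDOUS_SURROUNDINGS
    )
    if safety_related and risky_context:
        return RequestType.EMERGENCY
    return RequestType.GENERAL


def provide_guidance(text: str, ctx: VehicleContext) -> str:
    """Return a guidance message that depends on the request type."""
    if classify_request(text, ctx) is RequestType.EMERGENCY:
        return "Emergency guidance: pull over safely and follow the emergency steps."
    return "General guidance: here is how to handle this request."
```

In this sketch the same utterance can be classified differently depending on the context object passed in, which is the point of taking driving state, location, and surroundings into account.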

The method of claim 1,
wherein the driving state includes at least two of a state in which the vehicle is driving, a state in which the vehicle is stopped, and a state in which the vehicle is parked.

The method of claim 2,
wherein the determining of whether the utterance is the general request or the emergency request includes: calculating a degree of urgency according to a correlation between the speaker's utterance, the determined current location, and the driving state; and
classifying the speaker's utterance as the general request or the emergency request by comparing the calculated degree of urgency with a preset reference value.
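A minimal sketch of the urgency calculation and the comparison with a reference value might look as follows. The score tables, the multiplicative combination, the reference value of 0.5, and the use of "greater than or equal" are all assumptions for illustration; the claim only states that a degree of urgency is calculated from the correlation of the utterance, location, and driving state and then compared with a preset reference value.

```python
# Hypothetical scores and weights; none of these values come from the patent.
UTTERANCE_SCORES = {"smoke": 0.9, "brake": 0.8, "tire": 0.5, "radio": 0.0}
STATE_WEIGHTS = {"driving": 1.0, "stopped": 0.6, "parked": 0.3}
LOCATION_WEIGHTS = {"highway": 1.0, "city_road": 0.7, "parking_lot": 0.4}
REFERENCE_VALUE = 0.5  # assumed preset reference value


def urgency(text: str, driving_state: str, location: str) -> float:
    """Combine utterance, driving state, and location into one urgency score."""
    lowered = text.lower()
    # Take the highest-scoring keyword found in the utterance, or 0.0 if none.
    base = max(
        (score for word, score in UTTERANCE_SCORES.items() if word in lowered),
        default=0.0,
    )
    return base * STATE_WEIGHTS[driving_state] * LOCATION_WEIGHTS[location]


def classify(text: str, driving_state: str, location: str) -> str:
    """Label the utterance by comparing its urgency with the reference value."""
    score = urgency(text, driving_state, location)
    return "emergency" if score >= REFERENCE_VALUE else "general"
```

With these assumed weights, an utterance about smoke while driving on a highway exceeds the reference value, while the same kind of complaint in a parked car in a parking lot does not.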

The method of claim 3,
wherein, when the speaker makes a plurality of utterances,
the providing of the guidance provides guidance corresponding to each utterance according to the degree of urgency among the plurality of utterances.
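One possible reading of this claim is that guidance is delivered most-urgent-first. The sketch below assumes that reading and assumes each utterance already carries an urgency score; the function name and message format are hypothetical.

```python
from typing import List, Tuple


def order_guidance(scored_utterances: List[Tuple[str, float]]) -> List[str]:
    """Return one guidance message per utterance, most urgent first.

    Each input item is (utterance_text, urgency_score). Ties keep their
    original spoken order because Python's sort is stable.
    """
    ranked = sorted(scored_utterances, key=lambda item: item[1], reverse=True)
    return [f"Guidance for: {text}" for text, _ in ranked]
```

For example, `order_guidance([("play music", 0.0), ("smoke from hood", 0.9)])` would place the guidance for the smoke report ahead of the music request.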

The method of claim 2,
wherein, when the speaker's utterance is determined to be an emergency request,
the providing of the guidance provides at least one of an emergency action method, an ARS connection service, and a rescue request service according to the degree of urgency.
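As a hedged illustration, the choice among the three services could be tiered by urgency. The tier boundaries (0.7 and 0.9) and the idea that higher tiers add services on top of lower ones are assumptions; the claim only requires that at least one of the services is provided according to the degree of urgency.

```python
from typing import List

# Hypothetical tier boundaries; the patent does not give numeric values.
ARS_THRESHOLD = 0.7
RESCUE_THRESHOLD = 0.9


def select_emergency_services(urgency: float) -> List[str]:
    """Pick services for an utterance already classified as an emergency."""
    services = ["emergency_action_method"]  # always offered in this sketch
    if urgency >= ARS_THRESHOLD:
        services.append("ars_connection_service")
    if urgency >= RESCUE_THRESHOLD:
        services.append("rescue_request_service")
    return services
```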

A voice recognition device comprising:
a voice input unit that receives a voice according to an utterance of a speaker in a vehicle;
a voice-to-text conversion unit that converts the received voice into text;
a driving state determination unit that determines a current driving state of the vehicle;
a location determination unit that determines a current location of the vehicle;
a natural language understanding unit that analyzes the speaker's utterance intention from the text and determines, based on the determined driving state of the vehicle, whether the speaker's utterance is a general request requiring a general vehicle action unrelated to the speaker's safety or an emergency request urgently requiring a vehicle action related to the speaker's safety; and
a control unit that provides guidance for handling a problem situation of the vehicle according to the general request or the emergency request,
wherein the natural language understanding unit further considers the current location of the vehicle and surrounding environment information to determine whether the speaker's utterance is a general request or an emergency request.

The voice recognition device of claim 6,
wherein the driving state includes a state in which the vehicle is driving, a state in which the vehicle is stopped, and a state in which the vehicle is parked.

The voice recognition device of claim 7,
wherein the control unit calculates a degree of urgency according to a correlation between the speaker's utterance, the determined current location, and the driving state, and
the natural language understanding unit compares the calculated degree of urgency with a preset reference value to classify the speaker's utterance as the general request or the emergency request.

The voice recognition device of claim 8,
wherein, when the speaker makes a plurality of utterances,
the control unit provides guidance corresponding to each utterance according to the degree of urgency among the plurality of utterances.

The voice recognition device of claim 7,
wherein the control unit provides at least one of an emergency action method, an ARS connection service, and a rescue request service according to the degree of urgency when the speaker's utterance is determined to be an emergency request.

A program stored in a computer readable recording medium, the program containing program code for executing the method for providing a response in consideration of the driving state of a vehicle according to any one of claims 1 to 5.