KR101086602B1

KR101086602B1 - Voice recognition system for vehicle and the method thereof

Info

Publication number: KR101086602B1
Application number: KR1020050078792A
Authority: KR
Inventors: 예성수
Original assignee: 현대자동차주식회사
Priority date: 2005-08-26
Filing date: 2005-08-26
Publication date: 2011-11-23
Also published as: KR20070024158A

Abstract

The present invention relates to a voice recognition system for a vehicle and a method thereof. More specifically, a position-tracking camera tracks the position of the user's lips to detect coordinate values for the user's position, the distance to the user is calculated from the detected coordinate values, and the calculated distance value is reflected in the voice recognition system so that the corresponding parameter values are changed before the voice is recognized. As a result, voice input from the microphone is recognized more accurately regardless of where the user is located in the vehicle, which raises the voice recognition rate.

A conventional voice recognition system has the drawback that its recognition rate falls when the user changes or when the user's position in the vehicle changes. By contrast, the invention can recognize voice signals more accurately regardless of the user's height, position, or movement, which improves the reliability of voice control of the vehicle.

Keywords: vehicle, voice recognition device, position tracking, microphone

Description

Voice recognition system for vehicle and method thereof {VOICE RECOGNITION SYSTEM FOR VEHICLE AND THE METHOD THEREOF}

FIG. 1 is an internal block diagram of a conventional voice recognition device for a vehicle.

FIG. 2 is an internal block diagram of a voice recognition system for a vehicle having a position sensing function according to the present invention.

FIG. 3 is a simplified diagram explaining the distance sensing principle of the voice recognition system for a vehicle having a position sensing function according to the present invention.

FIG. 4 is an internal block diagram of a voice recognition device for a vehicle according to the present invention.

FIG. 5 is a flowchart showing the operation of the voice recognition system for a vehicle having a position sensing function according to the present invention.

* Explanation of reference numerals for the main parts of the drawings

10, 200: voice recognition device
20, 210: voice signal input unit
32: digital signal converter
34: frequency analyzer
36: vector quantizer
40, 240: reference pattern storage unit
50, 250: recognition determination unit
100: voice recognition system
110: user
120: position tracking unit
122: first camera
124: second camera
130: distance calculation unit
220: voice signal converter
230: reference pattern adjusting unit

The present invention relates to a voice recognition system for a vehicle and a method thereof. More specifically, it tracks the user's position to detect coordinate values for that position, calculates the distance to the user from the detected coordinate values, and reflects the calculated distance value in the voice recognition system, so that voice input from the microphone can be recognized more accurately regardless of where the user is located in the vehicle.

Voice recognition technology converts the characteristics of the vibrations of speech, such as loudness and pitch, into a digital signal, analyzes it, and compares it with speech data stored in a database so that a processor automatically determines the linguistic meaning of the speech. With the recent development of telematics systems, the technology is being applied to vehicles and is gradually coming into use in a variety of fields.

In general, a voice recognition device used in an automobile lets the driver conveniently operate, by voice, the various peripheral devices that provide safety and convenience while driving, such as the power windows, wipers, hazard lamps, air conditioner, and audio system.

In such a device, the voice command spoken by the driver into the microphone is converted into an analog electrical signal, a predetermined preprocessing step required for voice recognition is performed, and the preprocessed voice signal is separated into silent sections and voice sections for analysis.

The signal of each voice section separated in this way is divided into frames of approximately 10 ms, and parameters expressing the characteristics of the speech are extracted for each frame. Using the extracted parameters, the frequency bands that exceed a preset reference region are extracted as the voice signal and recognized as the voice signal of the user's command.
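The framing and silence/voice separation described above can be illustrated with a minimal Python sketch. The source states only the approximate 10 ms frame length; the 16 kHz sampling rate, the short-time energy criterion, the threshold, and the function names are assumptions made for illustration.

```python
def split_frames(samples, sample_rate=16000, frame_ms=10):
    """Split samples into non-overlapping frames of frame_ms milliseconds.

    sample_rate is an assumed value; the source gives only the ~10 ms frame length.
    """
    frame_len = sample_rate * frame_ms // 1000
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def label_frames(samples, threshold, sample_rate=16000, frame_ms=10):
    """Return one boolean per frame: True for a voice frame, False for silence.

    The energy threshold is a hypothetical stand-in for the source's
    unspecified "preset reference region".
    """
    return [frame_energy(f) > threshold
            for f in split_frames(samples, sample_rate, frame_ms)]
```

For example, `label_frames([0.0] * 160 + [0.5] * 160 + [0.0] * 160, threshold=0.01)` marks only the middle 10 ms frame as voice.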

Hereinafter, the structure and operation of a general voice recognition device for a vehicle will be described with reference to the accompanying drawings.

Referring to FIG. 1, the conventional voice recognition device (10) for a vehicle comprises: a voice signal input unit (20) that picks up the user's voice and outputs it as an electrical signal; a digital signal converter (32) that filters the analog voice signal input through the voice signal input unit (20) and converts it into a digital voice signal; a frequency analyzer (34) that receives the digitized voice signal from the digital signal converter (32), analyzes its frequency content, and outputs the analyzed voice data; a vector quantizer (36) that receives the voice data analyzed by the frequency analyzer (34) and uses a codebook to turn it into a test pattern; a reference pattern storage unit (40) in which reference patterns of voice commands are stored in advance; and a recognition determination unit (50), based on dynamic time warping (DTW), that receives the test pattern output from the vector quantizer (36), compares it with the reference patterns stored in the reference pattern storage unit (40), and outputs the matched word.

The conventional voice recognition device (10) configured as above operates as follows.

First, when the power required for operation is applied, the analog voice signal input through the voice signal input unit (20), such as a microphone, is converted into a digital voice signal by the digital signal converter (32) and then output by the frequency analyzer (34) as frequency-analyzed voice data.

Next, the vector quantizer (36) uses the codebook to turn the frequency-analyzed voice data into a test pattern and outputs it. The reference patterns of the voice commands are stored in advance in the reference pattern storage unit (40). The recognition determination unit (50) receives the test pattern output from the vector quantizer (36), calculates its distance from each reference pattern stored in the reference pattern storage unit (40), and outputs as the matched word the word whose accumulated distance over the whole pattern is smallest, which completes the recognition.
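As a rough illustration of the matching step above, the following Python sketch quantizes feature vectors against a codebook and then picks the reference word with the smallest accumulated DTW distance. The source does not give the feature type, the local distance measure, or the DTW step pattern; the Euclidean distance between codewords, the standard three-way step pattern, and all function names are assumptions.

```python
def _sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(frames, codebook):
    """Vector quantization: map each feature vector to its nearest codeword index."""
    return [min(range(len(codebook)), key=lambda k: _sqdist(f, codebook[k]))
            for f in frames]


def dtw_distance(test, ref, codebook):
    """Accumulated dynamic time warping distance between two index sequences."""
    inf = float("inf")
    n, m = len(test), len(ref)
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost: distance between the two codewords (an assumed choice).
            cost = _sqdist(codebook[test[i - 1]], codebook[ref[j - 1]]) ** 0.5
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def recognize(test, references, codebook):
    """Return the word whose reference pattern has the smallest accumulated distance."""
    return min(references, key=lambda w: dtw_distance(test, references[w], codebook))
```

Here `references` is a hypothetical dictionary mapping each command word to its stored index sequence, playing the role of the reference pattern storage unit (40).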

With the voice recognition method described above, the driver's voice command is recognized in a quiet environment. When the vehicle travels above a certain speed, however, engine noise and ambient noise enter together with the driver's voice command, so the command cannot be recognized normally, which causes a malfunction or prevents the operation from being executed. To address this noise problem, recent systems group the test patterns output from the vector quantizer (36) under representative values of similar patterns, attach an index to each representative value, and add an index comparison means that receives the output index sequence and detects and outputs only the voice indices.

Although prior development aimed at applying voice recognition technology to vehicles is under way in various fields, the conventional approaches share the problem that the recognition rate is not guaranteed when the user moves or when the user changes.

Specifically, the distance between the microphone and the speaker is very important for voice recognition. In a vehicle, this distance value is calculated in advance and fixed when the system is installed, so the recognition rate changes when the user moves, the seat position changes, or a different user takes the seat. This makes it difficult to provide a stable voice recognition system.

The present invention is proposed to solve the above problems. Its object is to provide a voice recognition system for a vehicle and a method thereof that use a position-tracking camera to track the position of the user's lips, detect coordinate values for the user's position, calculate the distance to the user from the detected coordinate values, and reflect the calculated distance value in the voice recognition system so that the corresponding parameter values are changed before the voice is recognized. Voice input from the microphone is thereby recognized more accurately regardless of where the user is located in the vehicle, which raises the voice recognition rate.

In addition, whereas a conventional voice recognition system has the drawback that its recognition rate falls when the user changes or when the user's position in the vehicle changes, the system according to the present invention can recognize voice signals more accurately regardless of the user's position, height, or movement, which improves the reliability of voice control of the vehicle.

The calculated distance value can also be reflected when setting the parameter values input to a noise cancellation circuit, which is widely used in voice recognition, so that the noise cancellation circuit performs better.

To achieve the above object, the present invention describes a voice recognition system for a vehicle and proposes the internal components of the system and device that carry out its processes.

For voice recognition in a vehicle, the method of the present invention comprises: a step in which the position tracking unit senses the position of a part of the user's body and extracts the corresponding coordinate values; a step in which the distance calculation unit calculates the distance to the user from the coordinate values extracted by the position tracking unit; a step in which the reference pattern adjusting unit receives the calculated distance value, updates the reference pattern, and stores the adjusted reference pattern in the reference pattern storage unit; and a step in which, when the user's voice is detected, the voice is input, converted into a pattern of a given form, and compared with the stored reference pattern, and the corresponding command is output.

Preferably, the calculated distance value may also be reflected when setting the parameter values input to a noise cancellation circuit provided in the vehicle.

To carry out these steps, the present invention comprises: a position tracking unit (120) that senses the position of a specific part of the user's body and extracts the corresponding coordinate values; a distance calculation unit (130) that calculates the distance to the user from the coordinate values extracted by the position tracking unit; and a voice recognition device (200) that receives the distance value calculated by the distance calculation unit, updates and stores the parameter values corresponding to the reference pattern, receives the user's voice, compares it with the reference pattern, and outputs the corresponding command.

Preferably, the voice recognition device (200) may comprise: a voice signal input unit (210), such as a microphone, that picks up the voice and outputs it as an electrical signal; a voice signal converter (220) that converts the analog voice signal input through the voice signal input unit into a test pattern that can be compared with a reference pattern; a reference pattern adjusting unit (230) that receives the distance value calculated by the distance calculation unit and updates the values of the reference pattern stored in the reference pattern storage unit; a reference pattern storage unit (240) in which reference patterns of voice commands are stored in advance; and a recognition determination unit (250) that receives the test pattern output from the voice signal converter, compares it with the reference patterns stored in the reference pattern storage unit, and outputs the matched word.

In addition, the voice signal converter (220) is characterized by comprising: a digital signal converter that converts the analog voice signal input through the voice signal input unit into a digital voice signal; a frequency analyzer that receives the voice signal converted by the digital signal converter, analyzes its frequency content, and outputs the result; and a vector quantizer that receives the voice data analyzed by the frequency analyzer, turns it into a test pattern, and outputs it.

Hereinafter, the most preferred embodiment of the present invention will be described in detail with reference to the accompanying drawings.

Referring to FIG. 2, the voice recognition system (100) for a vehicle according to the present invention comprises: a position tracking unit (120) that senses the position of a specific part of the body of the user (110) and extracts coordinate values for the corresponding current position; a distance calculation unit (130) that calculates the distance between the user (110) and the position tracking unit (120) from the coordinate values extracted by the position tracking unit (120); and a voice recognition device (200) that receives the distance value calculated by the distance calculation unit (130), updates the parameter values corresponding to the reference pattern, and, when a voice signal is input from the user (110), compares it with the reference pattern and outputs the corresponding command.

The position tracking unit (120) may consist of two cameras, a first camera (122) and a second camera (124), located where the vehicle ceiling above the driver's seat meets the front window so that the position can be tracked. In one embodiment of the present invention, the specific body part may be the lips, in which case the first camera (122) and the second camera (124) continuously track the red color of the lips of the user (110) to extract coordinate values for the current position of the user (110).

Referring to FIG. 3 in more detail, the positions of the first camera (122) and the second camera (124) are fixed, so the distance (ℓ) between the two cameras is constant. In addition, two-dimensional angle measurement gives the angle (α) between the first camera (122) and the lips of the user (110) and the angle (β) between the second camera (124) and the lips of the user (110), so coordinate values for the position of the user (110) can be extracted.

The distance calculation unit (130) receives the coordinate values from the first camera (122) and the second camera (124) and calculates the distance to the lips of the user (110). Because this distance value reflects the user's height, movement, and position, the system according to the present invention can always track the user's lips and calculate an accurate distance parameter, which allows a higher voice recognition rate.
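The geometry described for FIG. 3 can be sketched as a planar triangulation. The source does not define the angle convention or the coordinate frame, so the sketch below assumes that the first camera sits at the origin, the second camera sits at (ℓ, 0), and α and β are the interior angles (in radians) between the camera baseline and each camera's line of sight to the lips. The function names and the optional reference point are hypothetical.

```python
import math


def locate_lips(baseline, alpha, beta):
    """Triangulate the lip position (x, y) from two bearing angles.

    Assumed frame: camera 1 at (0, 0), camera 2 at (baseline, 0);
    alpha and beta are measured from the baseline at camera 1 and camera 2.
    From tan(alpha) = y / x and tan(beta) = y / (baseline - x).
    """
    ta, tb = math.tan(alpha), math.tan(beta)
    x = baseline * tb / (ta + tb)
    y = baseline * ta * tb / (ta + tb)
    return x, y


def distance_to(point, origin=(0.0, 0.0)):
    """Euclidean distance from a reference point (camera 1 by default) to the lips."""
    return math.hypot(point[0] - origin[0], point[1] - origin[1])
```

With cameras 1.0 apart and both angles equal to 45 degrees, the lips are located at (0.5, 0.5), about 0.707 from the first camera, which agrees with the law of sines, ℓ·sin β / sin(α + β).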

The voice recognition device (200) receives the distance value calculated by the distance calculation unit (130), updates the parameter values corresponding to the reference pattern, then receives the voice of the user (110), converts it into a suitable pattern, compares it with the reference pattern, and outputs the corresponding command.

In more detail, FIG. 4 is an internal block diagram of the voice recognition device (200) for a vehicle according to the present invention.

Referring to FIG. 4, the voice recognition device (200) according to the present invention comprises: a voice signal input unit (210), such as a microphone, that picks up the voice and outputs it as an electrical signal; a voice signal converter (220) that converts the analog voice signal input through the voice signal input unit (210) into a test pattern that can be compared with a reference pattern; a reference pattern storage unit (240) in which reference patterns of voice commands are stored in advance; and a recognition determination unit (250) that receives the signal output from the voice signal converter (220), converts it, compares it with the reference patterns stored in the reference pattern storage unit, and outputs the matched word. In addition, according to the present invention, the device further includes a reference pattern adjusting unit (230) that receives the distance to the user calculated by the distance calculation unit (130) and updates the parameter values stored in the reference pattern storage unit (240).

The voice signal converter (220) may comprise: a digital signal converter that converts the analog voice signal input through the voice signal input unit into a digital voice signal; a frequency analyzer that receives the voice signal converted by the digital signal converter, analyzes its frequency content, and outputs the result; and a vector quantizer that receives the voice data analyzed by the frequency analyzer, turns it into a test pattern, and outputs it.

The reference pattern adjusting unit (230) receives the distance value between the voice signal input unit (210) and the user calculated by the distance calculation unit (130), updates the values of the corresponding reference pattern, and then stores the adjusted reference pattern in the reference pattern storage unit (240).
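The source does not state how the reference pattern values depend on the distance. Purely as an illustration of what such an adjustment could look like, the sketch below assumes that the stored values are amplitude-like and that amplitude falls off in inverse proportion to distance, as for a point source in a free field. A vehicle cabin is not a free field, so both the model and the function name are hypothetical.

```python
def adjust_reference(reference, calib_distance, current_distance):
    """Rescale amplitude-like reference values for a new speaker distance.

    Assumption (not from the source): amplitude scales as 1 / distance,
    and the stored values were calibrated at calib_distance.
    """
    if calib_distance <= 0 or current_distance <= 0:
        raise ValueError("distances must be positive")
    gain = calib_distance / current_distance
    return [value * gain for value in reference]
```

For instance, a pattern calibrated at 0.5 m would be halved when the measured distance is 1.0 m. A real system would more likely adjust spectral or model parameters, which the source leaves unspecified.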

In addition, the present invention is characterized in that the calculated distance value can also be reflected when setting the parameter values input to a noise cancellation circuit, which is widely used in voice recognition, so that the noise cancellation circuit performs better.

Referring to FIG. 5, first, when the position tracking unit (120) detects the lips of the user (110), it tracks them (S102) and extracts the corresponding coordinate values (S104). The position tracking unit (120) may consist of two cameras capable of tracking a position, a first camera (122) and a second camera (124). The first camera (122) and the second camera (124) continuously track the red color of the lips of the user (110) to extract coordinate values for the current position of the user (110).

Here, the distance (ℓ) between the first camera (122) and the second camera (124) is constant, and two-dimensional angle measurement gives the angle (α) between the first camera (122) and the lips of the user (110) and the angle (β) between the second camera (124) and the lips of the user (110), so coordinate values for the position of the user (110) can be extracted.

The distance calculation unit (130) receives the extracted coordinate values for the current position and compares them with the previously extracted coordinate values to check whether the lip position of the user (110) has changed (S106).

If the position of the user (110) is determined to have changed, the distance calculation unit (130) calculates the distance between the camera and the user from the extracted coordinate values (S108). The calculated distance value is passed to the reference pattern adjusting unit (230) of the voice recognition device (200) (S110), and the reference pattern adjusting unit (230) receives the calculated distance value, updates the corresponding reference pattern, and stores the adjusted reference pattern in the reference pattern storage unit (240) (S112).
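Steps S106 to S112 amount to recomputing the distance only when the tracked position has changed. A minimal sketch of that control flow follows; the class name, the callback interface, the change tolerance, and the sensor position are assumptions, since the source does not say how a change is detected.

```python
import math


class ReferenceUpdater:
    """Triggers a reference pattern update only when the lip position changes."""

    def __init__(self, on_update, tolerance=0.01):
        self.on_update = on_update      # hypothetical hook standing in for unit (230)
        self.tolerance = tolerance      # assumed threshold for "position changed"
        self.last_position = None

    def observe(self, position, sensor=(0.0, 0.0)):
        """Process one tracked position; return True if an update was triggered."""
        # S106: compare with the previously extracted coordinates.
        if (self.last_position is not None
                and math.dist(position, self.last_position) <= self.tolerance):
            return False
        self.last_position = position
        # S108 to S112: compute the distance and hand it to the adjusting step.
        self.on_update(math.dist(position, sensor))
        return True
```

The first observation always triggers an update because there is no earlier position to compare against. `math.dist` requires Python 3.8 or later.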

When the user's voice is detected, the voice signal is input through the voice signal input unit (210), such as a microphone (S114). The digital signal converter of the voice signal converter (220) converts the input analog voice signal into a digital signal, the frequency analyzer outputs the converted digital signal as frequency-analyzed voice data, and the vector quantizer outputs that data as a test pattern of a given format (S116).

The recognition determination unit (250) compares the reference patterns of the voice commands stored in advance in the reference pattern storage unit (240) with the test pattern output from the vector quantizer (S118), calculates the distance from each reference pattern, and outputs as the matched word the word whose accumulated distance over the whole pattern is smallest (S120). This completes a voice recognition process that can raise the recognition rate regardless of the user's position, height, or movement.

Although the present invention has been described in detail using a preferred embodiment, its scope is not limited to that specific embodiment and should be interpreted according to the appended claims. Those of ordinary skill in the art should also understand that many modifications and variations are possible without departing from the scope of the present invention.

As described above, the present invention uses a position-tracking camera to track the position of the user's lips, detects coordinate values for the user's position, calculates the distance to the user from the detected coordinate values, and reflects the calculated distance value in the voice recognition system so that the corresponding parameter values are changed before the voice is recognized. This has the effect that voice input from the microphone is recognized more accurately regardless of where the user is located in the vehicle, which raises the voice recognition rate.

Whereas a conventional voice recognition system has the drawback that its recognition rate falls when the user changes or when the user's position in the vehicle changes, the system described above can recognize voice signals more accurately regardless of the user's position, which has the advantageous effect of improving the reliability of voice control of the vehicle.

In addition, the calculated distance value can be reflected when setting the parameter values input to a noise cancellation circuit, which is widely used in voice recognition, which has the advantageous effect of improving the performance of the noise cancellation circuit.

Claims

A voice recognition system comprising:

a position tracking unit (120) that detects a specific body position of a user and extracts a coordinate value corresponding thereto;

a distance calculation unit (130) that calculates a distance value to the user from the coordinate value extracted by the position tracking unit; and

a voice recognition device (200) that receives the distance value calculated by the distance calculation unit, updates and stores a variable value of a reference pattern accordingly, receives the user's voice, compares it with the reference pattern, and outputs a corresponding command,

wherein the voice recognition device (200) comprises:

a voice signal input unit (210) having a microphone that picks up a voice and converts it into an electrical signal;

a voice signal conversion unit (220) that converts the analog voice signal input through the voice signal input unit into a test pattern that can be compared with the reference pattern;

a reference pattern storage unit (240) in which reference patterns of voice commands are stored in advance;

a reference pattern adjusting unit (230) that receives the distance value calculated by the distance calculation unit and updates the value of the reference pattern stored in the reference pattern storage unit; and

a recognition determination unit (250) that receives the test pattern output from the voice signal conversion unit, compares it with the reference pattern stored in the reference pattern storage unit, and outputs a matched word.

(Deleted)

The voice recognition system of claim 1, wherein the voice signal conversion unit (220) comprises:

a digital signal converter that converts the analog voice signal input through the voice signal input unit into a digital voice signal;

a frequency analyzer that receives the voice signal converted by the digital signal converter, analyzes its frequency content, and outputs the result; and

a vector quantizer that receives the voice data analyzed by the frequency analyzer, forms it into a test pattern, and outputs the test pattern.
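The three stages recited above (digital conversion, frequency analysis, vector quantization) can be sketched as follows. This is an illustrative sketch only: the claim does not name a quantization depth, a transform, a frame length, or a codebook, so the 8-bit quantizer, the plain discrete Fourier transform, and the names `quantize_samples`, `dft_magnitudes`, `vector_quantize`, and `to_test_pattern` are all assumptions.

```python
import cmath
import math


def quantize_samples(analog, levels=256):
    """Stand-in for the digital signal converter: map floats in [-1, 1]
    to signed integer codes, clamped to the available range."""
    half = levels // 2
    return [max(-half, min(half - 1, int(round(s * half)))) for s in analog]


def dft_magnitudes(frame):
    """Stand-in for the frequency analyzer: normalized magnitude of the
    first len(frame) // 2 DFT bins of one frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))) / n
        for k in range(n // 2)
    ]


def vector_quantize(feature, codebook):
    """Stand-in for the vector quantizer: index of the codebook vector
    closest to the feature vector (squared Euclidean distance)."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sqdist(feature, codebook[i]))


def to_test_pattern(analog, frame_len, codebook):
    """Convert an analog signal into a test pattern: one codebook index
    per non-overlapping frame."""
    digital = quantize_samples(analog)
    pattern = []
    for start in range(0, len(digital) - frame_len + 1, frame_len):
        frame = digital[start:start + frame_len]
        pattern.append(vector_quantize(dft_magnitudes(frame), codebook))
    return pattern


# Hypothetical usage: two 8-sample frames, each a pure tone in a different bin.
tone1 = [math.sin(2 * math.pi * 1 * t / 8) for t in range(8)]
tone2 = [math.sin(2 * math.pi * 2 * t / 8) for t in range(8)]
codebook = [[0.0, 64.0, 0.0, 0.0], [0.0, 0.0, 64.0, 0.0]]
pattern = to_test_pattern(tone1 + tone2, 8, codebook)
```

The resulting sequence of codebook indices plays the role of the test pattern that is later compared with the stored reference patterns.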

The voice recognition system of claim 1,

wherein the body position of the user is the lips.

The voice recognition system of claim 1,

wherein the position tracking unit (120) comprises a first camera and a second camera and calculates the coordinate value of the user's body position using two-dimensional angle measurement.
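One common way to obtain a coordinate from two angle measurements is planar triangulation, sketched below in Python. The claim does not describe the camera geometry, so everything here is an assumption made for illustration: camera A sits at the origin, camera B sits on the x-axis at a known baseline, each camera reports the angle between the baseline and its line of sight to the lips, and the microphone position is known. The names `triangulate_2d` and `distance_to_microphone` are hypothetical.

```python
import math


def triangulate_2d(baseline_m, angle_a_rad, angle_b_rad):
    """Locate a target seen by two cameras on a common baseline.

    Camera A is at (0, 0) and camera B at (baseline_m, 0). Each angle is
    the interior angle of the triangle A-B-target at that camera.
    Returns the target's (x, y) in metres.
    """
    if angle_a_rad <= 0 or angle_b_rad <= 0:
        raise ValueError("angles must be positive")
    gamma = math.pi - angle_a_rad - angle_b_rad  # interior angle at the target
    if gamma <= 0:
        raise ValueError("lines of sight do not intersect in front of the cameras")
    # Law of sines: the side opposite angle B is the range from camera A.
    range_a = baseline_m * math.sin(angle_b_rad) / math.sin(gamma)
    return (range_a * math.cos(angle_a_rad), range_a * math.sin(angle_a_rad))


def distance_to_microphone(target_xy, microphone_xy):
    """Euclidean distance between the tracked point and the microphone."""
    return math.hypot(target_xy[0] - microphone_xy[0],
                      target_xy[1] - microphone_xy[1])


# Hypothetical usage: cameras 1 m apart, both seeing the lips at 45 degrees,
# microphone midway between the cameras.
lips = triangulate_2d(1.0, math.radians(45), math.radians(45))
distance = distance_to_microphone(lips, (0.5, 0.0))
```

In this symmetric example the lips are located at about (0.5, 0.5), which is 0.5 m from the assumed microphone position.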

A voice recognition method comprising:

a first step in which a position tracking unit detects a body position of a user and extracts a corresponding coordinate value;

a second step in which a distance calculation unit calculates a distance value to the user from the coordinate value extracted by the position tracking unit;

a third step in which a reference pattern adjusting unit receives the calculated distance value, updates a reference pattern, and stores the adjusted reference pattern in a reference pattern storage unit; and

a fourth step of receiving the user's voice, converting the received voice into a predetermined pattern, comparing it with the stored reference pattern, and outputting a corresponding command,

wherein the fourth step comprises:

step 4-1 of inputting a voice signal through a voice signal input unit when the user's voice is detected;

step 4-2 in which a voice signal conversion unit converts the analog voice signal input through the voice signal input unit into a test pattern that can be compared with the reference pattern; and

step 4-3 of receiving the test pattern of step 4-2, comparing it with the reference pattern stored in the reference pattern storage unit, and outputting a matched word.
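The third step and step 4-3 can be sketched together in Python as follows. The claim does not say how the distance value changes the reference pattern or how patterns are compared, so this sketch makes its own assumptions: reference patterns are amplitude-like feature vectors recorded at an assumed `REFERENCE_DISTANCE_M`, they are rescaled by an inverse-distance gain, and matching uses the nearest Euclidean distance. The names `adjust_reference_patterns` and `recognize` and the example words are hypothetical.

```python
import math

# Assumed distance (metres) at which the stored reference patterns were recorded.
REFERENCE_DISTANCE_M = 0.5


def adjust_reference_patterns(stored, distance_m):
    """Third step (sketch): rescale every stored reference pattern for a
    speaker at distance_m, assuming amplitude falls off as 1/distance."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    gain = REFERENCE_DISTANCE_M / distance_m
    return {word: [v * gain for v in pattern] for word, pattern in stored.items()}


def recognize(test_pattern, references):
    """Step 4-3 (sketch): return the word whose reference pattern is
    closest to the test pattern."""
    return min(references, key=lambda word: math.dist(test_pattern, references[word]))


# Hypothetical usage: the user says "open" from 1.0 m away, so the measured
# pattern is half the amplitude of the pattern recorded at 0.5 m.
stored = {"open": [4.0, 2.0], "close": [2.0, 1.0]}
test_pattern = [2.0, 1.0]
unadjusted = recognize(test_pattern, stored)
adjusted = recognize(test_pattern, adjust_reference_patterns(stored, 1.0))
```

Under these assumptions the unadjusted comparison returns "close", while the comparison against the distance-adjusted patterns returns "open", which illustrates why the reference pattern is updated before matching.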

The voice recognition method of claim 6, wherein, in the first step,

the body position of the user is the lips.

The voice recognition method of claim 6, wherein the second step

further comprises reflecting the calculated distance value when setting a variable value input to a noise cancellation circuit provided in the vehicle.

(Deleted)