KR101889465B1

KR101889465B1 - voice recognition device and lighting device therewith and lighting system therewith

Info

Publication number: KR101889465B1
Application number: KR1020170014946A
Authority: KR
Inventors: 윤형관; 이풍우; 윤태식; 김성진
Original assignee: 인성 엔프라 주식회사; 주식회사 보임
Priority date: 2017-02-02
Filing date: 2017-02-02
Publication date: 2018-08-17
Also published as: KR20180090046A

Abstract

The present invention relates to a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same. The voice recognition unit includes a first voice recognition module, which separates original signals (S1), (S2) and noise signals (N1), (N2) from each of the acoustic signals (H1), (H2) input from two microphones and then sums the original signals (S1), (S2) to detect a primary original signal (X1), and a second voice recognition module, which separates an original signal (S3) and a noise signal (N3) from the acoustic signal (H3) input from another microphone and then sums the separated original signal (S3) with the primary original signal (X1) input from the first voice recognition module to detect a final original signal (X2). This offsets the loss of original signal that accompanies noise removal and so improves the accuracy and reliability of voice recognition. Because the voice recognition modules detect the original signal repeatedly, speech can be detected accurately and precisely. Because the first and second voice recognition modules separate the original and noise signals from the acoustic signals using different signal separation algorithms, the strengths of each algorithm are retained while their weaknesses offset each other, further improving recognition accuracy. Acoustic echo cancellation (AEC), which removes noise feedback from the input signal, is applied to each microphone so that dynamic noise sources can be removed adaptively.

Description

Voice recognition device, lighting fixture equipped with the voice recognition device, and lighting system using the same

The present invention relates to a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same. More specifically, it recognizes speech in the input sound using a plurality of voice recognition modules to which different voice recognition algorithms are applied, which keeps the computation simple while markedly improving the accuracy and reliability of voice recognition.

Speech recognition is defined as a technology that removes noise from an acoustic speech signal acquired through a sound-collecting sensor such as a microphone and then extracts feature parameters to recognize speech. It supports interfaces between humans and devices without physical contact and can be linked with applications such as security, home networks, lighting, robots, and navigation. As a result, its fields of use and the demand for it are steadily growing.

In recent years in particular, electronic devices have advanced, devices are increasingly networked with one another, and Internet of Things (IoT) services have become widespread. In the IoT, the most convenient means for a device to recognize human intent without physical contact is a human biosignal, and speech is the most basic of these biosignals, so voice recognition technology is being actively researched.

In voice recognition, the technique of improving speech quality by removing noise components from sound that contains various kinds of background noise is called speech enhancement. Speech enhancement methods fall broadly into two classes: 1) methods that use a single-channel speech signal, such as spectral subtraction and MMSE-STSA, and 2) methods that use microphone-array-based multichannel speech signals, such as GSC (Generalized Sidelobe Canceller) and DSB (Delay-Sum Beamforming).
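As an illustration of the multichannel class mentioned above, the following is a minimal sketch of delay-and-sum beamforming. It is not taken from the patent: the function name `delay_and_sum` and the assumption that per-channel arrival delays are known integer sample counts are both illustrative.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples (assumed known here).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    length = len(channels[0])
    output = []
    for n in range(length):
        total = 0.0
        for samples, delay in zip(channels, delays):
            idx = n + delay  # read ahead in late channels so wavefronts line up
            if 0 <= idx < length:
                total += samples[idx]
        output.append(total / len(channels))
    return output


# A pulse that reaches microphone 2 one sample later than microphone 1.
mic1 = [0.0, 1.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 1.0, 0.0]
aligned = delay_and_sum([mic1, mic2], delays=[0, 1])
```

After alignment the pulse adds coherently, while uncorrelated noise on the two channels would be averaged down.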

Voice recognition based on multichannel speech signals exploits differences between the microphone input signals, which cannot be obtained from a single microphone signal, to adaptively remove static and dynamic noise at the same time. It is therefore well suited to real-life environments containing many kinds of noise, such as TV or radio sound and human conversation.

FIG. 1 is a block diagram of the per-target-source signal separation device disclosed in Korean Registered Patent No. 10-0486736 (title of invention: Method and apparatus for separating signals by target source using two sensors).

The per-target-source signal separation device (100) of FIG. 1 (hereinafter, the prior art) consists of a signal absence probability calculation unit (111), a target source identification unit (113), a signal estimation unit (115), and a per-target-source signal separation unit (117).

The signal absence probability calculation unit (111) calculates the global signal absence probability for the m-th frame of one of the first and second microphone reception signals, which are received from a pair of microphones and transformed into the frequency domain.

The signal estimation unit (115) uses the global signal absence probability calculated by the signal absence probability calculation unit (111) to estimate a spectrum vector from which the noise signal has been removed in each frequency band.

The target source identification unit (113) takes as input the per-frame local signal absence probability of the first or second microphone reception signal calculated by the signal absence probability calculation unit (111), compares it with a predefined first threshold, and determines from the comparison whether a target source signal is present in the corresponding frequency band of each frame.

If it determines that a target source signal is present, the target source identification unit (113) also generates an amplitude attenuation and a delay time for the signal value of that frequency band, and uses a mixing parameter consisting of the amplitude attenuation and the delay time to determine the number of target sources and the frequency bands belonging to each target source.

The per-target-source signal separation unit (117) multiplies the per-target-source label vector obtained by the target source identification unit (113) by the spectrum vector estimated for each frequency band by the signal estimation unit (115), thereby separating the signal of each target source.

The prior art (100) configured in this way can remove noise in real time from the mixed signal received from two microphones while separating the signals of an arbitrary number N of target sources.

However, when the signal estimation unit (115) of the prior art (100) estimates the spectrum vector, it removes the noise signal in each frequency band according to the global signal absence probability calculated by the signal absence probability calculation unit (111). With this configuration, part of the target source signal is removed together with the noise, which lowers the voice recognition rate.

For example, even if the prior art (100) were assumed to use M microphones, where M is three or more, the spectrum vector for at least one of the first through M-th microphone reception signals would still be estimated by removing the noise signal in each frequency band according to the global signal absence probability. The target source signal is therefore removed along with the noise, and the voice recognition rate drops.

In other words, the prior art (100) estimates the target source signals by removing from the received signal whatever is judged to be noise, generates per-target-source label vectors, and multiplies by each generated label vector to obtain the signal separated for each target source. Even with three or more microphones, it therefore has the structural limitation that target source signal is removed together with the noise.

Moreover, the prior art (100) has only one module for separating per-target-source signals from the received signal (comprising the signal absence probability calculation unit, the target source identification unit, the signal estimation unit, and the per-target-source signal separation unit). The separation computation is therefore performed only once, which lowers the voice recognition rate.

The present invention is intended to solve these problems. One object of the invention is to provide a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same, in which the voice recognition unit includes a first voice recognition module that separates original signals (S1), (S2) and noise signals (N1), (N2) from each of the acoustic signals (H1), (H2) input from two microphones and then sums the original signals (S1), (S2) to detect a primary original signal (X1), and a second voice recognition module that separates an original signal (S3) and a noise signal (N3) from the acoustic signal (H3) input from another microphone and then sums the separated original signal (S3) with the primary original signal (X1) input from the first voice recognition module to detect a final original signal (X2), thereby offsetting the loss of original signal that accompanies noise removal and improving the accuracy and reliability of voice recognition.

Another object of the invention is to provide a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same, in which the voice recognition modules detect the original signal repeatedly so that speech can be detected accurately and precisely.

Yet another object of the invention is to provide a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same, in which the first and second voice recognition modules separate the original and noise signals from the acoustic signals using different signal separation algorithms, so that the strengths of each algorithm are retained while their weaknesses offset each other, further improving the accuracy of voice recognition.

Yet another object of the invention is to provide a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same, in which acoustic echo cancellation (AEC) for removing noise feedback from the input signal is applied to each microphone so that dynamic noise sources can be removed adaptively.

Yet another object of the invention is to provide a voice recognition device, a lighting fixture equipped with the voice recognition device, and a lighting system using the same, in which a higher voice recognition rate allows the occurrence of an emergency situation to be determined accurately and, when an emergency occurs, the lights are lit in the direction of the emergency exit, so that the emergency is handled quickly and casualties are prevented.

The solution of the present invention for the above problems comprises: first, second, and third microphones that collect acoustic signals and convert them into electrical signals; a reference model database unit in which preset reference models are stored; an acoustic signal input unit that receives the acoustic signals acquired by the microphones; a voice recognition unit that analyzes the acoustic signals received by the acoustic signal input unit to detect an original signal (X2); a feature parameter generation unit that extracts a feature vector from the original signal (X2) detected by the voice recognition unit and generates feature parameters from the extracted feature vector; a comparison and matching unit that uses a preset comparison algorithm to analyze the reference models stored in the reference model database unit against the feature parameters generated by the feature parameter generation unit and detects the reference model most similar to the feature parameters; and a word determination unit that searches for a word using the text corresponding to the reference model detected by the comparison and matching unit as the search term and outputs the retrieved word as the final voice recognition result. The voice recognition unit further includes: a first voice recognition module that uses a preset first signal separation algorithm to separate the acoustic signal of the first microphone into an original signal (S1) and a noise signal (N1), uses the same first signal separation algorithm to separate the acoustic signal of the second microphone into an original signal (S2) and a noise signal (N2), and sums the separated original signals (S1), (S2) to detect a primary original signal (X1); and a second voice recognition module that uses a preset second signal separation algorithm to separate the acoustic signal of the third microphone into an original signal (S3) and a noise signal (N3) and then adds the separated original signal to the primary original signal (X1) detected by the first voice recognition module to detect the final original signal (X2). The first and second signal separation algorithms separate the original signal (S) and the noise signal (N) from the acoustic signal in different ways.
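The two-stage summation described above (X1 = S1 + S2, then X2 = S3 + X1) can be sketched as follows. The passage does not name the two signal separation algorithms, so `moving_average_split` and `median_split` are placeholder assumptions chosen only because they differ from each other; the function names `first_module` and `second_module` are likewise hypothetical.

```python
def moving_average_split(h, width=3):
    """Placeholder first algorithm: smoothed part is S, residual is N."""
    half = width // 2
    s = []
    for n in range(len(h)):
        window = h[max(0, n - half): n + half + 1]
        s.append(sum(window) / len(window))
    noise = [x - y for x, y in zip(h, s)]
    return s, noise


def median_split(h, width=3):
    """Placeholder second algorithm: running median is S, residual is N."""
    half = width // 2
    s = []
    for n in range(len(h)):
        window = sorted(h[max(0, n - half): n + half + 1])
        s.append(window[len(window) // 2])
    noise = [x - y for x, y in zip(h, s)]
    return s, noise


def first_module(h1, h2):
    """Separate H1 and H2 with the first algorithm, then X1 = S1 + S2."""
    s1, _n1 = moving_average_split(h1)
    s2, _n2 = moving_average_split(h2)
    return [a + b for a, b in zip(s1, s2)]


def second_module(h3, x1):
    """Separate H3 with the second algorithm, then X2 = S3 + X1."""
    s3, _n3 = median_split(h3)
    return [a + b for a, b in zip(s3, x1)]


h1 = [1.0] * 5
h2 = [1.0] * 5
h3 = [1.0, 1.0, 9.0, 1.0, 1.0]  # third channel carries an impulsive disturbance
x1 = first_module(h1, h2)
x2 = second_module(h3, x1)
```

In this toy input the median-based placeholder discards the impulse on the third channel before the sum, which illustrates why using a different algorithm in the second stage can compensate for a weakness of the first.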

(Deleted)

In the present invention, a beamforming technique is preferably applied to the first, second, and third microphones, with the first microphone forming a beam straight ahead and the second and third microphones forming beams to the left and right, and acoustic echo cancellation (AEC) is preferably applied to the first, second, and third microphones to remove dynamic noise sources from the input acoustic signals.
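The passage does not say which adaptive filter the AEC uses. As one common possibility, the sketch below uses a normalized LMS (NLMS) filter; that choice, the function name `nlms_echo_cancel`, the tap count, the step size, and the simulated echo path are all assumptions made for illustration.

```python
import random


def nlms_echo_cancel(mic, reference, taps=4, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `reference` from `mic`."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for d, x in zip(mic, reference):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate  # what remains after removing the echo estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
# Simulated echo path (demo assumption): 0.5 * x[n] + 0.3 * x[n-1]
mic = [0.5 * reference[n] + (0.3 * reference[n - 1] if n > 0 else 0.0)
       for n in range(2000)]
residual, weights = nlms_echo_cancel(mic, reference)
```

Because the filter keeps adapting, it can follow a noise source whose path to the microphone changes over time, which matches the adaptive removal of dynamic noise described in the text.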

In the present invention, when the final original signal (X2) is detected, the voice recognition unit preferably converts it into syllables formed by combining a preset vowel with the initial consonant of the detected original signal (X2) and deleting the final consonant. The comparison and matching unit uses a dynamic time warping (DTW) algorithm, which nonlinearly aligns the input feature parameters with the reference models to compensate for differences in speaking rate and length between the input speech and the reference speech, to calculate the squared Euclidean distance between the feature parameters and each reference model, and recognizes the reference model with the smallest distance as the one most similar to the feature parameters. When exactly one reference model has a similarity to the feature parameters within a preset threshold, the comparison and matching unit determines that reference model to be the input speech. When two or more reference models have a similarity within the preset threshold, it splits the speech signal into phoneme units and uses a pattern comparison algorithm based on a hidden Markov model (HMM) to determine the phonemes with the highest similarity as the input speech.
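The DTW matching step can be sketched as below, using the squared Euclidean distance as the local cost as the text describes. The function names, the toy one-dimensional feature frames, and the reference labels are illustrative assumptions; the HMM fallback for ambiguous cases is not implemented here.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two feature frames."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dtw_distance(seq_a, seq_b):
    """Accumulated squared Euclidean cost along the best warping path."""
    inf = float("inf")
    rows, cols = len(seq_a), len(seq_b)
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            local = squared_euclidean(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # stretch seq_b
                                     cost[i][j - 1],      # stretch seq_a
                                     cost[i - 1][j - 1])  # step both
    return cost[rows][cols]


def best_reference(features, references):
    """Return (label, distance) of the reference with the smallest distance."""
    scored = ((label, dtw_distance(features, ref))
              for label, ref in references.items())
    return min(scored, key=lambda pair: pair[1])


def candidates_within(features, references, threshold):
    """Labels whose DTW distance is within the threshold."""
    return sorted(label for label, ref in references.items()
                  if dtw_distance(features, ref) <= threshold)


references = {"on": [[0.0], [1.0], [2.0]], "off": [[2.0], [1.0], [0.0]]}
spoken = [[0.0], [0.0], [1.0], [2.0]]  # a slower utterance of "on"
label, distance = best_reference(spoken, references)
close = candidates_within(spoken, references, threshold=1.0)
```

If `candidates_within` returned two or more labels, the text calls for a phoneme-level HMM comparison to break the tie; with a single candidate, that label is taken as the recognized input.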

(Deleted)

Another solution of the present invention is a lighting system including lighting fixtures, each consisting of a light source element, a housing, a microphone installed on the outside of the housing, and a control unit installed inside the housing. The control unit includes a lighting on/off management unit that controls the light source element, and a memory that stores preset 'lighting-control comparison terms', defined as text from which a spoken word can be judged to be a lighting command, together with a matching table that maps each of the 'lighting-control comparison terms' to actual lighting control data. The control unit analyzes the acoustic signal input from the microphone to determine the spoken word and, if the determined word is one of the 'lighting-control comparison terms', searches the matching table to retrieve the control data, after which the lighting on/off management unit controls the lighting according to the retrieved control data. The microphone consists of first, second, and third microphones installed at intervals from one another. The control unit further includes: a reference model database unit in which preset reference models are stored; an acoustic signal input unit that receives the acoustic signals acquired by the first, second, and third microphones; a voice recognition unit that analyzes the acoustic signals received by the acoustic signal input unit to detect an original signal (X2); a feature parameter generation unit that extracts a feature vector from the original signal (X2) detected by the voice recognition unit and generates feature parameters from the extracted feature vector; a comparison and matching unit that uses a preset comparison algorithm to analyze the reference models stored in the reference model database unit against the feature parameters generated by the feature parameter generation unit and detects the reference model most similar to the feature parameters; a word determination unit that searches for a word using the text corresponding to the reference model detected by the comparison and matching unit as the search term and outputs the retrieved word as the final voice recognition result; and a determination unit that detects the degree of association between the spoken word determined by the word determination unit and each of the 'lighting-control comparison terms' stored in the memory and, if the detected association exceeds a threshold, searches the matching table to retrieve the control data corresponding to that comparison term and inputs it to the lighting on/off management unit. The voice recognition unit further includes: a first voice recognition module that uses a preset first signal separation algorithm to separate the acoustic signal of the first microphone into an original signal (S1) and a noise signal (N1), uses the same first signal separation algorithm to separate the acoustic signal of the second microphone into an original signal (S2) and a noise signal (N2), and sums the separated original signals (S1), (S2) to detect a primary original signal (X1); and a second voice recognition module that uses a preset second signal separation algorithm to separate the acoustic signal of the third microphone into an original signal (S3) and a noise signal (N3) and then adds the separated original signal to the primary original signal (X1) detected by the first voice recognition module to detect the final original signal (X2). The first and second signal separation algorithms separate the original signal (S) and the noise signal (N) from the acoustic signal in different ways.
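The determination unit's lookup can be sketched as a thresholded search over the matching table. The passage does not define how the degree of association is computed, so this sketch assumes a string similarity ratio from Python's standard `difflib`; the table contents, the threshold value, and the name `find_control_data` are hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical matching table: comparison term -> lighting control data.
MATCHING_TABLE = {
    "light on": {"power": True},
    "light off": {"power": False},
    "dim": {"brightness": 30},
}


def find_control_data(word, table=MATCHING_TABLE, threshold=0.8):
    """Return control data for the closest comparison term, or None.

    The association score is a string similarity ratio in [0, 1]
    (an assumption; the source does not specify the measure).
    """
    best_key, best_score = None, 0.0
    for key in table:
        score = SequenceMatcher(None, word, key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_key is not None and best_score > threshold:
        return table[best_key]
    return None


command = find_control_data("light on")
ignored = find_control_data("hello")
```

Words that do not resemble any comparison term fall below the threshold and yield `None`, so ordinary conversation near the fixture would not change the lighting.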

(Deleted)

In the present invention, when the final original signal (X2) is detected, the voice recognition unit preferably converts it into syllables formed by combining a preset vowel with the initial consonant of the detected original signal (X2) and deleting the final consonant. The comparison and matching unit uses a dynamic time warping (DTW) algorithm, which nonlinearly aligns the input feature parameters with the reference models to compensate for differences in speaking rate and length between the input speech and the reference speech, to calculate the squared Euclidean distance between the feature parameters and each reference model, and recognizes the reference model with the smallest distance as the one most similar to the feature parameters. When exactly one reference model has a similarity to the feature parameters within a preset threshold, the comparison and matching unit determines that reference model to be the input speech. When two or more reference models have a similarity within the preset threshold, it splits the speech signal into phoneme units and uses a pattern comparison algorithm based on a hidden Markov model (HMM) to determine the phonemes with the highest similarity as the input speech.

In the present invention, the memory preferably also stores 'emergency-related comparison terms' from which an emergency situation can be inferred. The determination unit detects the degree of association between the spoken word determined by the word determination unit and each of the 'emergency-related comparison terms' stored in the memory and, if the detected association exceeds a threshold, determines that an emergency has occurred. When the determination unit determines that an emergency has occurred, the lighting on/off management unit switches the light source element on and off in a preset pattern, so that the lighting fixture serves as ordinary lighting in normal times and as an emergency light when an emergency occurs.

In the present invention, the lighting system preferably further includes a controller that manages and controls the lighting fixtures. When the determination unit determines that an emergency has occurred, the control unit sends emergency confirmation data to the controller. The controller stores the position information of each lighting fixture and a preset emergency evacuation route. On receiving emergency confirmation data from the control unit of any lighting fixture, it uses the position information and the evacuation route to sort the lighting fixtures in the order corresponding to the evacuation route and makes them blink in that order.
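The controller's ordering step can be sketched as a sort of fixtures by their position along the stored evacuation route. The data layout (fixture ID mapped to a named location, route as an ordered list of locations) and the function name `blink_order` are assumptions for illustration; the source does not specify how positions or routes are represented.

```python
def blink_order(fixture_positions, evacuation_route):
    """Return fixture IDs sorted by where their position falls on the route.

    Fixtures whose position is not on the route are left out.
    """
    route_index = {waypoint: i for i, waypoint in enumerate(evacuation_route)}
    on_route = [fid for fid, pos in fixture_positions.items()
                if pos in route_index]
    return sorted(on_route, key=lambda fid: route_index[fixture_positions[fid]])


fixtures = {"L1": "hall", "L2": "exit", "L3": "room", "L4": "storage"}
route = ["room", "hall", "exit"]  # preset path toward the emergency exit
order = blink_order(fixtures, route)
```

Blinking the fixtures in this order produces a light sequence that runs toward the exit, which is the guidance effect the text describes.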

According to the present invention, which has the above objects and solutions, the voice recognition unit includes a first voice recognition module that separates original signals (S1), (S2) and noise signals (N1), (N2) from each of the acoustic signals (H1), (H2) input from two microphones and then sums the original signals (S1), (S2) to detect a primary original signal (X1), and a second voice recognition module that separates an original signal (S3) and a noise signal (N3) from the acoustic signal (H3) input from another microphone and then sums the separated original signal (S3) with the primary original signal (X1) input from the first voice recognition module to detect a final original signal (X2). This offsets the loss of original signal that accompanies noise removal and improves the accuracy and reliability of voice recognition.

In addition, according to the present invention, the voice recognition modules detect the original signal repeatedly, so speech can be detected accurately and precisely.

또한 본 발명에 의하면 제1 음성인식모듈 및 제2 음성인식모듈이 서로 다른 신호분리 알고리즘을 이용하여 음향신호로부터 원신호 및 잡음신호를 분리하도록 구성됨으로써 각 신호분리 알고리즘의 장점은 부각시키되, 단점을 서로 상쇄시켜 음성인식의 정확성을 더욱 높일 수 있다.According to the present invention, since the first speech recognition module and the second speech recognition module separate the original signal and the noise signal from the acoustic signal using different signal separation algorithms, the advantages of each signal separation algorithm are highlighted, So that the accuracy of speech recognition can be further enhanced.

또한 본 발명에 의하면 각 마이크로폰이 입력신호로부터 잡음회귀현상을 제거하기 위한 음향반향삭제(AEC, Acoustic Echo Cancellation)가 적용됨으로써 동적 잡음원을 적응적으로 제거할 수 있다.According to the present invention, a dynamic noise source can be adaptively removed by applying acoustic echo cancellation (AEC) for eliminating noise recurrence from an input signal of each microphone.

또한 본 발명에 의하면 음성인식율을 높여 돌발 상황에 대한 발생여부를 정확하게 판단함과 동시에 돌발 상황 발생 시 비상구를 향하는 방향으로 점등이 이루어지도록 구성됨으로써 돌발 상황에 대한 신속한 대처가 이루어져 인명사고를 미연에 방지할 수 있게 된다.In addition, according to the present invention, it is possible to accurately determine whether an unexpected situation occurs by raising the voice recognition rate, and to illuminate in the direction toward the exit when an unexpected situation occurs, thereby promptly coping with the unexpected situation, .

FIG. 1 is a block diagram showing the per-source signal separation apparatus disclosed in Korean Patent No. 10-0486736 (title of invention: Method and apparatus for separating signals by target source using two sensors).
FIG. 2 is a block diagram showing a speech recognition apparatus according to an embodiment of the present invention.
FIG. 3 is an exemplary view for explaining the preprocessing technique applied to the microphone of FIG. 2.
FIG. 4 is a block diagram showing the speech recognition unit of FIG. 2.
FIG. 5 is an exemplary view for explaining the first signal separation algorithm applied to the first speech recognition module of FIG. 4.
FIG. 6 is a flowchart for explaining the operating process of FIG. 2.
FIG. 7 is a block diagram for explaining a lighting system to which the speech recognition apparatus of FIG. 2 is applied.
FIG. 8 is a block diagram showing the control unit installed in the lamp of FIG. 7.
FIG. 9 is an exemplary view of FIG. 7.
FIG. 10 is a block diagram showing the controller of FIG. 7.
FIG. 11 is an exemplary view showing the blinking cycle determined by the lamp control unit of FIG. 10.

Hereinafter, an embodiment of the present invention is described with reference to the accompanying drawings.

FIG. 2 is a block diagram showing a speech recognition apparatus according to an embodiment of the present invention.

When the speech recognition apparatus (1) according to an embodiment of the present invention receives acoustic signals from a plurality of microphones (11-1), (11-2), (11-3), the first speech recognition module (51) of FIG. 4, described later, applies a first signal separation algorithm to the input signals (H1), (H2) of the first and second microphones (11-1), (11-2). It separates the original signals (S1), (S2) and the noise signals (N1), (N2) from the respective input signals (H1), (H2), and then sums the original signals (S1), (S2) to detect the primary original signal (X1 = S1 + S2).

The second speech recognition module (52) of FIG. 4, described later, uses a second signal separation algorithm to detect the original signal (S3) and the noise signal (N3) from the input signal (H3) of the third microphone (11-3). It then sums the detected original signal (S3) with the primary original signal (X1) input from the first speech recognition module (51) to detect the final original signal (X2 = S3 + X1).
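The two-stage summation X1 = S1 + S2 and X2 = S3 + X1 can be sketched as follows. The function `split_placeholder` is a hypothetical stand-in that subtracts an already known noise estimate; it is used only to show the data flow between the two modules and does not represent either of the signal separation algorithms described later.

```python
import numpy as np


def split_placeholder(h, noise_estimate):
    """Hypothetical separator: returns (original, noise) from a known noise estimate."""
    return h - noise_estimate, noise_estimate


def first_module(h1, h2, n1_est, n2_est):
    """First module: separate S1 and S2, then return X1 = S1 + S2."""
    s1, _ = split_placeholder(h1, n1_est)
    s2, _ = split_placeholder(h2, n2_est)
    return s1 + s2


def second_module(h3, n3_est, x1):
    """Second module: separate S3, then return X2 = S3 + X1."""
    s3, _ = split_placeholder(h3, n3_est)
    return s3 + x1


t = np.arange(8)
s = np.sin(0.3 * t)                      # toy original signal seen by all microphones
n1, n2, n3 = 0.1 * np.ones(8), -0.2 * np.ones(8), 0.05 * t
x1 = first_module(s + n1, s + n2, n1, n2)
x2 = second_module(s + n3, n3, x1)
```

With ideal separation, as in this toy case, each stage adds one more copy of the original signal, so any part of the signal lost by one separator can be compensated by the others.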

That is, the input signals that the microphones receive for the same acoustic signal differ in the frequency magnitudes of the original signal and the noise, depending on the direction and distance to the sound source. The speech recognition apparatus (1) of the present invention therefore uses two speech recognition modules to which different signal separation algorithms are applied, which offsets the loss of the original signal that accompanies noise removal for each input signal. At the same time, the original signal is detected repeatedly, so that speech can be detected accurately and precisely, and the strengths of each signal separation algorithm are brought out while their weaknesses offset one another, markedly improving the accuracy of speech recognition.

The first speech recognition module (51) and the second speech recognition module (52) may be configured to separate the original signal and the noise signal from the acoustic signal using the same signal separation algorithm. Applying different signal separation algorithms is more effective for speech recognition, however, because the weaknesses of each algorithm can offset one another.

For convenience of description, the present invention is described with three microphones and two speech recognition modules as an example, but the apparatus may of course be configured with four or more microphones and three or more speech recognition modules.

As shown in FIG. 2, the speech recognition apparatus (1) comprises an acoustic signal input unit (3), a speech recognition unit (5), a feature parameter detection unit (6), a comparison and matching unit (7), a reference model database unit (8), and a word determination unit (9).

The acoustic signal input unit (3) receives the acoustic signals input from the three microphones (11-1), (11-2), (11-3). The microphones (11-1), (11-2), (11-3) are installed apart from one another so as to have different sound input angles, and each receives an acoustic signal. Each acoustic signal contains an original signal and a noise signal (noise).

The microphones (11-1), (11-2), (11-3) are installed in an array and convert the received acoustic signals into electrical signals.

Beamforming is applied to the microphones (11-1), (11-2), (11-3): the first microphone (11-1) forms a beam in the forward direction, and the second microphone (11-2) and the third microphone (11-3) are installed to form beams to the left and right, symmetrically about the beam of the first microphone (11-1).

Beamforming is a signal processing technique mainly used to control the direction or sensitivity of a radiation pattern by means of an array of transmitting or receiving devices. When a signal is transmitted, the signal strength in the intended direction is increased and the strength of signals sent in other directions is reduced.

FIG. 3 is an exemplary view for explaining the preprocessing technique applied to the microphone of FIG. 2.

As shown in FIG. 3, acoustic echo cancellation (AEC), which can remove echoed noise from the input signal of the microphone (11), is applied to the microphone (11) of the present invention. Dynamic noise sources are thereby removed adaptively, so that only the input signal from the microphone (11) is extracted.

The AEC technique applies a variable-learning-rate noise removal algorithm based on NLMS (Normalized Least Mean Square). It removes dynamic noise sources such as loudspeakers while keeping the input signal from the microphone (11) in a natural state, and thus performs preprocessing on the input signal.
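An NLMS-based echo canceller of this kind can be sketched as follows. This is a minimal illustration under several assumptions: the loudspeaker (reference) signal is available, the step size `mu` is fixed rather than variable as in the description, and the filter length, step size, and toy echo path are values chosen only for the example.

```python
import numpy as np


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively estimated echo of `ref` from `mic` using NLMS."""
    w = np.zeros(taps)      # adaptive estimate of the echo path
    buf = np.zeros(taps)    # most recent reference samples, newest first
    out = np.zeros(len(mic))
    for n in range(len(mic)):
        buf = np.roll(buf, 1)
        buf[0] = ref[n]
        echo_est = w @ buf
        e = mic[n] - echo_est                   # echo-reduced output sample
        w += mu * e * buf / (buf @ buf + eps)   # normalized LMS update
        out[n] = e
    return out, w


rng = np.random.default_rng(0)
ref = rng.standard_normal(4000)                 # signal played by the loudspeaker
echo_path = np.array([0.6, -0.3, 0.1])          # assumed toy echo path
mic = np.convolve(ref, echo_path)[: len(ref)]   # microphone picks up only the echo
residual, w = nlms_echo_cancel(mic, ref)
```

In this toy case the microphone contains no near-end speech, so the residual decays toward zero as the filter converges; with speech present, the residual would be the speech with the echo removed.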

That is, the acoustic signal input unit (3) of the present invention receives acoustic signals from the microphones (11-1), (11-2), (11-3), and because AEC is applied to each of the microphones, it receives preprocessed acoustic signals, which improves the accuracy of speech recognition.

FIG. 4 is a block diagram showing the speech recognition unit of FIG. 2, and FIG. 5 is an exemplary view for explaining the first signal separation algorithm applied to the first speech recognition module of FIG. 4.

As shown in FIG. 4, the speech recognition unit (5) comprises a first speech recognition module (51) and a second speech recognition module (52). The first speech recognition module (51) uses a preset first signal separation algorithm to separate the original signals (S1), (S2) and the noise signals (N1), (N2) from the input signals (H1), (H2) of the first and second microphones (11-1), (11-2) received through the acoustic signal input unit (3), and then sums the original signals (S1), (S2) to detect the primary original signal (X1). The second speech recognition module (52) uses a preset second signal separation algorithm to analyze the input signal of the third microphone (11-3) received through the acoustic signal input unit (3), separates the original signal (S3) and the noise signal (N3), and then sums the separated original signal (S3) with the primary original signal (X1) input from the first speech recognition module (51) to detect the final original signal (X2).

The first speech recognition module (51) receives the acoustic signals (H1), (H2) of the first and second microphones (11-1), (11-2) from the acoustic signal input unit (3).

Using the preset first signal separation algorithm, the first speech recognition module (51) separates the input acoustic signal (H1) into the original signal (S1) and the noise signal (N1), and separates the input acoustic signal (H2) into the original signal (S2) and the noise signal (N2).

Once the signals have been separated from the acoustic signals, the first speech recognition module (51) sums the separated original signals (S1), (S2) to detect the primary original signal (X1).

Using the preset second signal separation algorithm, the second speech recognition module (52) separates the original signal (S3) and the noise signal (N3) from the acoustic signal (H3) of the third microphone (11-3) received through the acoustic signal input unit (3).

The second speech recognition module (52) then sums the separated original signal (S3) with the primary original signal (X1) input from the first speech recognition module (51) to detect the final original signal (X2).

In other words, in the present invention the first speech recognition module (51) and the second speech recognition module (52) are configured to separate the original signal and the noise signal from the acoustic signal using different signal separation algorithms. The first speech recognition module (51) sums the original signals of the first and second microphones (11-1), (11-2) to detect the primary original signal (X1), and the second speech recognition module (52) sums its separated original signal (S3) with the primary original signal (X1) detected by the first speech recognition module (51) to detect the final original signal (X2). Through the two speech recognition modules (51), (52), to which different signal separation algorithms are applied, the loss of the original signal that accompanies noise removal for each input signal is offset. In addition, the original signal is detected repeatedly, so that speech can be detected accurately and precisely, and the strengths of the signal separation algorithm applied to each module are brought out while their weaknesses offset one another, markedly improving the accuracy of speech recognition.

The first signal separation algorithm applied to the first speech recognition module (51) and the second signal separation algorithm applied to the second speech recognition module (52) are configured to use different methods of computation.

For example, as shown in FIG. 5, the first signal separation algorithm applied to the first speech recognition module (51) may transform the acoustic signals (H1), (H2) input from the first and second microphones (11-1), (11-2) into the frequency domain using the short-time Fourier transform (STFT), and then apply multichannel blind source separation (BSS) using an IE soft-mask algorithm and an IVA algorithm.

A single-channel source separation method may be applied as the second signal separation algorithm of the second speech recognition module (52).

For example, the second signal separation algorithm first transforms the acoustic signal (H3) input from the third microphone (11-3) into the frequency domain (STFT) and then separates the signals through ICA (Independent Component Analysis).

When ICA is applied as the second signal separation algorithm, a linear transformation is first performed to reduce the dimension of the acoustic signal (H3) to the dimension of the sound sources. The linearly transformed signal is then multiplied by a unitary matrix (B) to obtain the frequency-domain values of the separated signals, and the separated signals are detected through the separation matrix (V*B) obtained in this way.
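The two matrix steps named above, a dimension-reducing linear transformation followed by multiplication by a unitary matrix, can be sketched as follows. Several points are assumptions made for the example: the linear transformation is taken to be a PCA whitening matrix V, the observation is a real-valued three-row mixture of two sources (the description does not state how a multi-dimensional observation is formed from the single-channel signal), and B is a fixed rotation chosen by hand rather than a matrix estimated by ICA.

```python
import numpy as np


def whitening_matrix(x, n_sources):
    """Return V such that z = V @ x_centered has identity covariance and n_sources rows."""
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    top = np.argsort(eigval)[::-1][:n_sources]   # keep the strongest components
    return np.diag(eigval[top] ** -0.5) @ eigvec[:, top].T


rng = np.random.default_rng(1)
sources = rng.laplace(size=(2, 5000))            # toy non-Gaussian sources
mixing = rng.standard_normal((3, 2))
x = mixing @ sources                             # 3 observations of 2 sources

V = whitening_matrix(x, n_sources=2)
z = V @ (x - x.mean(axis=1, keepdims=True))      # reduced to the source dimension

theta = 0.7                                      # arbitrary angle, not an ICA estimate
B = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
y = B @ z                                        # candidate separated signals
```

Because B is unitary, the output stays uncorrelated with unit variance for any choice of B; an actual ICA procedure would select the particular B that makes the rows of `y` as statistically independent as possible.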

That is, assuming that multichannel blind source separation (BSS) including IE soft-mask and IVA is applied as the first signal separation algorithm, the first signal separation algorithm has the drawback that, when the reverberation time is long, a residual cross-talk component remains in each channel even after separation, which degrades separation performance. Assuming that ICA is applied as the second signal separation algorithm, the second signal separation algorithm has the drawback that the frequency bins are not independent of one another, which makes it vulnerable to static noise.

In the present invention, however, 1) the first speech recognition module (51) separates the original signals (S1), (S2) using the first signal separation algorithm, 2) the second speech recognition module (52) separates the original signal (S3) using the second signal separation algorithm, and 3) the final original signal (X2) is detected by summing the primary original signal (X1) from the first speech recognition module (51) and the original signal (S3) from the second speech recognition module (52). The drawbacks of the first and second signal separation algorithms thereby offset each other. In addition, the original signal is detected repeatedly, so that speech can be detected accurately and precisely, and the strengths of the signal separation algorithm applied to each module are brought out while their weaknesses offset one another, markedly improving the accuracy of speech recognition.

When the final original signal (X2) is detected by the second speech recognition module (52), the speech recognition unit (5) converts it into syllables in which each initial consonant of the detected original signal is combined with a preset vowel and the final consonant is removed.

For example, when '홍길동' (Hong Gil-dong) is detected by the second speech recognition module (52), the speech recognition unit (5) converts the detected speech into syllables such as '하가다' (ha-ga-da).
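The mapping from '홍길동' to '하가다' can be illustrated on text using the arithmetic layout of precomposed Hangul syllables in Unicode. This is only an illustration of the mapping: the apparatus performs the conversion on speech, not on text, and the function name and the choice of the vowel 'ㅏ' (medial index 0) as the preset vowel are assumptions taken from the example above.

```python
HANGUL_BASE = 0xAC00                 # code point of the first precomposed syllable '가'
SYLLABLES_PER_INITIAL = 21 * 28      # 21 medial vowels x 28 final-consonant slots
INITIAL_COUNT = 19                   # number of initial consonants


def to_initial_syllables(text, vowel_index=0):
    """Keep each syllable's initial consonant, use one fixed vowel, drop the final."""
    out = []
    for ch in text:
        code = ord(ch) - HANGUL_BASE
        if 0 <= code < INITIAL_COUNT * SYLLABLES_PER_INITIAL:
            initial = code // SYLLABLES_PER_INITIAL
            out.append(chr(HANGUL_BASE
                           + initial * SYLLABLES_PER_INITIAL
                           + vowel_index * 28))
        else:
            out.append(ch)           # leave non-Hangul characters unchanged
    return "".join(out)


converted = to_initial_syllables("홍길동")
```

Since only the 19 initial consonants remain distinguishable after this mapping, the number of syllable-level reference models that must be stored and compared is correspondingly small.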

For convenience of description, the present invention is described with three microphones and two speech recognition modules as an example. When there are four or more microphones and three or more speech recognition modules, the first speech recognition module detects the primary original signal in the same manner as in FIG. 4, and the n-th speech recognition module detects an original signal using the acoustic signal input from the (n+1)-th microphone and the original signal input from the (n-1)-th speech recognition module.

The feature parameter detection unit (6) analyzes the original signal detected by the speech recognition unit (5) and extracts the feature vectors needed for recognition.

The feature parameter detection unit (6) extracts the feature vectors from the input speech signal by linear predictive coding (LPC).
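Extraction of LPC coefficients from one frame can be sketched with the autocorrelation method, as follows. The frame length, the prediction order, and the direct solution of the normal equations are assumptions made for the example; the description does not specify how the LPC analysis is carried out.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC: solve R a = r for the predictor coefficients a."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation at lags 0..order
    r = np.array([frame[: n - k] @ frame[k:] for k in range(order + 1)])
    # Toeplitz autocorrelation matrix R[i, j] = r[|i - j|]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])


# Toy frame in which each sample is 0.9 times the previous one
frame = 0.9 ** np.arange(200)
a = lpc_coefficients(frame, order=1)
```

For this toy frame the single predictor coefficient comes out very close to 0.9, since each sample is predicted by 0.9 times its predecessor. For real speech, a higher order would be applied to short windowed frames, and the coefficient vector of each frame would serve as one feature vector.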

The feature parameter detection unit (6) uses the extracted feature vectors to generate a feature parameter for the input speech signal. The feature parameter is data obtained by processing the speech signal so that a comparison algorithm can be run against the reference models.

The feature parameter detected by the feature parameter detection unit (6) is input to the comparison and matching unit (7).

The comparison and matching unit (7) uses a preset comparison algorithm to analyze the input feature parameter against the preset reference models stored in the reference model database unit (8), and outputs information on the reference model with the highest similarity to the feature parameter as the speech recognition result.

In other words, the comparison and matching unit (7) generates the feature parameter input from the feature parameter detection unit (6) and the preset reference models in syllable units, and compares and analyzes them.

The reference model database unit (8) stores the preset reference model information.

To compensate for differences in speaking rate and length between the input speech and the reference speech, the comparison and matching unit (7) uses the dynamic time warping (DTW) algorithm, which aligns the input feature parameter with the reference models non-linearly in order to recognize the speech of the reference model with the highest similarity. It calculates the squared Euclidean distance between the feature parameter and each reference model, and recognizes the reference model with the smallest distance as the model most similar to the feature parameter.
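DTW matching with a squared Euclidean local cost can be sketched as follows. This is a minimal sketch: the one-dimensional toy feature sequences and the reference labels are assumptions made for the example, and no path constraints or normalization are applied.

```python
import numpy as np


def dtw_distance(a, b):
    """Accumulated squared Euclidean distance along the best non-linear alignment."""
    a = np.asarray(a, dtype=float).reshape(len(a), -1)
    b = np.asarray(b, dtype=float).reshape(len(b), -1)
    cost = np.full((len(a) + 1, len(b) + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = np.sum((a[i - 1] - b[j - 1]) ** 2)
            cost[i, j] = d + min(cost[i - 1, j],       # input frame repeated
                                 cost[i, j - 1],       # reference frame repeated
                                 cost[i - 1, j - 1])   # both advance
    return cost[len(a), len(b)]


def best_match(feature, references):
    """Return the label of the closest reference model and all distances."""
    distances = {name: dtw_distance(feature, ref) for name, ref in references.items()}
    return min(distances, key=distances.get), distances


references = {"가": [0, 1, 2, 1, 0], "나": [2, 1, 0, 1, 2]}   # hypothetical models
feature = [0, 1, 2, 2, 1, 0]                                  # slower utterance of "가"
label, distances = best_match(feature, references)
```

The input is one frame longer than the matching reference, yet the warping path absorbs the repeated frame, which is how differences in speaking rate are compensated.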

For a given input speech (feature parameter), there may be one reference model, or two or more, whose similarity to the feature parameter is within a preset threshold. For example, '가' (ga) and '카' (ka), or '다' (da) and '타' (ta), have somewhat similar signal patterns because they are pronounced similarly, so for the input speech '가' two reference models, '가' and '카', may be detected as having a similarity within the preset threshold.

Accordingly, if there is exactly one reference model whose similarity to the feature parameter is within the preset threshold, the comparison and matching unit (7) determines that reference model as the input speech.

If there are two or more reference models whose similarity to the feature parameter is within the preset threshold, the comparison and matching unit (7) performs a further analysis with a better recognition rate than dynamic time warping. Specifically, it separates the speech signal into phoneme units and then runs a pattern comparison algorithm using a hidden Markov model (HMM). A hidden Markov model is a statistical model that assumes the modeled system is a Markov process with unknown parameters and, on that assumption, determines the hidden parameters from the observed parameters. Because it is one of the methods widely used in speech recognition, a detailed description is omitted.

The comparison and matching unit (7) inputs the speech for the detected reference model to the word determination unit (9).

The word determination unit (9) searches for a word using the characters corresponding to the reference model input from the comparison and matching unit (7) as the search term, and performs speech recognition by outputting the found word as the final result.

That is, in the speech recognition apparatus (1) of the present invention, the more reference models are stored in the reference model database unit (8) and compared with the feature parameter, the more accurate speech recognition can be. In that case, however, the reference model database unit (8) requires a very large capacity, and the amount of computation needed to run the comparison algorithm between the feature parameter and the reference models increases excessively. Because system resources are limited when the apparatus is applied to an embedded system, the present invention applies initial-consonant-based speech recognition in order to obtain accurate recognition results with minimal resources.

In particular, when the initial consonants 'ㄱ', 'ㄴ', 'ㄷ' and so on are input by voice, the names of the consonants, such as '기역' (giyeok), '니은' (nieun), '디귿' (digeut), are not used. Instead, each initial consonant is pronounced in combination with a single unified vowel, as in '가', '나', '다', and the feature parameters likewise correspond to speech signals in which an initial consonant is combined with the single unified vowel.

FIG. 6 is a flowchart for explaining the operating process of FIG. 2.

The speech recognition method (S1), which is the operating process of the speech recognition apparatus (1) of the present invention, comprises an acoustic signal input step (S10), a speech recognition step (S20), a feature parameter generation step (S30), an analysis step (S40), a determination step (S50), a phoneme-unit pattern analysis step (S60), a phoneme determination step (S70), and a word determination step (S80).

The acoustic signal input step (S10) is a step of receiving acoustic signals from the microphones (11-1), (11-2), (11-3).

The speech recognition step (S20) is a step of recognizing speech from the acoustic signals input in the acoustic signal input step (S10) using the two speech recognition modules, as described above with reference to FIG. 4.

Because speech recognition is performed using reference models generated from pronunciations in which each initial consonant is combined with one common vowel, the speech recognition step (S20) also converts the speech into a combination of syllables that share a common vowel and contain no final consonant, such as '가', '나', '다'.

For example, when '홍길동' (Hong Gil-dong) is detected by the speech recognition modules, the speech recognition step (S20) converts the detected speech into speech such as '하가다' (ha-ga-da).

The speech recognition step (S20) passes the converted speech signal to the feature parameter generation step (S30).

The feature parameter generation step (S30) extracts feature vectors from the input speech signal by linear predictive coding (LPC).

The feature parameter generation step (S30) uses the extracted feature vectors to generate a feature parameter for the input speech signal. The feature parameter is data obtained by processing the speech signal so that a comparison algorithm can be run against the reference models.

The feature parameter generated in the feature parameter generation step (S30) is passed to the analysis step (S40).

The analysis step (S40) generates the feature parameter input from the feature parameter generation step (S30) and the preset reference models in syllable units, and compares and analyzes them.

To compensate for differences in speaking rate and length between the input speech and the reference speech, the analysis step (S40) uses the dynamic time warping (DTW) algorithm, which aligns the input pattern with the reference patterns non-linearly in order to recognize the input speech as the speech of the reference pattern with the highest similarity. It calculates the squared Euclidean distance between the feature parameter and each reference model, and recognizes the reference model with the smallest distance as the model most similar to the feature parameter.

판단단계(S50)는 분석단계(S40)에 의해 특징파라미터와의 유사도가 기 설정된 범위 내인 참조모델이 2개 이상인지를 판단하는 단계이다.The determining step S50 is a step of determining whether there are two or more reference models whose similarity with the feature parameter is within a predetermined range by the analysis step S40.

다시 말하면, 판단단계(S50)는 분석단계(S40)에서 특징파라미터와 참조모델들 각각의 유클리드 제곱입력 패턴과 각각의 참조 패턴 사이의 유클리드 제곱 거리를 산출한 결과, 기 설정된 임계값보다 작은 유클리드 제곱 거리를 갖는 참조모델이 2개 이상인지의 여부를 판단한다.In other words, in the determination step S50, the Euclidean squared distance between the Euclidean squared input pattern of each of the feature parameters and the reference models and the respective reference patterns is calculated in the analysis step S40. As a result, the Euclidean squares It is determined whether or not there are two or more reference models having distances.

Such a case means the input speech could be recognized as any of two or more similar sounds, so a more precise pattern analysis is required.

For example, '가' (ga) and '카' (ka), or '다' (da) and '타' (ta), sound alike and therefore have fairly similar signal patterns, so comparing them by dynamic time warping alone may yield a result different from what the user intended.

Accordingly, in the present invention the determination step (S50) checks whether the analysis step (S40) found two or more similar reference models and, if so, has the pattern analysis performed again using a method with a higher recognition rate than dynamic time warping.

That is, the determination step (S50) proceeds to the phoneme-unit pattern analysis step (S60) if there are two or more similar reference models, and to the phoneme determination step (S70) if there is only one.
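The branching performed by the determination step (S50) can be sketched as follows. The function name `select_next_step`, the dictionary of distances, and the threshold value are hypothetical. The text does not say what happens when no reference model falls under the threshold, so this sketch simply falls through to S70 in that case, which is an assumption.

```python
def select_next_step(distances, threshold):
    """Decide between S60 and S70 from the DTW distances of step S40.

    distances: mapping of reference-model label -> squared Euclidean DTW distance.
    Returns the next step and the labels whose distance is below the threshold.
    """
    candidates = sorted(label for label, d in distances.items() if d < threshold)
    if len(candidates) >= 2:
        # Ambiguous result: re-analyze at phoneme level (S60).
        return "S60", candidates
    # One candidate (or, by assumption, none): go straight to S70.
    return "S70", candidates


# Hypothetical distances for an utterance of 'ga'.
ambiguous = select_next_step({"ga": 1.2, "ka": 1.5, "na": 9.0}, threshold=2.0)
unambiguous = select_next_step({"na": 0.4, "ga": 7.0}, threshold=2.0)
```

With these toy values the first utterance is routed to S60 because both 'ga' and 'ka' are under the threshold, while the second goes directly to S70.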

The phoneme-unit pattern analysis step (S60) splits the speech signal into phonemes and then runs a phoneme-level pattern comparison algorithm using a method such as a hidden Markov model (HMM).

A hidden Markov model is a statistical model that assumes the system being modeled is a Markov process with unknown parameters and, on that assumption, determines the hidden parameters from the observed ones. Because it is one of the most widely used methods in speech recognition, a detailed description is omitted.
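Since the description omits the details, the following is only a generic sketch of how a phoneme could be scored with discrete hidden Markov models, using the standard forward algorithm. The function names, the two-state model structure, and all probabilities are invented for illustration and are not taken from the patent.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | model) for a discrete HMM, computed with the forward algorithm.

    obs:   list of observation symbol indices
    start: start[s] = probability of starting in state s
    trans: trans[p][s] = probability of moving from state p to state s
    emit:  emit[s][o] = probability of emitting symbol o in state s
    """
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


def most_likely_phoneme(obs, models):
    """Return the phoneme whose model gives the observations the highest likelihood."""
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))


# Hypothetical left-to-right models for two confusable initial consonants.
trans = [[0.5, 0.5], [0.0, 1.0]]
models = {
    "g": ([1.0, 0.0], trans, [[0.9, 0.1], [0.1, 0.9]]),
    "k": ([1.0, 0.0], trans, [[0.1, 0.9], [0.9, 0.1]]),
}
best_phoneme = most_likely_phoneme([0, 0, 1], models)
```

A practical recognizer would work with continuous acoustic features and log-probabilities to avoid underflow; the discrete version is used here only to keep the scoring step visible.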

The phoneme determination step (S70) determines the phoneme according to the result of the pattern analysis performed in the analysis step (S40) or the phoneme-unit pattern analysis step (S60).

That is, when the determination step (S50) finds only one similar reference model, the phoneme determination step (S70) takes the phoneme of the speech corresponding to that reference model, as found in the analysis step (S40), as the input phoneme. When the determination step (S50) finds two or more similar reference models, it takes the phoneme with the highest similarity found in the phoneme-unit pattern analysis step (S60) as the input phoneme.

For example, if the user utters '가' (ga) and the analysis step (S40) judges the reference models for both '가' (ga) and '카' (ka) to be similar, the phoneme-unit pattern analysis step (S60) separately processes only the phoneme portion of the stored speech signal with the hidden Markov model, so that the initial consonant 'ㄱ' that the user actually intended is determined as the recognized phoneme. As another example, if the user utters '나' (na) and the analysis step (S40) finds '나' to be the only similar reference model, 'ㄴ' is determined to have been input directly, without going through the phoneme-unit pattern analysis step (S60).
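The mapping from a recognized syllable such as '가' to its initial consonant 'ㄱ' can be illustrated with the Unicode composition rule for precomposed Hangul syllables. This sketch only shows the text-level relationship between a syllable and its initial consonant; it is not the acoustic processing of the patent, and the function name `initial_consonant` is hypothetical.

```python
# The 19 initial consonants in Unicode order, as compatibility jamo code points.
CHOSEONG = [chr(c) for c in (
    0x3131, 0x3132, 0x3134, 0x3137, 0x3138, 0x3139, 0x3141, 0x3142, 0x3143,
    0x3145, 0x3146, 0x3147, 0x3148, 0x3149, 0x314A, 0x314B, 0x314C, 0x314D,
    0x314E,
)]

HANGUL_BASE = 0xAC00       # code point of the first precomposed syllable
SYLLABLE_COUNT = 11172     # 19 initials * 21 vowels * 28 finals
PER_INITIAL = 21 * 28      # syllables sharing one initial consonant


def initial_consonant(syllable):
    """Return the initial consonant of one precomposed Hangul syllable."""
    offset = ord(syllable) - HANGUL_BASE
    if not 0 <= offset < SYLLABLE_COUNT:
        raise ValueError("not a precomposed Hangul syllable")
    return CHOSEONG[offset // PER_INITIAL]


ga_initial = initial_consonant("\uac00")  # syllable 'ga'
ka_initial = initial_consonant("\uce74")  # syllable 'ka'
na_initial = initial_consonant("\ub098")  # syllable 'na'
```

Grouping reference patterns by the 19 initial consonants rather than by every syllable is one way to see why an initial-consonant-based approach needs far fewer comparisons.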

The word determination step (S80) searches for words using the phonemes detected in the phoneme determination step (S70) and selects the final result from among the words found.

In this way, the speech recognition device (1) of the present invention first reduces the number of reference patterns to be compared by using initial-consonant-based speech recognition, which saves memory and reduces the computational load. It uses dynamic time warping on syllable-unit patterns, which is relatively inexpensive to compute, as the default method, and supplements it with the hidden Markov model method on phoneme-unit patterns only when greater accuracy is needed. This raises the accuracy and reliability of speech recognition without placing an excessive load on the system.

FIG. 7 is a configuration diagram of a lighting system to which the speech recognition device of FIG. 2 is applied, FIG. 8 is a block diagram of the control unit installed in the illumination lamps of FIG. 7, and FIG. 9 is an illustrative view of FIG. 7.

The lighting system (900) of FIGS. 7 and 8 comprises control units (911) installed in each of the illumination lamps (910-1), ..., (910-N) mounted at intervals on the ceiling of the illuminated space; a controller (920) that manages and controls the control units (911) of the illumination lamps (910-1), ..., (910-N); and a communication network (930) that provides the data path between the controller (920) and the control units (911) of the illumination lamps (910-1), ..., (910-N).

The communication network (930) supports data communication between the controller (920) and the control units (911) of the illumination lamps (910-1), ..., (910-N). Specifically, it may be a local area network (LAN) such as ZigBee, Wi-Fi, or Bluetooth, a wired or wireless network such as a wide area network (WAN), a mobile communication network, or the like.

The illumination lamps (910-1), ..., (910-N) are installed on the ceiling structure of the illuminated space, spaced apart from one another. Each lamp includes a light source element such as an LED that emits light, as well as a housing, a cover, a diffusion plate, a circuit board, electrical components, and so on. Because the construction of such lamps is well known, a detailed description is omitted.

A control unit (911) is installed inside each of the illumination lamps (910-1), ..., (910-N).

The illumination lamps (910-1), ..., (910-N) serve as ordinary lighting in normal times, but in an emergency situation they blink along the emergency evacuation route and thereby act as emergency lights and guide lights.

The first, second, and third microphones (11-1), (11-2), (11-3) described above are installed on the outer surface of each of the illumination lamps (910-1), ..., (910-N), and the acoustic signals they collect are input to the control unit (911).

As shown in FIG. 9, the control unit (911) includes an acoustic signal input unit (3), a speech recognition unit (5), a feature parameter detection unit (6), a comparison and matching unit (7), a reference model database unit (8), and a word determination unit (9), each with the same configuration and operation as described above for FIG. 2.

The control unit (911) further includes an on/off management unit (912) that manages and controls the switching on and off of the LEDs; a communication interface unit (913) that exchanges data with the controller (920); a memory (914) that stores preset 'emergency-related comparison target characters', which are character strings that indicate an emergency situation, preset 'lighting-control-related comparison target characters', which are character strings that indicate a command to control the lamp, and a matching table that maps each lighting-control-related comparison target character to the corresponding lighting control data; and a determination unit (915) that, once the word determination unit (9) has determined a spoken word, searches the memory (914), detects the degree of association between the determined word and each of the 'emergency-related comparison target characters', and judges that an emergency situation has occurred if the detected association exceeds a threshold.

If the determination unit (915) judges that an emergency situation has occurred, it sends emergency confirmation data to the on/off management unit (912), which then makes the light source element blink for a predetermined time (t).

In addition, once the word determination unit (9) has determined a spoken word, the determination unit (915) detects the degree of association between that word and each of the 'lighting-control-related comparison target characters' stored in the memory. If the detected association exceeds a threshold, it searches the matching table for the control data corresponding to that 'lighting-control-related comparison target character' and inputs the control data to the on/off management unit.

For example, the 'lighting-control-related comparison target characters' may include '불꺼' (lights off), '불켜' (lights on), '높여' (raise), and '낮춰' (lower). The corresponding control data may then be data that switches the power OFF when '불꺼' is detected, switches the power ON when '불켜' is detected, raises the illumination intensity by one level when '높여' is detected, and lowers it by one level when '낮춰' is detected.
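A matching table of this kind can be sketched as a simple lookup. The layout of the control data, the brightness range of 1 to 5, and the names `MATCHING_TABLE` and `apply_command` are assumptions. The sketch also uses exact string matching, whereas the text describes an association score compared against a threshold.

```python
# Hypothetical matching table: recognized word -> lighting control data.
MATCHING_TABLE = {
    "불꺼": {"power": "OFF"},          # lights off
    "불켜": {"power": "ON"},           # lights on
    "높여": {"brightness_step": +1},   # raise intensity by one level
    "낮춰": {"brightness_step": -1},   # lower intensity by one level
}


def apply_command(word, state, min_level=1, max_level=5):
    """Return the lamp state after applying the control data for `word`.

    state: dict with keys "power" ("ON"/"OFF") and "level" (int).
    Words that are not in the matching table leave the state unchanged.
    """
    new_state = dict(state)
    control = MATCHING_TABLE.get(word)
    if control is None:
        return new_state
    if "power" in control:
        new_state["power"] = control["power"]
    if "brightness_step" in control:
        level = new_state["level"] + control["brightness_step"]
        # Clamp to the assumed range of supported intensity levels.
        new_state["level"] = max(min_level, min(max_level, level))
    return new_state


lamp = {"power": "ON", "level": 3}
after_raise = apply_command("높여", lamp)
after_off = apply_command("불꺼", lamp)
```

Keeping the table separate from the code that applies it means new commands can be added by extending the stored table, which matches the role the memory (914) plays in the description.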

If the determination unit (915) judges that an emergency situation has occurred, the control unit (911) also controls the communication interface unit (913) to transmit emergency confirmation data, indicating that an emergency situation has occurred, to the controller (920). At this time the control unit (911) transmits to the controller (920), together with that data, the comparison target character whose association exceeded the threshold.

When the control unit (911) receives on/off control data from the controller (920) through the communication interface unit (913), it passes that data to the on/off management unit (912) so that the lamp is switched on and off under the control of the controller (920).

The method by which the determination unit (915) judges whether an emergency situation has occurred is the emergency alarm determination method disclosed in Korean Registered Patent No. 10-1625121 (title: Emergency alarm method using voice recognition, computer program therefor, and recording medium thereof), filed and registered by the present applicant.

FIG. 10 is a block diagram of the controller of FIG. 7.

The controller (920) of FIG. 10 comprises a control unit (921), a memory (922), a communication interface unit (923), an emergency situation processing unit (924), and an illumination lamp management unit (925).

The control unit (921) is the operating system (OS) of the controller (920) and manages and controls the controlled components (922), (923), (924), and (925).

When the control unit (921) receives emergency confirmation data from the control unit (911) of any of the illumination lamps (910-1), ..., (910-N) through the communication interface unit (923), it activates the emergency situation processing unit (924) and the illumination lamp management unit (925).

The memory (922) stores the spoken words periodically received from the control unit (921).

The memory (922) also stores the location information of the emergency evacuation route of the illuminated space, for example the location of the emergency route along which people can evacuate in the event of a fire.

The memory (922) further stores the communication identification information and location information of the control unit (911) of each of the illumination lamps (910-1), ..., (910-N).

The emergency situation processing unit (924) is activated when emergency confirmation data is received from the control unit (911) of any of the illumination lamps (910-1), ..., (910-N). It sounds the emergency bell installed in the illuminated space and at the same time has an announcement broadcast through the speakers.

The illumination lamp management unit (925) is activated when an emergency situation occurs and manages the switching on and off of the illumination lamps (910-1), ..., (910-N).

The illumination lamp management unit (925) also determines the blinking cycle, defined as the cycle in which the illumination lamps (910-1), ..., (910-N) are switched on and off.

FIG. 11 is an illustrative view of the blinking cycle determined by the illumination lamp management unit of FIG. 10.

As shown in FIG. 11, when an emergency situation occurs, the illumination lamp management unit (925) uses the location information of each of the illumination lamps (910-1), ..., (910-N) and the preset emergency evacuation route to determine the blinking order of the lamps that lie along the evacuation route, and then makes the lamps blink (for n seconds each) in that order.

That is, if illumination lamps (910-1), (910-2), and (910-3) are installed along the emergency evacuation route, the illumination lamp management unit (925) determines a blinking cycle in which lamp (910-1) blinks first for a predetermined time (n seconds), then lamp (910-2) blinks for the predetermined time (n seconds), and finally lamp (910-3) blinks for the predetermined time (n seconds). The blinking cycle data determined by the illumination lamp management unit (925) is transmitted to the control units (911) of the illumination lamps (910-1), ..., (910-N).
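The ordering and timing described for the illumination lamp management unit (925) can be sketched as follows. The text does not specify how lamp positions are matched to the route, so ordering lamps by their nearest route waypoint is an assumption, as are the function names, the coordinates, and the 3-second interval.

```python
import math


def order_lamps_along_route(lamp_positions, route):
    """Order lamp IDs by the index of the route waypoint nearest to each lamp.

    lamp_positions: mapping of lamp ID -> (x, y) position
    route: list of (x, y) waypoints, in the direction of evacuation
    """
    def nearest_waypoint_index(pos):
        return min(range(len(route)), key=lambda k: math.dist(pos, route[k]))

    return sorted(lamp_positions,
                  key=lambda lamp: nearest_waypoint_index(lamp_positions[lamp]))


def blink_schedule(ordered_lamps, n_seconds):
    """Give each lamp its own n-second blinking window, one after another.

    Returns a list of (lamp ID, start time, end time) tuples in seconds.
    """
    return [(lamp, i * n_seconds, (i + 1) * n_seconds)
            for i, lamp in enumerate(ordered_lamps)]


# Hypothetical layout: three lamps along a straight corridor toward the exit.
lamps = {"910-2": (5.0, 0.0), "910-1": (0.0, 0.0), "910-3": (10.0, 0.0)}
route = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
order = order_lamps_along_route(lamps, route)
schedule = blink_schedule(order, n_seconds=3)
```

The resulting schedule is the kind of blinking cycle data that would be sent to each lamp's control unit (911), so that the blinking appears to travel along the evacuation route.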

As a result, people in the illuminated space become aware of an emergency situation through the emergency bell, the speakers, and the blinking of the illumination lamps (910-1), ..., (910-N), and by visually following the direction in which the lamps blink they can easily identify the emergency evacuation route and evacuate quickly.

1: speech recognition device 3: acoustic signal input unit
5: speech recognition unit 6: feature parameter detection unit
7: comparison and matching unit 8: reference model database unit
9: word determination unit 51: first speech recognition module
52: second speech recognition module 900: lighting system
910-1, ..., 910-N: illumination lamps 911: control unit
912: on/off management unit 913: communication interface unit
914: memory 915: determination unit
920: controller 930: communication network

Claims

First, second and third microphones for collecting acoustic signals and converting them into electrical signals;
A reference model database unit storing predetermined reference models;
An acoustic signal input unit receiving acoustic signals obtained by the microphones;
A voice recognition unit for analyzing the acoustic signals input by the acoustic signal input unit and detecting the original signal X2;
A feature parameter generation unit for extracting a feature vector of the original signal X2 detected by the speech recognition unit and generating a feature parameter using the extracted feature vector;
A comparison and matching unit for analyzing the reference models stored in the reference model database unit and the feature parameters generated by the feature parameter generation unit using a predetermined comparison algorithm to detect a reference model having the highest similarity to the feature parameters;
And a word determination unit for searching for a word using the character corresponding to the reference model detected by the comparison and matching unit as a search term and outputting the word found as the final speech recognition result,
Wherein the speech recognition unit comprises:
A first speech recognition module that separates the acoustic signal of the first microphone into an original signal (S1) and a noise signal (N1) using a predetermined first signal separation algorithm, separates the acoustic signal of the second microphone into an original signal (S2) and a noise signal (N2) using the first signal separation algorithm, and sums the separated original signals (S1) and (S2) to detect a primary original signal (X1); and
A second speech recognition module that separates the acoustic signal of the third microphone into an original signal (S3) and a noise signal (N3) using a predetermined second signal separation algorithm, and sums the separated original signal (S3) with the primary original signal (X1) detected by the first speech recognition module to detect the final original signal (X2),
Wherein the first signal separation algorithm and the second signal separation algorithm separate the original signal (S) and the noise signal (N) from an acoustic signal in different ways.

delete

The speech recognition device of claim 1, wherein beamforming is applied to the first, second, and third microphones, the first microphone forming a beam in the straight-ahead direction and the second and third microphones forming beams in the left and right directions,
Wherein acoustic echo cancellation (AEC) for removing dynamic noise sources from the input acoustic signals is applied to the first, second, and third microphones.

The speech recognition device of claim 3, wherein, when the final original signal (X2) is detected, the speech recognition unit combines a preset vowel with the initial consonant of the detected original signal (X2),
The comparison and matching unit
Uses a dynamic time warping (DTW) algorithm, which non-linearly aligns the input feature parameter with each reference model to compensate for differences in speaking rate and length between the input speech and the reference speech and recognizes the input as the speech of the most similar reference model, to calculate the Euclidean distance between the feature parameter and each reference model, and recognizes the reference model with the smallest distance as the model most similar to the feature parameter,
And the comparison and matching unit
Determines the input speech to be the reference model with the highest similarity when there is one reference model whose similarity to the feature parameter is within a predetermined threshold, and, when there are two or more reference models whose similarity to the feature parameter is within the predetermined threshold, separates the speech signal into phonemes and then determines the input speech to be the phoneme with the highest similarity through a pattern comparison algorithm based on a hidden Markov model.

delete

A lighting system comprising lighting fixtures, each including a light source element, a housing, a microphone installed on the outside of the housing, and a control unit installed inside the housing, wherein:
The control unit comprises
An on/off management unit that controls the light source element;
And a memory that stores 'lighting-control-related comparison target characters', defined as characters by which a spoken word can be judged to be a command for controlling the lighting, and a matching table that maps each of the 'lighting-control-related comparison target characters' to the corresponding lighting control data,
The control unit analyzes the acoustic signal input from the microphone to determine a spoken word and, if the determined word is any one of the 'lighting-control-related comparison target characters', searches the matching table to detect the corresponding control data, and the on/off management unit controls the lighting according to the detected control data,
The microphones are composed of first, second and third microphones installed at intervals,
The control unit
A reference model database unit storing predetermined reference models;
An acoustic signal input unit receiving acoustic signals obtained by the first, second and third microphones;
A voice recognition unit for analyzing the acoustic signals input by the acoustic signal input unit and detecting the original signal X2;
A feature parameter generation unit for extracting a feature vector of the original signal X2 detected by the speech recognition unit and generating a feature parameter using the extracted feature vector;
A comparison and matching unit for analyzing the reference models stored in the reference model database unit and the feature parameters generated by the feature parameter generation unit using a predetermined comparison algorithm to detect a reference model having the highest similarity to the feature parameters;
A word determination unit for searching for a word using a character corresponding to the reference model detected by the comparing and matching unit as a search word and finally outputting the searched word to perform speech recognition;
And a determination unit that detects the degree of association between the spoken word determined by the word determination unit and each of the 'lighting-control-related comparison target characters' stored in the memory and, if the detected association exceeds a threshold, searches the matching table for the control data corresponding to that 'lighting-control-related comparison target character' and inputs it to the on/off management unit,
Wherein the speech recognition unit comprises:
A first speech recognition module that separates the acoustic signal of the first microphone into an original signal (S1) and a noise signal (N1) using a predetermined first signal separation algorithm, separates the acoustic signal of the second microphone into an original signal (S2) and a noise signal (N2) using the first signal separation algorithm, and sums the separated original signals (S1) and (S2) to detect a primary original signal (X1); and
A second speech recognition module that separates the acoustic signal of the third microphone into an original signal (S3) and a noise signal (N3) using a predetermined second signal separation algorithm, and sums the separated original signal (S3) with the primary original signal (X1) detected by the first speech recognition module to detect the final original signal (X2),
Wherein the first signal separation algorithm and the second signal separation algorithm separate the original signal (S) and the noise signal (N) from an acoustic signal in different ways.

delete

The lighting system of claim 10, wherein, when the final original signal (X2) is detected, the speech recognition unit combines a preset vowel with the initial consonant of the detected original signal (X2),
The comparison and matching unit
Uses a dynamic time warping (DTW) algorithm, which non-linearly aligns the input feature parameter with each reference model to compensate for differences in speaking rate and length between the input speech and the reference speech and recognizes the input as the speech of the most similar reference model, to calculate the Euclidean distance between the feature parameter and each reference model, and recognizes the reference model with the smallest distance as the model most similar to the feature parameter,
And the comparison and matching unit
Determines the input speech to be the reference model with the highest similarity when there is one reference model whose similarity to the feature parameter is within a predetermined threshold, and, when there are two or more reference models whose similarity to the feature parameter is within the predetermined threshold, separates the speech signal into phonemes and then determines the input speech to be the phoneme with the highest similarity through a pattern comparison algorithm based on a hidden Markov model.

The lighting system of claim 12, wherein the memory further stores 'emergency-related comparison target characters',
The determination unit
Detects the degree of association between the spoken word determined by the word determination unit and each of the 'emergency-related comparison target characters' stored in the memory and judges that an emergency situation has occurred if the detected association exceeds a threshold,
And the on/off management unit makes the light source element blink in a predetermined manner when the determination unit judges that an emergency situation has occurred, so that the lighting fixture is used for illumination in normal times and as an emergency light when an emergency situation occurs.

The lighting system of claim 13, further comprising a controller that manages and controls the lighting fixtures,
Wherein the control unit transmits emergency confirmation data, indicating that an emergency situation has occurred, to the controller when the determination unit judges that an emergency situation has occurred,
And the controller
Stores the location information of each of the lighting fixtures and a predetermined emergency evacuation route and, when emergency confirmation data is received from any one of the lighting fixtures, uses the location information of each lighting fixture and the emergency evacuation route to order the lighting fixtures along the emergency evacuation route and then makes them blink in that order.