KR20030075316A - Speech recognition method and speech recognition system for juke box system

Info

Publication number: KR20030075316A
Application number: KR1020020014469A
Authority: KR
Inventors: 김윤환; 강천모
Original assignee: 주식회사 아이티매직
Priority date: 2002-03-18
Filing date: 2002-03-18
Publication date: 2003-09-26

Abstract

PURPOSE: A speech recognition method and system for a karaoke machine are provided that remove noise not corresponding to the human voice, recognize the voice, and extract the corresponding karaoke machine command. CONSTITUTION: A noise removing unit (100) extracts only the region corresponding to the human voice bandwidth from a command input by voice, converts the extracted voice command into a digital signal, and removes the noise that remains within the human voice bandwidth. A voice recognition unit (200) recognizes the wavelength of the voice command and compares it with pre-stored syllable waveforms of command words to find the matching command. A serial communication unit (400) converts the retrieved command word into a code that the karaoke machine can interpret and transmits the converted code to the karaoke machine over a communication port of the karaoke machine.

Description

Speech recognition method for karaoke machine and its system {SPEECH RECOGNITION METHOD AND SPEECH RECOGNITION SYSTEM FOR JUKE BOX SYSTEM}

The present invention relates to a speech recognition method and system for a karaoke machine, and more particularly to a method and system that recognize speech by actively removing noise.

Karaoke rooms have developed rapidly since the concept was first introduced and, as with PC rooms, which are similar in character, customers have come to expect new ways of using them. A voice recognition karaoke song selection system is designed to meet these customer demands. In contrast to a conventional system in which a song is selected by pressing its number, a voice recognition song selection system lets the user select a desired song by voice.

Speech recognition technology is divided into several types according to the classification criterion. First, according to the speaker to be recognized, it is classified into speaker-independent and speaker-dependent recognition. A speaker-dependent system recognizes the voice of a specific speaker; the voice dialing systems currently built into mobile phones are a representative example. A speaker-dependent system generally stores and registers the user's voice before use and, during actual recognition, applies a pattern matching technique that compares the pattern of the input voice with the stored voice pattern.

A speaker-independent system, by contrast, recognizes the voices of many unspecified speakers, so the user does not have to register a voice before the system operates, as is required in a speaker-dependent system. A speaker-independent system collects the voices of many speakers, trains a statistical model on them, and performs recognition with the trained model. As a result, the traits specific to each speaker fade and the traits common to all speakers are emphasized. For the same vocabulary and the same amount of training data, a speaker-dependent system generally performs better than a speaker-independent one. However, if someone other than the registered speaker uses a speaker-dependent system, the recognition rate drops sharply.

Next, according to the manner of pronunciation, systems are divided into isolated word recognition systems and continuous speech recognition systems. An isolated word recognition system assumes that each word is pronounced distinctly and that a sufficiently long silent interval exists between words. Recognition therefore focuses on how much each word differs from the others, and the influence of adjacent words is ignored. A continuous speech recognition system, in contrast, performs recognition sentence by sentence. Each sentence is pronounced as in ordinary speech, and no silence is deliberately inserted between words.

As described above, speech recognition technology is divided into several types according to the classification criterion. Each type has its own advantages and disadvantages, so none can be said to be superior to the others.

Because a karaoke machine is used by many users, it is common to adopt a speaker-independent system.

Karaoke machines generally have different operating systems and hardware depending on the manufacturer. Therefore, to deliver the result of speech recognition (obtained with a speaker-independent system or the like) to a karaoke machine, an interface with the machine is required. To interface with a karaoke machine in this way, the program must be redesigned to suit the operating system of that machine. Repeating this work for each of the various operating systems of each manufacturer and product is very cumbersome and is a major obstacle to building a general-purpose product.

In addition, the large amount of ambient noise around a karaoke machine lowers the speech recognition rate.

To solve these problems, a technical object of the present invention is to remove noise that does not correspond to the human voice and then perform speech recognition.

Another technical object of the present invention is to provide a speech recognition system that can operate independently of the karaoke machine.

FIG. 1 is a block diagram showing a speech recognition system for a karaoke machine according to an embodiment of the present invention.

FIG. 2 is a flowchart showing a speech recognition method for a karaoke machine according to an embodiment of the present invention.

FIG. 3 is a flowchart showing an active noise removal method according to an embodiment of the present invention.

The present invention achieves these objects by using an active noise removal method and by delivering the speech recognition result to the karaoke machine using the code with which the remote control communicates with the karaoke machine.

According to a first aspect of the present invention, a speech recognition system for a karaoke machine is provided that includes a noise removing unit, a voice recognition unit, and a serial communication unit. The noise removing unit removes noise from a voice that is input together with the noise and extracts only the voice command. The voice recognition unit recognizes the wavelength of the voice command and compares it with pre-stored syllable waveforms of commands to find the matching command. The serial communication unit converts the retrieved command into a code that the karaoke machine can interpret, and transmits the converted code to the karaoke machine through the port over which the machine's remote control communicates with it.

In this case, the noise removing unit preferably includes a band pass filter, an analog-to-digital converter, and a digital signal processing unit. The band pass filter filters out noise outside the human voice bandwidth, and the analog-to-digital converter converts the filtered voice signal into a digital signal. The digital signal processing unit removes noise within the human voice bandwidth from the digital signal.

The serial communication unit preferably converts the retrieved command into the code that the remote control uses when issuing commands to the karaoke machine, and transmits the converted code to the karaoke machine through an RS232C serial communication port or an IR (infrared) port.

The speech recognition system may further include a database in which the syllable waveforms of commands are stored in advance.

According to a second aspect of the present invention, a speech recognition method for a karaoke machine is provided. In this method, noise is first removed from a command that the user has input by voice, and the noise-free voice command is recognized to find the corresponding command. The retrieved command is then converted into a code that the karaoke machine can interpret and is transmitted to the karaoke machine through the port over which the machine's remote control communicates with it.

When removing the noise, it is preferable to extract only the region corresponding to the human voice bandwidth from the command input by voice, convert the voice command of the extracted region into a digital signal, and then remove the noise contained within the human voice bandwidth.

When recognizing the voice, it is preferable to recognize the wavelength of the voice command and compare it with pre-stored syllable waveforms of commands to extract the matching command.

When transmitting to the karaoke machine, it is preferable to convert the retrieved command into the code that the remote control uses when issuing commands to the karaoke machine, and to transmit the converted code to the karaoke machine through an RS232C serial communication port or an IR (infrared) port.

A speech recognition method and system according to an embodiment of the present invention will now be described in detail with reference to the drawings.

First, a speech recognition system according to an embodiment of the present invention is described with reference to FIG. 1.

As shown in FIG. 1, the karaoke speech recognition system according to an embodiment of the present invention includes a noise removing unit 100, a voice recognition unit 200, a song name database 300, and a serial communication unit 400.

The noise removing unit 100 includes a band pass filter 110 that filters out components outside the human voice bandwidth, an analog-to-digital converter 120 that converts the analog voice signal into a digital signal, and a digital signal processing unit 130 that removes noise from the digital voice signal. It actively removes noise from the voice input from an external voice input device (not shown) and delivers only the remaining pure command to the voice recognition unit 200.

The active noise removal method removes the frequency bands corresponding to noise from the input signal, accepts only the frequency band corresponding to the human voice as input, and then removes the remaining noise.

The song name database 300 stores the song names and singer names corresponding to the karaoke database.

The voice recognition unit 200 recognizes the wavelength of the voice signal from which the noise removing unit 100 has removed the noise, and performs acoustic analysis on it. A matching method or a state sequence method may be used for this acoustic analysis; this embodiment is described using the matching method, but the invention is not limited to it.

Using the matching method, the voice recognition unit 200 compares the voice signal sequentially, one syllable at a time, with the syllable waveforms stored in the song name database 300 to find a matching syllable waveform. If a match is found, the voice recognition unit 200 extracts the song name and singer name corresponding to the matching syllable waveform.
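The matching step can be sketched as follows. This is only an illustration under stated assumptions, not the patent's implementation: the patent does not say how syllable waveforms are represented or compared, so the fixed-length sample lists, the normalized-correlation score, the 0.9 threshold, and the names `SongEntry`, `similarity`, `match_song`, and `tone` are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SongEntry:
    song: str
    singer: str
    syllables: list  # one waveform (list of floats) per syllable; assumed layout


def similarity(a, b):
    """Normalized correlation of two equal-length waveforms, in [-1, 1]."""
    if len(a) != len(b) or not a:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_song(spoken, database, threshold=0.9):
    """Compare the spoken syllables one by one with each stored entry and
    return the (song, singer) pairs for which every syllable matches."""
    hits = []
    for entry in database:
        if len(entry.syllables) != len(spoken):
            continue
        if all(similarity(s, t) >= threshold
               for s, t in zip(spoken, entry.syllables)):
            hits.append((entry.song, entry.singer))
    return hits


def tone(freq_hz, n=64, rate=8000):
    """Toy stand-in for a recorded syllable: a short sine burst."""
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


# Hypothetical database with two entries that share their first syllable.
DATABASE = [
    SongEntry("Song A", "Singer X", [tone(400), tone(800)]),
    SongEntry("Song B", "Singer Y", [tone(400), tone(1200)]),
]

# A quieter rendition of "Song A"; correlation ignores the overall level.
spoken = [[0.5 * v for v in tone(400)], [0.5 * v for v in tone(800)]]
result = match_song(spoken, DATABASE)
```

Normalized correlation is used here because it is insensitive to loudness; a real recognizer would also have to handle differences in timing and speaker, which this sketch ignores.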

When the voice recognition unit 200 finishes recognition, the serial communication unit 400 converts the recognition result into a code that the karaoke machine can interpret and delivers it to the machine. The command is delivered through the port that the karaoke machine and its remote control already use to communicate.

That is, the remote control of a karaoke machine transmits codes that follow the same protocol to the karaoke machine, communicating through an RS232C serial communication port or an IR (infrared) port. In this embodiment, the recognition result of the voice recognition unit 200 is converted into the code used by the remote control, and the converted code is transmitted to the karaoke machine through such a communication port.
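The conversion and transmission can be sketched as below. The patent gives no actual remote-control codes, so the `REMOTE_CODES` table, the ASCII digit encoding, and the function names are hypothetical. The sketch also assumes the recognized song has already been mapped to its song number, and uses an in-memory `io.BytesIO` buffer in place of a real RS232C port.

```python
import io

# Hypothetical remote-control code table. Real codes are manufacturer
# specific and are not given in the patent.
REMOTE_CODES = {
    "START": b"\x10",
    "CANCEL": b"\x11",
    "VOLUME_UP": b"\x20",
    "VOLUME_DOWN": b"\x21",
}
DIGIT_BASE = 0x30  # assumption: digit keys 0-9 are sent as ASCII '0'-'9'


def encode_song_selection(song_number):
    """Encode a song number as the key presses a remote would send:
    one byte per digit, followed by the START key."""
    if song_number < 0:
        raise ValueError("song number must be non-negative")
    digits = bytes(DIGIT_BASE + int(d) for d in str(song_number))
    return digits + REMOTE_CODES["START"]


def encode_command(name):
    """Look up the code for a function command such as VOLUME_UP."""
    try:
        return REMOTE_CODES[name]
    except KeyError:
        raise ValueError(f"unknown command: {name}") from None


def send(port, payload):
    """Write the code to any binary file-like object standing in for the port."""
    written = port.write(payload)
    port.flush()
    return written


port = io.BytesIO()  # stand-in for the RS232C serial port
send(port, encode_song_selection(12345))
send(port, encode_command("VOLUME_UP"))
```

Because the recognizer only emits the bytes a remote control would emit, supporting another machine in this sketch means swapping the code table, not touching the machine's software.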

In this way, the system can operate independently of the hardware and software of the karaoke machine.

A speech recognition method for a karaoke machine according to an embodiment of the present invention is described below with reference to FIG. 2.

When the user speaks the song name or singer name of the song to be selected into a voice input device such as a microphone, the voice is input to the speech recognition system 10 together with the ambient noise (S201). The noise removing unit 100 of the speech recognition system 10 uses the active noise removal method to extract the portion corresponding to pure voice from the noisy input and delivers it to the voice recognition unit 200 (S202). The active noise removal method is described in detail below with reference to FIG. 3.

The voice recognition unit 200 recognizes the wavelength of the voice signal from which the noise removing unit 100 has removed the noise, and compares the voice signal sequentially, one syllable at a time, with the syllable waveforms stored in the song name database 300 (S203). If a matching syllable waveform is found, the voice recognition unit 200 extracts the corresponding song name and singer name and delivers them to the serial communication unit 400; if no match is found, it waits for voice input again (S204).

The serial communication unit 400 converts the result recognized by the voice recognition unit 200 into a code that the karaoke machine can interpret (S205), and transmits this code to the karaoke machine through the RS232C serial communication port or the IR port (S206).

At this point, the screen displayed on the karaoke machine depends on the command the user has input. For example, when the user has input a song name and two or more songs match, the song names are displayed together with the singer names and the machine waits for the user to make a selection. Likewise, when the user has input a singer name, a list of that singer's songs is displayed and the machine waits for the user's selection. When the user has input both a song name and a singer name, the corresponding song is selected on the karaoke machine.
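The selection behaviour described above can be sketched as a small lookup. The catalog contents, the `Song` record, and the `resolve` function are hypothetical, and the handling of a song name with exactly one match (select it directly) is an assumption, since the text only covers the case of two or more matches.

```python
from typing import NamedTuple


class Song(NamedTuple):
    number: int
    title: str
    singer: str


# Hypothetical catalog: two songs share a title, one singer has two songs.
CATALOG = [
    Song(1001, "Song A", "Singer X"),
    Song(1002, "Song A", "Singer Y"),
    Song(1003, "Song B", "Singer X"),
]


def resolve(catalog, title=None, singer=None):
    """Decide what the machine should do with a recognized request.

    Returns ("select", song) when one song is identified,
    ("list", candidates) when the user must choose from a list,
    and ("none", []) when nothing matches.
    """
    candidates = [
        s for s in catalog
        if (title is None or s.title == title)
        and (singer is None or s.singer == singer)
    ]
    if not candidates:
        return ("none", [])
    # A singer name alone always produces a list, as in the description.
    if len(candidates) == 1 and title is not None:
        return ("select", candidates[0])
    return ("list", candidates)


by_title = resolve(CATALOG, title="Song A")
by_singer = resolve(CATALOG, singer="Singer X")
by_both = resolve(CATALOG, title="Song A", singer="Singer Y")
```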

Next, an active noise removal method according to an embodiment of the present invention is described with reference to FIG. 3.

First, the noise removing unit 100 passes the voice that was input together with noise through the band pass filter 110 to remove noise outside the bandwidth of human voice frequencies (S301). In general, the bandwidth of human voice frequencies is from 300 Hz to 3400 Hz. The voice input filtered by the band pass filter 110 is converted into a digital signal by the analog-to-digital converter 120 (S302).
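Steps S301 and S302 can be simulated in software as shown below. In the patent the band pass filter acts on the analog signal before conversion; here it is modelled digitally for illustration only. The second-order design (coefficients in the widely used audio EQ cookbook form), the 16 kHz sample rate, the 16-bit quantizer, and all function names are assumptions; only the 300 Hz to 3400 Hz band comes from the text.

```python
import math


def bandpass_coefficients(low_hz, high_hz, sample_rate):
    """Second-order band-pass with unity gain at the centre frequency."""
    centre = math.sqrt(low_hz * high_hz)  # geometric centre of the pass band
    q = centre / (high_hz - low_hz)
    w0 = 2 * math.pi * centre / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2 * math.cos(w0) / a0, (1 - alpha) / a0)
    return b, a


def bandpass(samples, low_hz=300.0, high_hz=3400.0, sample_rate=16000):
    """Filter samples with the difference equation of the biquad above."""
    (b0, b1, b2), (a1, a2) = bandpass_coefficients(low_hz, high_hz, sample_rate)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def quantize_16bit(samples):
    """Model the ADC: clip to [-1, 1] and round to signed 16-bit integers."""
    return [int(round(max(-1.0, min(1.0, s)) * 32767)) for s in samples]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def tone(freq_hz, n=4000, sample_rate=16000):
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


# A 1 kHz tone lies inside the voice band; 50 Hz hum lies below it.
voice_gain = rms(bandpass(tone(1000))[2000:]) / rms(tone(1000)[2000:])
hum_gain = rms(bandpass(tone(50))[2000:]) / rms(tone(50)[2000:])
digital = quantize_16bit(bandpass(tone(1000)))
```

A single second-order section has gentle skirts, so out-of-band noise is attenuated rather than eliminated; a practical front end would likely use a steeper filter.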

The digital signal processing unit 130 removes the noise that falls within the voice frequency bandwidth from the converted digital signal and delivers the result to the voice recognition unit 200 (S303). Existing methods may be used to remove this noise.
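The patent leaves the in-band noise removal of step S303 to existing methods and does not name one. As one simple possibility, chosen here purely for illustration, a frame-energy noise gate silences frames that are not clearly louder than the background. The frame length, the margin, the assumption that the first few frames contain only noise, and the name `noise_gate` are all hypothetical.

```python
def noise_gate(samples, frame_len=160, noise_frames=5, margin=2.0):
    """Zero every frame whose mean energy is not above margin x the noise floor.

    The floor is estimated from the first `noise_frames` frames, which this
    sketch assumes contain background noise only.
    """
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples), frame_len)]
    if not frames:
        return []

    def energy(frame):
        return sum(s * s for s in frame) / len(frame)

    lead = frames[:noise_frames]
    floor = sum(energy(f) for f in lead) / len(lead)

    out = []
    for frame in frames:
        if energy(frame) > margin * floor:
            out.extend(frame)                 # keep frames that stand out
        else:
            out.extend([0] * len(frame))      # silence noise-only frames
    return out


# Five frames of low-level noise, one loud "speech" frame, one more noise frame.
noise = [10, -10] * 400
speech = [1000, -1000] * 80
signal = noise + speech + noise[:160]
gated = noise_gate(signal)
```

A gate only removes noise between utterances; noise that overlaps the speech itself needs a different technique, which is outside the scope of this sketch.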

This embodiment has described only the case in which a song name or singer name is input by voice, but the invention is not limited to this; commands that control the functions of the karaoke machine may also be input by voice. For example, a volume up or volume down command, a vocal insertion command, or a tempo control command may be input by voice. The operation when such a command is input can be readily understood by a person of ordinary skill in the art from the embodiment described above, so its description is omitted.

As described above, according to the present invention, noise that does not correspond to the human voice is removed from a command input by voice, and speech recognition is then performed to extract the corresponding karaoke machine command. In addition, by using the port and code with which the remote control and the karaoke machine communicate, the system can operate independently regardless of the type of karaoke machine.

Claims

A speech recognition system for a karaoke machine, comprising:

a noise removing unit that extracts only a voice command by removing noise from a voice input together with the noise;

a voice recognition unit that recognizes a wavelength of the voice command and searches for a matching command by comparing it with pre-stored syllable waveforms of commands; and

a serial communication unit that converts the retrieved command into a code that the karaoke machine can interpret, and transmits the converted code to the karaoke machine using a port through which a remote control of the karaoke machine communicates with the karaoke machine.

The speech recognition system of claim 1,

wherein the noise removing unit comprises:

a band pass filter that filters out noise outside the human voice bandwidth;

an analog-to-digital converter that converts the filtered voice signal into a digital signal; and

a digital signal processing unit that removes noise within the human voice bandwidth from the digital signal.

The speech recognition system of claim 1,

wherein the serial communication unit converts the retrieved command into a code that the remote control uses when issuing commands to the karaoke machine, and transmits the converted code to the karaoke machine using an RS232C serial communication port or an IR (infrared) port.

The speech recognition system of claim 1,

further comprising a database that stores the syllable waveforms of the commands in advance.

A speech recognition method for a karaoke machine, comprising:

a first step of removing noise from a command input by a user by voice;

a second step of recognizing the voice command from which the noise has been removed and searching for the corresponding command;

a third step of converting the retrieved command into a code that the karaoke machine can interpret; and

a fourth step of transmitting the converted code to the karaoke machine using a port through which a remote control of the karaoke machine communicates with the karaoke machine.

The speech recognition method of claim 5,

wherein the first step comprises:

extracting only a region corresponding to the human voice bandwidth from the command input by voice;

converting the voice command of the extracted region into a digital signal; and

removing noise contained within the human voice bandwidth from the converted digital signal.

The speech recognition method of claim 5,

wherein the second step recognizes the wavelength of the voice command, compares it with pre-stored syllable waveforms of commands, and extracts the matching command.

The speech recognition method of claim 5,

wherein the third step converts the retrieved command into a code that the remote control uses when issuing commands to the karaoke machine.

The speech recognition method of claim 5,

wherein, in the fourth step, the converted code is transmitted to the karaoke machine using an RS232C serial communication port or an IR (infrared) port.