KR20000032270A

KR20000032270A - Voice recognition method of voice typing system

Info

Publication number: KR20000032270A
Application number: KR1019980048674A
Authority: KR
Inventors: 유하진
Original assignee: 구자홍; 엘지전자 주식회사
Priority date: 1998-11-13
Filing date: 1998-11-13
Publication date: 2000-06-05

Abstract

PURPOSE: To provide a voice recognition method for a voice typing system in which words not registered in the dictionary are added during use, thereby improving word recognition performance. CONSTITUTION: When a voice is input through a microphone, the features of the input voice are extracted, the word most similar to a pronunciation defined in a word dictionary is selected, and voice recognition is completed. Misrecognized words are then corrected, and a new word found during correction is registered in the word dictionary together with its pronunciation. The method further comprises a step of asking whether to register the new word and its pronunciation in the word dictionary when a new word is found.

Description

Voice recognition method of voice typing system

The present invention relates to a voice typing system, and more particularly to a voice recognition method for a voice typing system that continuously improves recognition performance by adding, during use, words that are not registered in the dictionary.

In general, a voice typing system lets a user write a document by reading the desired text aloud into a microphone instead of typing it on a keyboard.

The process of recognizing voice in a voice typing system is shown in FIG. 1.

First, a voice is input through a microphone and stored in the computer as binary data. Signal processing then extracts the feature parameters needed for recognition.
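The source does not say which feature parameters are extracted. As a rough illustration only, the sketch below splits stored samples into overlapping frames and computes two simple stand-in features per frame (log energy and zero-crossing rate). The function names, frame sizes, and the choice of features are all assumptions made for this example, not the method of the invention.

```python
import math

def frame_signal(samples, frame_len, hop):
    """Split a list of samples into overlapping frames of frame_len samples."""
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]

def frame_features(frame):
    """Return (log energy, zero-crossing rate) for one frame.

    These two features are stand-ins chosen for illustration; the source
    only says that feature parameters are extracted by signal processing.
    """
    energy = sum(s * s for s in frame) / len(frame)
    log_energy = math.log(energy + 1e-10)  # small offset avoids log(0)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (log_energy, crossings / (len(frame) - 1))

# Hypothetical input: 100 samples of a sine wave standing in for recorded voice.
samples = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
frames = frame_signal(samples, frame_len=20, hop=10)
features = [frame_features(f) for f in frames]
```

Each frame yields one small feature tuple, and the sequence of tuples is what a recognizer would compare against its stored pronunciations.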

The word dictionary stores the phonetic symbols of commonly used words. The feature parameters are used to find the word whose pronunciation, as defined in the word dictionary, is most similar to the input.

In addition, a language model, which holds grammatical information, is used to find the word that best fits the context of the sentence.

If the word to be typed is not defined in the dictionary, it cannot be recognized.
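The dictionary lookup and its limitation can be sketched as follows. This is a minimal illustration under stated assumptions: the dictionary contents, the space-separated phoneme notation, and the use of `difflib.SequenceMatcher` as the similarity measure are all hypothetical, since the source does not specify how similarity is computed.

```python
from difflib import SequenceMatcher

# Hypothetical word dictionary: word -> pronunciation as space-separated phonemes.
WORD_DICTIONARY = {
    "hello": "h eh l ow",
    "yellow": "y eh l ow",
    "world": "w er l d",
}

def similarity(phonemes_a, phonemes_b):
    """Similarity in [0, 1] between two phoneme strings (assumed measure)."""
    return SequenceMatcher(None, phonemes_a.split(), phonemes_b.split()).ratio()

def recognize(phonemes, dictionary):
    """Return the dictionary word whose stored pronunciation is most similar."""
    return max(dictionary, key=lambda word: similarity(phonemes, dictionary[word]))

in_dictionary = recognize("h eh l ow", WORD_DICTIONARY)
# "mellow" is not in the dictionary, so the recognizer can only return
# some other, registered word: the unregistered word cannot be recognized.
out_of_dictionary = recognize("m eh l ow", WORD_DICTIONARY)
```

Because the search only ranges over registered entries, an unregistered word is always mapped to the nearest registered one, which is the failure described above.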

Moreover, even a word that is already defined in the dictionary is recognized less reliably if it is not sufficiently reflected in the language model.

Frequently used words differ from user to user. For example, a person who often writes engineering papers uses many engineering terms that other people rarely use, and a person engaged in commerce often uses commercial terms.

Defining all such words in advance wastes storage for users who do not need them, and as the number of words grows, recognition time increases and recognition performance degrades.

Therefore, only the most basic words should be defined initially, and further words should be defined as each user needs them.

The present invention is intended to solve these problems. Its object is to provide a voice recognition method for a voice typing system that can continuously improve recognition performance by adding, during use, words that are not registered in the dictionary.

FIG. 1 is a diagram schematically showing the voice recognition process of a voice typing system according to the prior art.

FIG. 2 is a diagram schematically showing the voice recognition process of a voice typing system according to the present invention.

FIGS. 3 to 7 are flowcharts showing voice recognition methods according to embodiments of the present invention.

A feature of the voice recognition method of the voice typing system according to the present invention is that, after voice typing is finished and the user has corrected the misrecognized words, new words not previously defined in the dictionary are registered in the dictionary and used in subsequent recognition, so that recognition performance improves continuously.

Another feature of the present invention is that, after voice typing is finished and the user has corrected the misrecognized words, new words not previously defined in the dictionary are registered in the dictionary and the language model is retrained, and both are used in subsequent recognition, so that recognition performance improves continuously.

The voice recognition method of the voice typing system according to the present invention, having the above features, is described below with reference to the accompanying drawings.

FIG. 2 schematically shows the voice recognition process of the voice typing system according to the present invention. As shown in FIG. 2, the voice of a sentence or word to be typed is first input through a microphone and stored in the computer as binary data, and signal processing then extracts the feature parameters needed for recognition.

Next, the feature parameters are used to find the word most similar to a pronunciation defined in the word dictionary. Once the voice recognition result is available, the user corrects any misrecognized words.

If a new word is found through the user's correction, it is registered in the word dictionary and the language model is updated accordingly, so that it is used in subsequent recognition and recognition performance improves continuously.

Correction by the user after recognition can be carried out in several ways, each corresponding to an embodiment.

Each embodiment is described below.

First embodiment

In the first embodiment, as shown in FIG. 3, when a voice is input through a microphone (101), the feature parameters of the input voice are extracted (102), and the word most similar to a pronunciation defined in the word dictionary is selected (103) to complete voice recognition (104).

After voice recognition is completed, the user corrects the misrecognized words (105). If a new word is found during correction, the new word and its pronunciation (phonetic symbols or a phoneme string) are registered in the word dictionary (106).
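Steps 105 and 106 can be sketched as below. This is a minimal illustration assuming the dictionary is a plain mapping from word to phoneme string and that a pronunciation for each new word is supplied from elsewhere (for example, entered by the user); the source does not say how the pronunciation is obtained, and all names here are hypothetical.

```python
def find_new_words(corrected_words, dictionary):
    """Return corrected words absent from the dictionary, in order, without repeats."""
    return [w for w in dict.fromkeys(corrected_words) if w not in dictionary]

def register_new_words(corrected_words, pronunciations, dictionary):
    """Step 106: register each new word together with its pronunciation.

    `pronunciations` maps each new word to a phoneme string and is an
    assumed input for this example.
    """
    added = find_new_words(corrected_words, dictionary)
    for word in added:
        dictionary[word] = pronunciations[word]
    return added

# Hypothetical data: the user's corrected text contains the unregistered word "invoice".
dictionary = {"send": "s eh n d", "the": "dh ah"}
corrected = ["send", "the", "invoice"]
added = register_new_words(corrected, {"invoice": "ih n v oy s"}, dictionary)
```

After this call the dictionary contains the new entry, so the next recognition pass can select it.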

Second embodiment

In the second embodiment, as shown in FIG. 4, if a new word is found during the user's correction (105) in the process of the first embodiment, the user is asked whether to define the new word and its pronunciation in the word dictionary, and chooses whether to add it (107).

If the user judges the new word to be unnecessary, it is not registered in the word dictionary; if the word is needed, it is registered in the word dictionary (106).
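The confirmation step (107) can be sketched by passing in a callback that stands in for the question shown to the user. The callback, the function name, and the data are assumptions made for this example; in a real system the callback would be a dialog or prompt.

```python
def register_with_confirmation(new_words, dictionary, confirm):
    """Register only the new words the user approves.

    new_words: mapping of word -> pronunciation found during correction (105).
    confirm:   callable taking a word and returning True to register it (107).
    """
    registered = []
    for word, pronunciation in new_words.items():
        if word in dictionary:
            continue  # already registered, nothing to ask
        if confirm(word):
            dictionary[word] = pronunciation  # step 106
            registered.append(word)
    return registered

# Hypothetical user who keeps technical terms and rejects a one-off name.
dictionary = {"the": "dh ah"}
new_words = {"transistor": "t r ae n z ih s t er", "zorblat": "z ao r b l ae t"}
registered = register_with_confirmation(
    new_words, dictionary, confirm=lambda word: word == "transistor"
)
```

Keeping the question behind a callback lets the same registration logic run with or without user interaction.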

Third embodiment

In the third embodiment, as shown in FIG. 5, if a new word is found during the user's correction (105) in the process of the first embodiment, the new word and its pronunciation are defined in the word dictionary (106), and the language model is updated and retrained (108).
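The source does not specify the form of the language model. As one possible reading, the sketch below assumes a simple bigram count model and treats "retraining" (108) as adding counts from the corrected sentence; the class and its methods are hypothetical.

```python
from collections import Counter

class BigramModel:
    """Toy bigram language model (an assumed stand-in for the language model)."""

    def __init__(self):
        self.contexts = Counter()  # how often each word occurs as a predecessor
        self.bigrams = Counter()   # how often each (previous, word) pair occurs

    def update(self, sentence):
        """Step 108: fold one corrected sentence (list of words) into the counts."""
        words = ["<s>"] + list(sentence)
        self.contexts.update(words[:-1])
        self.bigrams.update(zip(words, words[1:]))

    def probability(self, previous, word):
        """Relative-frequency estimate of P(word | previous); 0.0 if unseen context."""
        if self.contexts[previous] == 0:
            return 0.0
        return self.bigrams[(previous, word)] / self.contexts[previous]

model = BigramModel()
model.update(["send", "the", "report"])
before = model.probability("the", "invoice")  # new word not yet reflected
model.update(["send", "the", "invoice"])      # corrected sentence with the new word
after = model.probability("the", "invoice")
```

The new word only gains a non-zero probability once a corrected sentence containing it has been folded into the counts, which is why registering the word alone may not be enough.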

Fourth embodiment

In the fourth embodiment, as shown in FIG. 6, if a new word is found during the user's correction (105) in the process of the first embodiment, the new word and its pronunciation are defined in the word dictionary (106), and the user is asked whether to update the language model and chooses whether to do so (109).

As in the second embodiment, the choice rests with the user: if the user does not need the language model updated, no update is performed; if an update is needed, the language model is updated and retrained (108).

Fifth embodiment

In the fifth embodiment, as shown in FIG. 7, if a new word is found during the user's correction (105) in the process of the first embodiment, the user is asked whether to define the new word and its pronunciation in the word dictionary, and chooses whether to add it (107).

The user is then asked whether to update the language model for the newly registered word, and chooses whether to do so (109).

If the user does not need the language model updated, no update is performed; if an update is needed, the language model is updated and retrained (108).
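The five embodiments differ only in which questions are asked and whether the language model is retrained, so they can be summarized as three flags. The sketch below is an illustration under stated assumptions: the flag names, the `retrain` and `confirm` callbacks, and the choice to skip the language model question when registration is declined are all assumptions of this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Embodiment:
    ask_before_register: bool    # step 107
    update_language_model: bool  # step 108
    ask_before_update: bool      # step 109

EMBODIMENTS = {
    1: Embodiment(False, False, False),
    2: Embodiment(True, False, False),
    3: Embodiment(False, True, False),
    4: Embodiment(False, True, True),
    5: Embodiment(True, True, True),
}

def handle_new_word(word, pronunciation, dictionary, retrain, mode, confirm):
    """Process one new word found during correction (105).

    Returns (registered, retrained). `retrain` and `confirm` are
    hypothetical callbacks for step 108 and the user questions.
    """
    if mode.ask_before_register and not confirm(f"Register '{word}'?"):
        return (False, False)  # assumption: no model question for an unregistered word
    dictionary[word] = pronunciation  # step 106
    if not mode.update_language_model:
        return (True, False)
    if mode.ask_before_update and not confirm(f"Update language model for '{word}'?"):
        return (True, False)
    retrain(word)  # step 108
    return (True, True)

# Hypothetical run of the fifth embodiment with a user who answers yes twice.
dictionary, retrained_words = {}, []
result = handle_new_word(
    "invoice", "ih n v oy s", dictionary, retrained_words.append,
    EMBODIMENTS[5], confirm=lambda question: True,
)
```

Expressing the variants as data keeps a single control flow and makes the differences between FIGS. 3 to 7 easy to compare.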

With the above embodiments, recognition performance can be continuously improved by adding, during use, words that are not yet registered.

The voice recognition method of the voice typing system according to the present invention has the following effect.

By continuously registering only the words that the user needs, the voice recognition method of the present invention continuously improves recognition performance and shortens recognition time.

Claims

A voice recognition method of a voice typing system, in which the features of an input voice are extracted when the voice is input through a microphone and the word most similar to a pronunciation defined in a word dictionary is selected to complete voice recognition, the method comprising:

a first step of correcting a misrecognized word after voice recognition is completed; and

a second step of registering a new word and its pronunciation in the word dictionary at the time of correction.

The method of claim 1, further comprising: asking the user whether to register the new word and its pronunciation in the word dictionary when a new word is found after the first step.