KR19990010211A - Apparatus and method for character recognition using speech synthesis

Info

Publication number: KR19990010211A
Application number: KR1019970032909A
Authority: KR
Inventors: 김경희
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1997-07-15
Filing date: 1997-07-15
Publication date: 1999-02-05

Abstract

The present invention relates to a character recognition apparatus and method, and more particularly to an online character recognition apparatus using speech synthesis. To this end, the apparatus includes: a character input unit that detects characters; a preprocessor that generates feature values from the character information produced by the character input unit; a character recognition unit that converts the character recognized from those feature values into a corresponding code; a speech synthesis unit that identifies the phonemes of each code generated by the character recognition unit and combines the speech signals for those phonemes to generate a speech signal corresponding to the character; and an output unit that outputs the character and the speech signal from the character recognition unit and the speech synthesis unit to a screen and a speaker, respectively. Because the recognition result is output on the screen and as speech at the same time, a misrecognized character can be found and re-entered quickly.

Description

Apparatus and method for character recognition using speech synthesis

The present invention relates to a character recognition apparatus and method, and more particularly to an online character recognition apparatus and method using speech synthesis. Character recognition is generally classified into printed character recognition and handwritten character recognition, depending on whether the target characters are printed or handwritten. Handwritten character recognition is further divided into online recognition and offline recognition, depending on how the character image information is obtained. FIG. 1 is a block diagram of a general online character recognition apparatus, which consists of a character input unit 100 through which a document is entered, a preprocessor 110, and a character recognition unit 120 that outputs a text file. The character input unit 100 detects character input. For example, when characters are written with an input tool on a tablet or a liquid crystal display (LCD), the unit stores that information. The preprocessor 110 uses the input information to generate the feature values used by the recognizer. The character recognition unit 120 uses the generated feature values to output the recognized character as the corresponding code.

FIG. 2 is an external view of the apparatus of FIG. 1, in which 210 is a character output unit, 220 is a character input unit, and 230 is a pen for character input. The apparatus of FIG. 2 consists of a character input area 220, where characters are entered online, and an output screen 210, which displays the result. Character input and recognition run concurrently, and the recognition result is displayed as text in a specific area of the screen 210. With the apparatus of FIGS. 1 and 2, the user's eyes stay on the input area 220 while several characters are being entered, so the user cannot watch the output text at the same time. For example, suppose a writer intends to enter the phrase "문자 인식 기술" ("character recognition technology") and the character "식" is misrecognized as "시". The writer sees the "시" shown on the character output unit 210 only after all input is finished, so correction is possible only at that point, and the wrong character may even be overlooked. The problem is therefore that the user must review the recognition result after all input is complete, find the misrecognized characters, and enter them again.

One technical object of the present invention is to provide an apparatus that processes character recognition in real time using speech synthesis, so that a misrecognized character can be re-entered or corrected quickly.

Another technical object of the present invention is to provide a method that processes character recognition in real time using speech synthesis, so that a misrecognized character can be re-entered or corrected quickly.

FIG. 1 is a block diagram showing a general online character recognition apparatus.

FIG. 2 is an external view of the apparatus of FIG. 1.

FIG. 3 is a block diagram showing a character recognition apparatus using speech synthesis according to the present invention.

FIG. 4 is an external view of the apparatus of FIG. 3.

FIG. 5 is a flowchart showing a character recognition method using the apparatus of FIG. 3.

To achieve the first object, the present invention provides a character recognition apparatus using speech synthesis, comprising: a character input unit that detects a character; a preprocessor that generates feature values from the character information produced by the character input unit; a character recognition unit that converts the character recognized from the feature values generated by the preprocessor into a corresponding code; a speech synthesis unit that identifies the phonemes of each code generated by the character recognition unit and combines the speech signals for those phonemes to generate a speech signal corresponding to the character; and an output unit that outputs the character and the speech signal from the character recognition unit and the speech synthesis unit to a screen and a speaker, respectively.

To achieve the other object, the present invention provides a character recognition method using speech synthesis, comprising: determining whether a character has been input; if a character has been input, converting the character into the corresponding character code; and, if the converted code is a character code of a predetermined form, outputting the corresponding font at a predetermined position on the screen and synthesizing speech from the character code to output it as a speech signal.
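The three steps of the method can be sketched as a single function. This is a minimal illustration under stated assumptions, not the patented implementation: the names `process_input`, `recognize`, `valid_codes`, `show`, and `speak` are hypothetical, the recognizer is a stub supplied by the caller, and the "predetermined form" check is modeled as simple set membership.

```python
from typing import Callable, Optional, Sequence, Set


def process_input(
    strokes: Optional[Sequence],
    recognize: Callable[[Sequence], str],
    valid_codes: Set[str],
    show: Callable[[str], None],
    speak: Callable[[str], None],
) -> Optional[str]:
    """Run the three method steps once for a single handwritten character.

    All parameter names are hypothetical; the source does not define an API.
    """
    # Step 1: determine whether a character was input at all.
    if not strokes:
        return None
    # Step 2: convert the input into a character code.
    code = recognize(strokes)
    # Step 3: only a code of the predetermined form is displayed and spoken.
    if code not in valid_codes:
        return None
    show(code)   # font output at a screen position
    speak(code)  # speech synthesized from the same code
    return code


# Usage with a stub recognizer that always answers "시".
shown, spoken = [], []
process_input([(0, 0), (3, 4)], lambda pts: "시", {"시", "식"},
              shown.append, spoken.append)
```

Passing the display and speech functions in as callables keeps the sketch independent of any particular screen or audio hardware.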

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 3 is a block diagram showing a character recognition apparatus using speech synthesis according to the present invention. The apparatus consists of a character input unit 300, a preprocessor 310, a character recognition unit 320, a speech synthesis unit 340, and an output unit 350.

FIG. 4 is an external view of the apparatus of FIG. 3, in which 410 is a character output unit, 420 is a speech output unit or speaker, 430 is a character input unit, and 440 is a pen for character input.

The operation and effects of the present invention are described below with reference to FIGS. 3 to 5.

First, the user writes the desired characters on the character input unit 300 (step 510). That is, as shown in FIG. 4, the user writes the characters "문자 인시" in the input screen area 430 with the pen 440. When input of one character is complete (step 520), the preprocessor 310 uses the input character information to generate the feature values used by the recognizer. The character recognition unit 320 uses the generated feature values to convert the recognized character into the corresponding code (text) (step 530). Template (prototype) matching, statistical methods, and structural methods are the commonly used recognition methods.

If the character recognition unit 320 determines that the character code is one of the 2,350 characters of the Hangul Wansung (precomposed) code (step 540), it outputs the font for that code at the appropriate position on the screen 410 of FIG. 4. The text "문자 인시" therefore appears on the screen 410 of FIG. 4, which belongs to the output unit 350. At the same time, the speech synthesis unit 340 analyzes the character code output by the character recognition unit, identifies the phonemes that make up the character, generates a signal for each phoneme, combines those signals to synthesize the speech signal for the character (step 550), and outputs it through the speaker 420.

Speech synthesis is a known technique for converting ordinary text into speech by computer. It consists of a language processing stage that analyzes the grammatical structure of the text, a prosody generation stage that produces human-like prosody from the analyzed structure, and a waveform synthesis stage that assembles basic units from a stored speech database according to the generated prosody to produce synthesized speech. As applied in the present invention, the language processing stage finds the phonemes that make up a character, the prosody generation stage generates the prosody for those phonemes, and the waveform synthesis stage combines the phonemes to synthesize a waveform suitable for one syllable. As a character is entered on the character input screen 430 of FIG. 4, the character and the corresponding speech signal are output simultaneously on the screen 410 and through the speaker 420 of the output unit 350 (step 560).

For example, suppose the user enters "문자 인식 기술" on the character input unit 430 one character at a time. Once "문자 인" has been entered, the speech synthesis unit 340 has output "문자인". If the next character, "식", is then entered and misrecognized as "시", the speech synthesis unit 340 outputs "시". The user immediately notices that "식" was misrecognized and can correct it before entering the next character. All of these operations complete after one character has been entered and before input of the next character finishes, so character input and output of the recognition result take place in real time.
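The phoneme identification and waveform combination described above can be sketched as follows. The decomposition uses the standard Unicode arithmetic for precomposed Hangul syllables (base U+AC00, 21 medial vowels, 28 final-consonant slots). Several points are assumptions not found in the source: Hangul letters (jamo) stand in for phonemes, `unit_db` is a hypothetical table of stored waveform units, the prosody stage is omitted, and units are joined by plain concatenation.

```python
from typing import Dict, List

# Compatibility jamo in Unicode syllable-composition order.
CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                # 19 initials
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")            # 21 medials
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals

HANGUL_BASE = 0xAC00
HANGUL_LAST = 0xD7A3


def decompose(syllable: str) -> List[str]:
    """Split one precomposed Hangul syllable into its jamo."""
    cp = ord(syllable)
    if not HANGUL_BASE <= cp <= HANGUL_LAST:
        raise ValueError(f"not a precomposed Hangul syllable: {syllable!r}")
    index = cp - HANGUL_BASE
    initial, rest = divmod(index, 21 * 28)
    medial, final = divmod(rest, 28)
    jamo = [CHOSEONG[initial], JUNGSEONG[medial]]
    if final:  # slot 0 means "no final consonant"
        jamo.append(JONGSEONG[final])
    return jamo


def synthesize(syllable: str, unit_db: Dict[str, List[float]]) -> List[float]:
    """Concatenate the stored units for each jamo (prosody is ignored here)."""
    samples: List[float] = []
    for jamo in decompose(syllable):
        samples.extend(unit_db[jamo])
    return samples


# The intended "식" and the misrecognized "시" differ by the final "ㄱ",
# which is the difference the user hears.
intended = decompose("식")
misread = decompose("시")
```

A real synthesizer would shape duration and pitch in the prosody stage and smooth the joins between units; the sketch only shows how one syllable maps to a sequence of stored units.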

The present invention is not limited to the embodiment described above, and those skilled in the art may of course modify it within the spirit of the invention. For example, the character recognition and speech synthesis of the present invention can be used with the online character input device (for example, an LCD or a digitizer) and the speaker attached to a PDA (personal digital assistant) or an HPC (handheld personal computer).

As described above, according to the present invention the character recognition apparatus outputs the recognition result on the screen and as speech at the same time, so a misrecognized character can be found and re-entered quickly.

Claims

A character recognition apparatus using speech synthesis, comprising:

a character input unit that detects a character;

a preprocessor that generates feature values from character information produced by the character input unit;

a character recognition unit that converts the character recognized from the feature values generated by the preprocessor into a corresponding code;

a speech synthesis unit that identifies the phonemes of each code generated by the character recognition unit and combines the speech signals for those phonemes to generate a speech signal corresponding to the character; and

an output unit that outputs the character and the speech signal from the character recognition unit and the speech synthesis unit to a screen and a speaker, respectively.

A character recognition method using speech synthesis, comprising:

a first step of determining whether a character has been input;

a second step of converting the character into a corresponding character code if a character was input in the first step; and

a third step of outputting the corresponding font at a predetermined position on the screen if the code converted in the second step is a character code of a predetermined form, and synthesizing speech from the character code to output it as a speech signal.

The method of claim 2, wherein the speech synthesis of the third step comprises:

a language processing step of finding the phonemes that make up the character;

a prosody generation step of generating prosody corresponding to the phonemes found in the language processing step; and

a waveform synthesis step of combining the phonemes to synthesize a waveform suitable for one syllable.