KR20040023498A

KR20040023498A - Device and method for recognizing character image in picture screen

Info

Publication number: KR20040023498A
Application number: KR1020030053137A
Authority: KR
Inventors: 임채환; 서정욱
Original assignee: 삼성전자주식회사
Priority date: 2002-09-11
Filing date: 2003-07-31
Publication date: 2004-03-18
Also published as: KR100593986B1; DE60322486D1

Abstract

PURPOSE: An apparatus and method are provided for recognizing a character image contained in an image screen by means of a device having an image processing function. CONSTITUTION: A controller (101) controls the overall operation of a portable terminal that recognizes a document. A memory (103) stores a program for controlling the operation of the portable terminal and temporarily stores data generated while the program runs. A camera (107) photographs an image of a document. An image processor (109) converts the image screen captured by the camera (107) into digital data. An audio processor (111) processes voice input used to correct text in which an error occurred during program execution. An input unit (113), which is a touch screen module, can be implemented as a single unit with a display unit (115). The input unit (113) generates the desired character and function key inputs by means of a stylus pen. A key input unit (105) comprises function keys for setting the various functions of the portable terminal.

Description

DEVICE AND METHOD FOR RECOGNIZING CHARACTER IMAGE IN PICTURE SCREEN

The present invention relates to an apparatus and method for character recognition, and more particularly to an apparatus and method for recognizing a character image included in an image screen.

Portable terminal devices are currently evolving toward structures capable of high-speed data transmission. In particular, once a mobile communication network conforming to the IMT-2000 standard is deployed, high-speed data communication can be carried out with such small portable terminal devices. The data that a portable terminal device performing this data communication can process may be packet data and image data.

However, because such a portable terminal device uses a limited keypad for entering information, entering characters is complicated. That is, the portable terminal device uses a soft keyboard input device, so character input is slow and very cumbersome. A character and/or voice recognition device can be used to overcome the shortcomings of the soft keyboard. Even with a handwritten character recognition device, however, sentence recognition and input remain slow, and a voice recognition device can recognize only a limited vocabulary. A separate hardware keyboard can be used for character input instead, but this approach requires an additional device for the portable terminal device.

There is also a trend toward adding an image processing function to portable terminal devices to give them combined functionality. Such a portable image processing device includes a camera that captures images and a display unit that displays the image signal captured by the camera. The camera may use a CCD or CMOS sensor, and the display unit may use an LCD. As camera devices become smaller, image capturing devices are also becoming increasingly compact. The portable terminal device can capture an image screen and display it as a moving picture or a still picture, and can also transmit the captured screen. However, a portable terminal device equipped with a camera performs only the simple functions of capturing, storing, managing, and transmitting image screens.

The portable terminal device may be a mobile phone and/or a PDA. The only character input methods of a conventional PDA terminal are input on a soft keypad with a stylus pen and input through handwriting recognition. When a large amount of text must be entered, these conventional methods cause users great inconvenience because they are slow and cumbersome. In particular, entering the contents of a business card, which holds a large amount of information, into a PDA takes considerable time and effort. It is therefore essential to develop a new input method, or a method that improves user convenience.

Accordingly, an object of the present invention is to provide an apparatus and method capable of recognizing a character image included in an image screen, in a device having an image processing function.

Another object of the present invention is to provide an apparatus and method capable of recognizing a character image included in an image screen and storing it in a predetermined document format, in a device having an image processing function.

Still another object of the present invention is to provide an apparatus and method capable of recognizing a character image included in an image screen and correcting errors in the recognized character information, in a device having an image processing function.

Still another object of the present invention is to provide an apparatus and method capable of extracting a character image region included in an image screen and preprocessing it into a form that can be recognized, in a device having an image processing function.

Still another object of the present invention is to provide an apparatus and method capable of photographing a document with the camera of a terminal device equipped with a camera, recognizing the image of the photographed document as characters, and correcting errors in the recognized characters by means of a candidate character table.

Still another object of the present invention is to provide an apparatus and method capable of photographing a document with the camera of a terminal device equipped with a camera and a voice recognizer, recognizing the image of the photographed document as characters, and correcting erroneous recognized characters through the voice recognizer.

Still another object of the present invention is to provide an apparatus and method capable of photographing a document with the camera of a terminal device equipped with a camera, recognizing the image of the photographed document as characters, and correcting erroneous recognized characters by recognizing handwritten characters entered by the user.

Still another object of the present invention is to provide an apparatus and method capable of photographing a document with the camera of a terminal device equipped with a camera, recognizing the image of the photographed document as characters, and correcting erroneous recognized characters by means of a soft keypad.

Accordingly, a further object of the present invention is to provide an apparatus and method capable of photographing a document containing phone book information with the camera of a portable communication device equipped with a camera, and recognizing and storing the phone book information included in the image of the photographed document.

FIG. 1 is a diagram illustrating the configuration of an apparatus for document recognition according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a procedure for recognizing a document according to a first embodiment of the present invention.

FIG. 3 is a diagram illustrating the operating procedure of the document photographing process of FIG. 2.

FIG. 4 is a diagram illustrating the configuration of the preprocessor 121 in the image document processing apparatus according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating the configuration of the image blurring determination unit of FIG. 4, which determines whether an image screen is blurred.

FIG. 6 is a diagram illustrating the configuration of the block classification unit of FIG. 5.

FIG. 7 is a diagram illustrating the configuration of the character block energy calculation unit of FIG. 5.

FIG. 8 is a diagram illustrating the procedure by which the image blurring determination unit determines whether an image screen is blurred.

FIG. 9 is a diagram illustrating the configuration of the subject tilt correction unit of FIG. 4, which corrects the tilt of a subject in an image screen.

FIG. 10 is a diagram illustrating the configuration of the binarization unit of FIG. 9.

FIG. 11 is a diagram illustrating the detailed configuration of the block classification unit of FIG. 10.

FIG. 12 is a diagram for explaining the procedure by which the rotation angle determination unit of FIG. 9 calculates the rotation angle of a stripe.

FIG. 13 is a diagram illustrating the procedure by which the subject tilt correction unit corrects the tilt of a subject in an image screen.

FIG. 14 is a diagram illustrating the configuration of the image region expansion unit of FIG. 4, which expands the character region in an image screen.

FIG. 15 is a diagram illustrating the configuration of the block classification unit of FIG. 14.

FIG. 16 is a diagram for explaining the procedure by which the image region expansion unit expands the character region.

FIG. 17A is a diagram illustrating neighboring pixels in the noise removal unit of FIG. 4, and FIG. 17B is a diagram illustrating the four directions of the center pixel in the noise removal unit.

FIGS. 18A to 18D are diagrams illustrating the pixel pairs of each directional region in the noise removal unit of FIG. 4.

FIG. 19 is a diagram illustrating the configuration of the image binarization unit of FIG. 4.

FIG. 20 is a diagram illustrating the detailed configuration of the block classification unit of FIG. 19.

FIG. 21 is a diagram illustrating the detailed configuration of the edge enhancement unit of FIG. 19.

FIG. 22 is a diagram for explaining how the edge enhancement unit enhances the edges of a character block.

FIG. 23 is a diagram for explaining the procedure by which an image binarization unit using a quadratic filter binarizes an image screen.

FIGS. 24A and 24B are diagrams illustrating the operating procedure of the character recognition and storage item selection process of FIG. 2.

FIGS. 25A and 25B are diagrams illustrating the operating procedure of the error correction process of FIG. 2.

FIGS. 26A to 26E are diagrams illustrating the state of the display unit during the document photographing process.

FIGS. 27A and 27B are diagrams illustrating the state of the display unit during the character recognition and storage item selection process.

FIGS. 28A to 28D are diagrams illustrating the state of the display unit during the error correction process.

FIGS. 29A and 29B are diagrams illustrating the state of the display unit after error correction.

FIG. 30 is a diagram illustrating a procedure for recognizing a document according to a second embodiment of the present invention.

FIG. 31 is a diagram illustrating the operating procedure of the document photographing process of FIG. 30.

FIG. 32 is a diagram illustrating the operating procedure of the character recognition, storage item selection, and storage process of FIG. 30.

FIG. 33 is a diagram illustrating the detailed operating procedure of the storage item selection process of FIG. 32.

FIGS. 34A to 34D are diagrams illustrating the operating procedure of the error correction process of FIG. 30.

Preferred embodiments of the present invention will now be described in detail with reference to the accompanying drawings. Note that, wherever possible, identical components are denoted by the same reference numerals throughout the drawings.

In the following description, specific details such as a business card, a PDA, and the size of an image screen are given to provide a more thorough understanding of the present invention. It will be apparent to those of ordinary skill in the art that the present invention can readily be practiced without these specific details, or with modifications of them.

In an embodiment of the present invention, a terminal device having an image processing function recognizes a character image included in an image screen and stores it as a document. That is, when recognizing and documenting the character image of an image screen, the embodiment improves the usability of the user's character input, minimizes the user's operation of the input device, allows characters misrecognized during character recognition to be corrected easily by voice recognition, and makes it possible to enter large amounts of text.

To this end, the terminal device according to an embodiment of the present invention has an image preprocessing function that preprocesses the character image in an image screen before it is recognized, a recognition function that recognizes the character image in the preprocessed image screen, and a function for correcting misrecognized character information among the recognized character information. To correct the misrecognized character information, the terminal device may have a correction user interface that provides a voice recognition function for correcting a misrecognized character by voice, a handwriting recognition function for correcting a misrecognized character from the user's handwritten input, a function that displays candidate characters similar to the misrecognized character so that one can be selected, and/or a function for entering the correct character on a soft keypad.

As described above, the terminal device according to an embodiment of the present invention has these components, recognizes the character image included in an image screen, and edits and stores it as a document. The document in the image screen may or may not have a fixed format. The terminal device may be a device that has a camera, photographs the document to be recognized with the camera, and recognizes the character image included in the captured image screen. The terminal device may also be a device with a communication function that recognizes the character image included in a received image screen and stores it as a document. Alternatively, the terminal device may have an external input device, store an image screen supplied by the external input device, and then recognize the character image included in the stored image screen and store it as a document.

To implement these features, the camera of the terminal device is preferably one that supports fine focus adjustment. This improves the resolution of the document image to be recognized.

As described above, the image preprocessing function for character recognition requires support in both the hardware and the software specifications. On the hardware side, fine focus adjustment of the captured image must be supported; the display rate must be high enough (at least 12 fps) to confirm the optimal focus state during focusing; the screen must be as large as possible to confirm the optimal focus state during focusing; and a high-quality lens must be provided to obtain the best image quality for character recognition. On the software side, preprocessing must be able to restore the circular distortion that a pinhole lens introduces into the camera image, restore the defocus distortion of the target image caused by close-up photography, judge whether the character size and focus are suitable for character recognition, restore the image distortion caused by non-perpendicular projection of the object, and binarize the characters of the object under complex lighting conditions.

A character recognition function must also be added to recognize the image of the document photographed by the camera as characters. For this purpose, a light character recognition engine must be developed; the engine must be smaller than a predetermined amount of data (engine size < 5 Mbytes); the characters to be recognized must be printed English, Korean, and numerals in a variety of fonts; and a minimum recognition rate (at least 80% per character) must be ensured. It is also preferable to include a voice recognition module so that erroneous characters can be corrected by voice during error correction, and a text input user interface based on character recognition and voice recognition must be implemented.
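The per-character recognition requirement stated above can be checked against a known transcription. The following is only a minimal sketch: the function names are hypothetical, and it assumes the recognized text has the same length as the reference text so that characters can be compared position by position, which the source does not specify.

```python
MIN_RATE = 0.80  # minimum per-character recognition rate stated in the text


def recognition_rate(expected: str, recognized: str) -> float:
    """Return the fraction of character positions recognized correctly.

    Assumption: both strings have the same length, so there are no
    inserted or dropped characters to align.
    """
    if len(expected) != len(recognized):
        raise ValueError("strings must have the same length")
    if not expected:
        raise ValueError("expected text must not be empty")
    correct = sum(1 for e, r in zip(expected, recognized) if e == r)
    return correct / len(expected)


def meets_minimum(expected: str, recognized: str) -> bool:
    """Check the recognized text against the 80% per-character minimum."""
    return recognition_rate(expected, recognized) >= MIN_RATE
```

A real evaluation would also have to handle missing or extra characters, for example with an edit-distance alignment.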

In the description that follows, the terminal device according to an embodiment of the present invention is assumed to be a PDA, and the photographed document is assumed to be a business card. The example described is one in which an image of the business card is captured, the captured image is preprocessed to extract the character image, the extracted character image is recognized and converted into character data, errors in the recognized character data are corrected, and the result is stored in a phone book.

Against this background, the embodiment of the present invention proposes the following scheme, which uses various input devices (a character recognizer, a voice recognizer, a pen, and a keyboard) so that a document containing a large amount of information, such as a business card, can easily be entered into a PDA.

First, a business card or document is photographed with the camera built into the PDA. The preprocessor then sharpens the character image in the image screen, and the character recognizer recognizes the preprocessed character images and converts them into character data. Erroneous parts of the converted content are corrected by various means, such as a stylus pen, voice recognition, handwriting recognition, or a soft keypad, and the result is stored in the desired area of the database.
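The order of these steps can be sketched as a chain of stages. This is an illustrative outline only: the function name, the callable parameters, and the dictionary used as a phone book are assumptions made for the example, not elements of the described apparatus.

```python
from typing import Any, Callable, Dict


def capture_to_phone_book(
    image: Any,
    preprocess: Callable[[Any], Any],
    recognize: Callable[[Any], str],
    correct: Callable[[str], str],
    phone_book: Dict[str, str],
    item: str,
) -> str:
    """Run preprocessing, recognition, and correction, then store the text.

    Each stage is passed in as a callable so that the sketch stays
    independent of any particular recognizer (hypothetical design choice).
    """
    cleaned = preprocess(image)   # sharpen the character image
    text = recognize(cleaned)     # convert the character image to character data
    text = correct(text)          # fix misrecognized characters
    phone_book[item] = text       # store under the selected item
    return text
```

Passing the stages as parameters mirrors the text's point that correction may come from any of several input means.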

FIG. 1 is a diagram illustrating the configuration of a portable terminal device that recognizes a character image included in an image screen according to an embodiment of the present invention.

Referring to FIG. 1, the controller 101 controls the overall operation of the portable terminal device that recognizes documents. The memory 103 stores the program that controls the operation of the portable terminal device and temporarily stores data generated while the program runs.

The camera 107 photographs an image of a document, which may be a business card. The camera may be one that can perform a preprocessing function; that is, it can adjust focus and distance and thereby improve the quality of the captured image screen. The image processor 109 may convert the image screen captured by the camera 107 into digital data and compression-encode it. The image processor 109 may be the image processor of Korean Patent Application No. 2002-22844, previously filed by the present applicant.

The audio processor 111 processes voice input for correcting characters in which an error occurred during program execution, and also processes voice output that presents guidance and results during program execution. The input unit 113 is a touch screen module and may be implemented integrally with the display unit 115.

Using a stylus pen, the input unit 113 can generate the desired character and function key inputs. According to an embodiment of the present invention, the input unit 113 includes a photographing key, a document recognition key, a confirmation key, a correction key, a completion key, an insertion key, a deletion key, and the like for recognizing a photographed document. The photographing key stores the displayed image. The document recognition key recognizes the character image of the image screen currently displayed. If the documents to be recognized each have their own format, a separate recognition key may be provided for each format. For example, a document in which specific information is recorded, such as a business card, can be used to build the phone book of the portable terminal device. In this case, if a business card recognition key is provided and the items used to select the information common to business cards are stored as a table, the phone book can be built easily. The confirmation key registers the character data of the selected item. The correction key corrects the character data of the selected item. The insertion key adds a character at the cursor position when the cursor is at a particular position in the selected text; that is, when one or more characters are missing from the recognized text, it inserts a new character in front of the cursor. The deletion key deletes the character data of the selected item. The completion key ends the current operation.
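The behavior of the insertion and deletion keys can be illustrated with a small text buffer. This is a sketch under assumptions: the class name `ItemEditor` and its attributes are hypothetical, and the cursor is modeled as an index into the selected item's text.

```python
class ItemEditor:
    """Holds the character data of one selected item (hypothetical model)."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.cursor = len(text)  # index in front of which new characters go

    def insert(self, ch: str) -> None:
        """Insertion key: add a character in front of the cursor."""
        self.text = self.text[: self.cursor] + ch + self.text[self.cursor :]
        self.cursor += len(ch)

    def delete(self) -> None:
        """Deletion key: delete the character data of the selected item."""
        self.text = ""
        self.cursor = 0
```

For example, if recognition drops a character and yields "Samung", placing the cursor at index 3 and inserting "s" restores the missing character.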

The key input unit 105 includes function keys for setting the various functions of the portable terminal device. The function keys that may be placed on the key input unit 105 include a voice recognition key for driving the voice recognition unit 129, focus and distance adjustment keys for controlling the preprocessing operation of the camera 107, and a photographing key for storing the preview screen output by the camera 107. The input unit 113 may of course provide these keys as well. For convenience, the description of this embodiment assumes that all function keys are arranged on the input unit 113. The camera 107, the input unit 113, the audio processor 111, and the key input unit 105 can all operate as input devices.

The display unit 115 displays the state of the character recognition process performed according to an embodiment of the present invention. That is, the display unit 115 displays the image captured by the camera 107 as a preview screen, displays the recognition result in character recognition mode, and provides an area for displaying the result of error correction. The display unit 115 includes a first display area 71, a third display area 73, and a second display area 75. The first display area 71 displays the recognized character data. The third display area 73 displays the character data of the selected item or the candidate character data for error correction. The second display area 75 may be an area that displays item information or the handwritten characters entered for error correction, and/or a soft keypad for entering characters. Specific regions of the display areas 71 through 75 may also be suitably allocated to display menu information for entering commands during character recognition according to an embodiment of the present invention.

When the character recognition key is input from the input unit 113, the controller 101 drives the preprocessor 121 and the character recognition unit 123.

First, the preprocessor 121 receives the image screen displayed on the display unit 115 and performs the preprocessing operation. It determines whether the input image screen has a recognizable resolution or is a blurred picture, and reports the result to the controller 101. If the image screen is judged to be blurred, the controller 101 indicates on the display unit 115 that recognition is not possible. If it is not blurred, the preprocessor 121 checks whether the subject in the image screen is tilted and corrects the tilt, removes the regions of the image screen that contain no image and expands the region that does, removes noise from the image screen, and binarizes the pixels of the image screen for output. The preprocessor 121 may provide all of these functions (image blurring determination, subject tilt correction, image region expansion, noise removal, and image binarization) or only some of them.
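The control flow of the preprocessor, a blur check followed by whichever stages are present, can be sketched as follows. The names here are hypothetical, the image is modeled as rows of grayscale values, and the fixed-threshold `binarize` is only a stand-in for the binarization unit, whose actual method is not described in this passage.

```python
from typing import Callable, List, Sequence

Image = List[List[int]]  # grayscale rows, values 0-255 (assumed representation)


class BlurredImageError(Exception):
    """Raised when the image screen is judged too blurred to recognize."""


def binarize(image: Image, threshold: int = 128) -> Image:
    """Stand-in stage: map each pixel to 1 if it is >= threshold, else 0."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def preprocess(
    image: Image,
    is_blurred: Callable[[Image], bool],
    stages: Sequence[Callable[[Image], Image]],
) -> Image:
    """Reject blurred input, then apply the configured stages in order.

    `stages` may hold all of tilt correction, region expansion, noise
    removal, and binarization, or only some of them.
    """
    if is_blurred(image):
        raise BlurredImageError("recognition not possible")
    for stage in stages:
        image = stage(image)
    return image
```

Keeping the stages in a list reflects the statement that the preprocessor may include all of the functions or only a subset.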

Second, the character recognition unit 123 recognizes the character images in the image screen preprocessed by the preprocessor 121 and converts them into character data. The recognized character data is displayed in the first display area 71 of the display unit 115 under the control of the controller 101. The character recognition unit 123 may consist of a printed-character recognition module and a handwriting recognition module. The printed-character recognition module may be used to recognize character images in the image screen preprocessed by the preprocessor 121, and the handwriting recognition module may be used to recognize handwritten character images entered during error correction. The character recognition unit 123 may also include a module that converts soft key data entered through a soft keypad into characters.

When an error correction key is input from the input unit 113, the controller 101 drives the recognition error processor 125. The recognition error processor 125 corrects recognition errors by replacing a character selected from the character data displayed in the first display area 71 of the display unit 115 with the character corrected by the voice recognition unit 129 or the character recognition unit 123.

When the voice recognition key is received while the error correction key has been input, the controller 101 drives the voice recognition unit 129. The voice recognition unit 129 recognizes the voice received from the audio processor 111. The recognized voice may be a voice that selects the item to be corrected or a voice that corrects an erroneous character in the selected item. The voice recognition unit 129 converts the voice signal input to correct the erroneous character into character data. In the voice output mode, the voice synthesis unit 127 synthesizes the recognized character data into speech and outputs it under the control of the controller 101. In other words, under the control of the controller 101, the voice recognition unit 129 corrects errors by converting into character data the voice signal for correcting character data misrecognized during recognition, and the voice synthesis unit 127 synthesizes into speech and outputs the character data to be stored after recognition is complete.

The database 131 stores the recognized character data in association with each item under the control of the controller 101. When the recognized document is a business card, the database 131 may be a phone book or an address book memory. The user interface unit 133 interfaces user data connected to the terminal device with the portable terminal device.

As described above, the portable terminal device according to an embodiment of the present invention comprises a camera module, an input module (including a touch screen), an audio module, a preprocessing module, a character recognition module, a recognition error correction module, a voice recognition and synthesis module, a user interface module, and so on. The operation of a terminal device with this configuration consists broadly of six processes: image input, preprocessing, character recognition, storage item selection, error correction, and storage. These processes are organically linked and can each be implemented internally in many different ways. Briefly, the main modules used in each process are as follows: image input is performed by the camera module; preprocessing of the image screen by the preprocessing module; character recognition by the character recognition module and the voice recognition module; storage item selection by the voice recognition module and the input (stylus pen) module; error correction by the voice recognition module, the input (stylus pen) module, the handwriting recognition module, and the soft key recognition module; and storage by the database module.

The document recognition procedure can be implemented in several ways. The first embodiment of the present invention, shown in FIG. 2, sequentially photographs a document as an image screen, preprocesses the character images in the captured image screen, recognizes the character images in the preprocessed image screen, selects the items of the recognized characters, corrects errors, and then stores the result. The second embodiment photographs a document as an image screen, preprocesses the character images of the captured image screen, recognizes the preprocessed character images as character data, selects an item to correct, corrects and stores the misrecognized characters, and then repeats the operation by selecting the next item to correct. Although the first and second embodiments are described using the example of photographing the document to be recognized as an image screen, the photographing operation may be omitted. That is, the terminal device can also recognize a document image by performing the above procedure when a stored image screen or an image screen input from an external device is selected and the character recognition function is then chosen.

In the following description, error correction in the first embodiment is described as selecting the item to correct and correcting the error through the document recognizer, while error correction in the second embodiment is described as doing so through the document recognizer and the voice recognizer. However, the first embodiment may also use both the document recognizer and the voice recognizer to recognize and correct a document, and the second embodiment may use only the document recognizer.

First, the document recognition procedure according to the first embodiment of the present invention will be described.

FIG. 2 is a flowchart illustrating the document recognition procedure according to the first embodiment of the present invention.

Referring to FIG. 2, in step 200 the controller 101 photographs a document and generates an image screen of the document to be recognized. The image captured by the camera 107 is converted into digital data by the image processor 109 and displayed on the display unit 115. The subject may be captured as a moving picture or as a still image. When capturing a moving picture, the controller 101 displays the captured video signal on the display unit 115 as a preview screen; if a take-picture command is issued while the video signal is displayed on the display unit 115, the controller 101 displays the image screen shown on the display unit 115 as a still image and stores the displayed image in the image memory area of the memory 103. The image displayed on the display unit 115 may be an ordinary image screen or an image screen containing character images, such as a business card. In this embodiment of the present invention, the captured image screen is assumed to contain character images.

Step 200 may also be omitted. In that case, the user can display a stored image screen or an input image screen on the display unit 115. That is, to recognize a document, the user may select and display a stored or input image screen and perform the character recognition process while that image screen is displayed.

In this state, the document recognition procedure according to an embodiment of the present invention is performed when the user of the terminal device issues, through the input unit 113, a key command to recognize the character images contained in the currently displayed image screen. The key for recognition is assumed here to be the document recognition key. In response to the input of the document input key, the controller 101 drives the preprocessor 121 in step 210. The preprocessor 121 may consist of an image blur determination unit, a subject skew correction unit, an image region expansion unit, a noise removal unit, an image binarization unit, and so on. Its detailed operation is described with reference to FIG. 4.

When preprocessing of the image screen is complete, the preprocessed image screen is input to the character recognition unit 123, which recognizes the character images in it and converts them into character data. The character recognition unit 123 may use a recognizer suited to each language. In this embodiment of the present invention, FineReader 5.0 office trial version (company: ABBYY; mainly recognizes English) may be used for English characters, and GN2000 version (company: HIART corporation; recognizes Korean and English) may be used for Korean. The controller 101 then displays the character data recognized by the character recognition unit 123 in the first display area 71 of the display unit 115 and displays item information corresponding to the type of document input key in the second display area 75 of the display unit 115.

Next, when the user selects recognized character data displayed in the first display area 71 of the display unit 115 and an item displayed in the second display area 75, the controller 101 displays the selected character data and item in the third display area 73 of the display unit 115 in step 230. Storage items are selected so that only the desired items of the recognized document are stored. A business card, for example, has many items, such as name, mobile phone number, e-mail address, company address, company phone number, and fax number. The user can select and store only the desired ones.

Next, when a correction key is input, the controller 101 proceeds to step 240 and corrects erroneous characters among the recognized character data. In this correction method, a group of candidate characters for the misrecognized character is displayed, and when one of the candidates is selected, the controller 101 replaces the misrecognized character with the selected candidate. If the misrecognized character cannot be corrected from the candidates, the user enters the desired character by handwriting through the input unit 113, and the controller 101 drives the character recognition unit 123 to recognize the handwritten character and make the correction. In addition to the handwriting recognition module, a soft keypad may be provided, and the erroneous character may be corrected by analyzing the soft key data generated by the soft keypad.

The storage item selection of step 230 and the error correction of step 240 may be performed in the reverse order with the same effect.

When correction is complete, the controller 101 stores the corrected character data in the database 131 as the character data of the corresponding item.

FIG. 3 is a diagram illustrating the document photographing procedure performed in step 210 of FIG. 2.

Referring to FIG. 3, the user places the document to be recognized at a suitable position and starts capturing it with the camera 107 of the terminal device. The image screen captured by the camera 107 is processed by the image processor 109 and displayed on the display unit 115. If the user of the terminal device then inputs a camera adjustment key on the key input unit 105 (the input unit 113 may also be used), the controller 101 detects this in step 313 and controls the camera 107 in step 315. The camera 107 may be adjusted for distance and exposure. Distance may be adjusted by using the zoom function to adjust the distance between the subject and the terminal device, or by the user moving the terminal device. Exposure may be adjusted by controlling the exposure of the image sensor in the camera 107. These adjustments may be omitted, or only one of them may be used. The document may be photographed in its entirety, or only a desired part of it may be photographed. FIGS. 26A and 26B show an example in which the document is a business card and part of the card is photographed.

As described above, the document image captured under the distance and exposure adjustments of the camera 107 is displayed on the display unit 115 as shown in FIG. 26A. In this state, when the user presses the capture key of the input unit 113 with the stylus pen (or inputs the take-picture key of the key input unit 105), the controller 101 detects this as a take-picture request in step 317 and, in step 319, displays the document image at the moment the capture key was input as a still image on the display unit 115, as shown in FIG. 26C. If the document image displayed as in FIG. 26C is satisfactory, the user presses the save key displayed on the input unit 113 with the stylus pen. When the save key is input, the controller 101 detects this in step 321 and stores the displayed document image, together with its name, in the image memory area of the memory 103. While steps 321 and 323 are performed, the display unit 115 performs the display operations shown in FIGS. 26C to 26E. If the user instead inputs a cancel key, the controller 101 detects this in step 325, stops displaying the document image, and ends the procedure.

As described above, in step 210 of photographing the document, the user inputs the desired image through the camera, obtains a sharp image screen by fine-tuning the camera to raise the resolution of the input image, and stores it for character recognition. The user then confirms whether characters should be extracted from the captured image screen by character recognition and stored as character data (text), or whether the image should simply be stored as a picture.

Although the procedure described here obtains the image of the document to be recognized by photographing it with the camera, a stored document image or a document image input from outside may be used instead. When the user of the portable terminal device then requests recognition, preprocessing is performed in step 220 of FIG. 2, followed by recognition of the characters in the preprocessed image screen in step 230.

FIG. 4 is a diagram illustrating the configuration of the preprocessor 121 of FIG. 1.

Referring to FIG. 4, the input signal is an image screen signal, which may be an image signal generated by a camera, a scanner, a communication interface unit (including a modem, a network, and the like), or a computer. The input image screen may also be an image signal stored in the memory 103.

The image blur determination unit (decision on blurring of image part) 910 classifies the input image screen into character blocks and background blocks, calculates the average energy ratio of the classified character blocks, and compares it with a preset reference value to determine whether the image screen is blurred. If the image screen is judged to be a blurred image, the unit notifies the controller 101 to request that the image screen be input again; if it is a non-blurred image, the input image screen is passed to the subject skew correction unit 920. Accordingly, depending on the blur determination output by the image blur determination unit 910, the controller 101 either causes the image screen to be generated again or has the preprocessor 121 process it.

The subject skew correction unit (skew correction part) 920 first divides the input image screen into blocks of a predetermined size, classifies the divided blocks into character blocks and background blocks, and binarizes the pixels of each classified block. Second, it dilates the regions of the binarized character blocks to generate candidate stripes in which neighboring characters touch one another. Third, it classifies as stripes those candidate stripes whose length is at least a certain size. Fourth, it calculates the direction angle of each classified stripe, accumulates a count for each angle, and selects the direction angle with the largest count as the rotation angle of the skewed subject in the image screen. Fifth, it receives the image signal output from the input unit 10 and rotates it by the determined rotation angle to correct the skew of the subject in the image screen. Sixth, it fills the blank regions of the image screen that are left without pixels by the skew correction with specific pixels, generating an image screen of the same size as the input image screen.
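
The step of accumulating a count per stripe direction angle and taking the most frequent angle as the rotation angle can be illustrated as follows. This is only a sketch: the function name `dominant_skew_angle`, the 1-degree bin width, and the assumption that the stripe angles have already been measured in degrees are not taken from the source.

```python
from collections import Counter
from typing import Iterable


def dominant_skew_angle(stripe_angles_deg: Iterable[float],
                        bin_width: float = 1.0) -> float:
    """Return the most frequently occurring stripe direction angle.

    Each angle is quantized to the nearest multiple of bin_width (an assumed
    resolution), the occurrences of each quantized angle are counted, and the
    angle with the largest count is returned. With no stripes, 0.0 is
    returned, meaning no rotation.
    """
    counts = Counter(round(angle / bin_width) for angle in stripe_angles_deg)
    if not counts:
        return 0.0
    best_bin, _ = counts.most_common(1)[0]
    return best_bin * bin_width
```

Voting over many stripes makes the estimate robust to a few stripes whose direction differs from the dominant text direction.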

The image region expansion unit (ROC (region of contents) extension part) 930 first divides the image screen generated by the subject skew correction unit 920 into blocks, examines the pixels in the divided blocks to classify them into character blocks and background blocks, and binarizes the pixels of the classified character blocks. Second, it median-filters the binarized image screen to remove character regions misclassified because of borders or noise. Third, it scans the median-filtered image screen horizontally and vertically to locate the character region. Fourth, it extracts the image of the located character region. Fifth, it expands the extracted image of the character region to the size of the input image screen.
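
The third and fourth steps (scanning horizontally and vertically to locate the character region, then extracting it) can be sketched as below. The sketch assumes a binary image given as rows of 0/1 values in which 1 marks a character pixel; the function names and that representation are hypothetical, and the median filtering and final size expansion are left out.

```python
from typing import List, Optional, Tuple

Bounds = Tuple[int, int, int, int]  # (top, bottom, left, right), inclusive


def text_region_bounds(binary: List[List[int]]) -> Optional[Bounds]:
    """Locate the bounding box of character pixels by row and column scans."""
    # Horizontal scan: rows that contain at least one character pixel.
    rows = [y for y, row in enumerate(binary) if any(row)]
    if not rows:
        return None  # no character pixels at all
    # Vertical scan: columns that contain at least one character pixel.
    cols = [x for x in range(len(binary[0])) if any(row[x] for row in binary)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop(image: List[List[int]], bounds: Bounds) -> List[List[int]]:
    """Extract the located region from the image."""
    top, bottom, left, right = bounds
    return [row[left:right + 1] for row in image[top:bottom + 1]]
```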

The noise removal unit (noise reduction part) 940 removes the noise contained in the image screen output by the image region expansion unit 930.

Noise generally arises when an image screen is acquired from a camera, and Gaussian noise is a typical component of it. Various kinds of noise removal filters can be used to remove Gaussian noise. However, in an image screen of a business card or the like, much of the information lies at the edges of the character regions, so using only a simple noise removal filter can seriously damage the character information. The noise removal unit 940 therefore preferably uses a filter that removes image noise well while also preserving the edge information of the characters. Here, the noise removal unit 940 is assumed to use a special noise removal filter such as a directional Lee filter.

The noise removal unit 940 may be placed between the image blur determination unit 910 and the subject skew correction unit 920, or between the subject skew correction unit 920 and the image region expansion unit 930, or it may be omitted.

The image binarization unit 950 first divides the image screen output by the image region expansion unit 930 or by the noise removal unit 940 into blocks of a predetermined size and examines the pixels in the divided blocks to classify them into character blocks and background blocks. Second, using the relationship between the character pixels of the classified character blocks and their surrounding pixels, it generates pixels in which the edges of the character blocks are enhanced and noise is reduced, and it calculates a reference value for binarizing those pixels. A quadratic filter or an improved quadratic filter may be used for the edge enhancement and noise removal. Third, it compares the pixels of the edge-enhanced, noise-reduced character blocks and of the background blocks with the reference value and binarizes them to a first and a second brightness value.
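
The final comparison against the reference value can be sketched as follows. This is an illustration under stated assumptions, not the patent's method: the text does not say which side of the reference value maps to which brightness value, so the sketch assumes pixels at or above the reference take the first value (255 here) and the rest take the second (0). The function name and default values are hypothetical, and the quadratic filtering that precedes this step is not shown.

```python
from typing import List


def binarize(pixels: List[List[int]], reference: int,
             first: int = 255, second: int = 0) -> List[List[int]]:
    """Map every pixel to one of two brightness values.

    Assumption: pixels >= reference become `first`; all others become `second`.
    """
    return [[first if p >= reference else second for p in row]
            for row in pixels]
```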

The binarized image screen information output by the image binarization unit 950 is applied to the character recognition unit 123, which recognizes the characters in the image screen.

The image blur determination unit 910, subject skew correction unit 920, image region expansion unit 930, noise removal unit 940, and image binarization unit 950 of the preprocessor 121 according to the above embodiment of the present invention may be implemented as follows.

In the drawings described below, FIG. 5 illustrates the configuration of the image blur determination unit 910, FIG. 9 that of the subject skew correction unit 920, FIG. 14 that of the image region expansion unit 930, FIGS. 17A to 18D that of the noise removal unit 940, and FIG. 12 that of the image binarization unit 950.

FIG. 5 is a diagram illustrating the configuration of the image blur determination unit 910 of FIG. 4.

Referring to FIG. 5, the block classification unit (block classification part) 1110 divides the input image screen into blocks, examines the pixels in the divided blocks, and classifies them into character blocks (CB) and background blocks (BB). The block classification unit 1110 classifies the blocks in this way so that blur can be determined using only the regions that contain characters. The blocks are assumed here to be 8×8 pixels in size.

The character block average energy calculator 1120 calculates the average energy ratio of the character blocks output from the block classification unit 1110. Averaging the energy ratio over the character blocks of the image screen ensures that only the regions containing characters contribute to the blurring decision.

The blurring determination unit 1130 compares the average energy ratio of the character blocks, output from the character block average energy calculator 1120, with a preset reference value to determine whether the image screen is blurred. If the image screen is judged to be a blurred image, the blurring determination unit 1130 notifies the input unit 10 and requests that the image screen be input again.

FIG. 6 is a diagram illustrating the configuration of the block classification unit 1110. The block classification unit 1110 divides the image screen into blocks of a predetermined size and classifies each block as a character block or a background block. The blocks are classified so that only the regions containing characters are used when determining whether the image screen is blurred.

Referring to FIG. 6, the block divider 1111 divides the image screen into blocks of a predetermined size. If the image screen is 640×480 pixels and each block is 8×8 pixels, the block divider 1111 divides the image screen into 4800 blocks.
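The block division described above can be sketched in a few lines. This is a minimal illustration, assuming the image is held as a list of rows of gray levels; the function name `split_into_blocks` is hypothetical and not part of the disclosure.

```python
def split_into_blocks(image, block=8):
    """Split a 2-D list of gray levels into block x block tiles, row-major."""
    height, width = len(image), len(image[0])
    if height % block or width % block:
        raise ValueError("image size must be a multiple of the block size")
    blocks = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Copy the rows of one tile out of the full image.
            tile = [row[left:left + block] for row in image[top:top + block]]
            blocks.append(tile)
    return blocks


# A 640x480 screen (480 rows of 640 pixels) gives 80 * 60 = 4800 blocks.
screen = [[0] * 640 for _ in range(480)]
blocks = split_into_blocks(screen)
```

The blocks are returned in row-major order, so block index k runs from 0 to 4799 as in the text.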

The blocks output from the block divider 1111 are applied to the DCT converter 1113 and undergo a discrete cosine transform (DCT). The energy calculator 1115 then calculates the sum of the absolute values of the dominant DCT coefficients in each DCT-transformed block. The energy distribution of the DCT coefficients of a character block is larger than that of a background block: the DCT coefficients of character blocks have larger values than those of background blocks, and for some coefficients the average absolute value is particularly large. Based on experimental results, the embodiment of the present invention uses D₁ to D₉ as the dominant DCT coefficients for block classification. The sum of the absolute values of the dominant DCT coefficients in the k-th block can therefore be calculated as in Equation 1 below.

$$S^k = \sum_{i=1}^{9} \left| D_i^k \right| \tag{1}$$

In Equation 1, D_i^k denotes the i-th dominant DCT coefficient of the k-th block, and S^k denotes the sum of the absolute values of the dominant DCT coefficients of the k-th block. The embodiment of the present invention thus sums the absolute values of the dominant DCT coefficients D₁ to D₉.

The energy calculator 1115 performs the calculation of Equation 1 for all blocks (k = 0, 1, 2, ..., 4799). The energy values S^k (k = 0, 1, ..., 4799) of the blocks are then applied to the block reference value calculator 1117.
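As a rough sketch of the per-block energy of Equation 1, the code below applies an orthonormal 2-D DCT-II to one 8×8 block and sums the absolute values of nine coefficients. The text does not list the positions of D₁ to D₉, so the sketch assumes they are the first nine AC coefficients in zigzag order; that choice, the DCT normalization, and the names `dct2`, `DOMINANT`, and `dominant_energy` are illustrative assumptions.

```python
import math

N = 8  # block size assumed in the text


def dct2(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def scale(u):
        return math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# ASSUMPTION: D1..D9 are the first nine AC coefficients in zigzag order.
# The source does not state which positions were chosen experimentally.
DOMINANT = [(0, 1), (1, 0), (2, 0), (1, 1), (0, 2),
            (0, 3), (1, 2), (2, 1), (3, 0)]


def dominant_energy(block):
    """S^k: sum of |D_i^k| over the assumed dominant positions."""
    coeffs = dct2(block)
    return sum(abs(coeffs[u][v]) for u, v in DOMINANT)
```

A flat block has no AC energy, so its S^k is close to zero, while a block containing a sharp edge produces a large S^k. This is the property the classification relies on.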

The block reference value calculator 1117 adds the energy values S^k (k = 0, 1, ..., 4799) calculated for the blocks and divides the total by the total number of blocks (TBN) to obtain the average value <S^k>. The value <S^k> is obtained as in Equation 2 below and serves as the block reference value Cth for determining whether a block is a character block or a background block.

$$\langle S^k \rangle = \frac{1}{TBN} \sum_{k=0}^{TBN-1} S^k = Cth \tag{2}$$

In Equation 2, TBN denotes the total number of blocks.

The block determiner 1119 sequentially receives the per-block energy values (the sums of the absolute values of the dominant DCT coefficients) output from the energy calculator 1115 and compares each with the reference value Cth to classify the block as a character block or a background block. As shown in Equation 3 below, the block determiner 1119 classifies the k-th block as a character block if S^k is greater than or equal to the block reference value Cth, and as a background block if S^k is less than Cth.

$$\text{IF } S^k \ge Cth \text{ THEN CB, ELSE BB} \tag{3}$$
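The thresholding of Equations 2 and 3 amounts to comparing each block energy with the mean over all blocks. The sketch below assumes the energies S^k are already available as a list; `classify_blocks` is a hypothetical helper name.

```python
def classify_blocks(energies):
    """Return (Cth, labels), where labels[k] is 'CB' or 'BB'."""
    if not energies:
        raise ValueError("at least one block energy is required")
    # Equation 2: Cth is the mean energy over all TBN blocks.
    cth = sum(energies) / len(energies)
    # Equation 3: S^k >= Cth gives a character block, else a background block.
    labels = ["CB" if s >= cth else "BB" for s in energies]
    return cth, labels
```

Because Cth is derived from the image itself, the split adapts to overall contrast. One consequence is that at least one block is always labeled a character block, even for a screen with no text.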

The pixels of the blocks classified by the block classification unit 1110 may have gray levels from 0 to 255. The character block images output from the block classification unit 1110 are input to the character block average energy calculator 1120, which calculates the energy ratio of each character block and then uses these ratios to calculate the average energy ratio of the character blocks over the entire image screen. FIG. 7 is a diagram illustrating the configuration of the character block average energy calculator 1120.

Referring to FIG. 7, the energy ratio calculator 1121 calculates the energy ratio of the DCT coefficients in each character block classified by the block classification unit 1110. For a block of M×M pixels, the energy ratio of the DCT coefficients of a character block can be obtained as in Equation 4 below.

In Equation 4, L^k_{m,n} denotes the DCT coefficient of the low-frequency component at position (m, n) of the k-th block, and H^k_{m,n} denotes the DCT coefficient of the high-frequency component at position (m, n) of the k-th block.

As described above, the embodiment of the present invention assumes 8×8-pixel blocks (M = 8). Experiments were performed to verify the choice of positions of the low-frequency and high-frequency DCT coefficients used for the energy ratio of a character block, and to determine, step by step, the coefficient positions used to calculate the DCT energy ratio in each character block. In these experiments, the degree of image blurring was increased while the change in the average energy ratio of the character blocks was observed. According to the results, among the DCT coefficients used to calculate the energy ratio of each character block, L_{m,n} is the low-frequency DCT coefficient at the positions where m + n = 1, 2, and H_{m,n} is the high-frequency DCT coefficient at the positions where m + n = 3, 4, 5, 6.
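The coefficient positions given above can be turned into a small routine for the per-block energy ratio R^k. The exact form of the ratio is an assumption here: the sketch uses low / (low + high), which rises as high-frequency detail is lost and so agrees with a screen being judged blurred when the average ratio is at or above the threshold. The function name `energy_ratio` and the handling of an all-zero block are also illustrative.

```python
LOW_SUMS = (1, 2)          # m + n for low-frequency coefficients L
HIGH_SUMS = (3, 4, 5, 6)   # m + n for high-frequency coefficients H


def energy_ratio(coeffs):
    """R^k of one character block from its 8x8 DCT coefficients.

    ASSUMPTION: the ratio is low / (low + high); only the coefficient
    positions, not the exact formula, are taken from the text.
    """
    low = high = 0.0
    for m, row in enumerate(coeffs):
        for n, value in enumerate(row):
            if m + n in LOW_SUMS:
                low += abs(value)
            elif m + n in HIGH_SUMS:
                high += abs(value)
    total = low + high
    # ASSUMPTION: a block with no energy in either band yields 0.0.
    return low / total if total else 0.0
```

The DC coefficient at (0, 0) has m + n = 0 and is ignored, so the ratio does not depend on the overall brightness of the block.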

As described above, the energy ratio calculator 1121 obtains the energy ratio R^k of the DCT coefficients of each character block using Equation 4. The average energy ratio calculator 1123 then obtains the average energy ratio <R^k> of the DCT coefficients over the entire image. That is, the average energy ratio calculator 1123 uses the R^k values obtained by the energy ratio calculator 1121 to calculate their average over the entire image, as in Equation 5 below.

$$\langle R^k \rangle = \frac{1}{TCN} \sum_{k \in \mathrm{CB}} R^k \tag{5}$$

In Equation 5, TCN denotes the total number of character blocks.

Once the average energy ratio <R^k> over the entire image has been calculated, the blurring determination unit 1130 compares <R^k> with an experimentally obtained reference value Bth, as in Equation 6 below, to determine whether the input image screen is blurred. If <R^k> is greater than or equal to Bth, the blurring determination unit 1130 judges the input image screen to be blurred and requests the input unit 10 to input the image screen again. If <R^k> is less than Bth, the input image screen is passed to the noise removing unit 940 or the image binarization unit 950 so that it can be recognized.

$$\text{IF } \langle R^k \rangle \ge Bth \text{ THEN blurred image, ELSE non-blurred image} \tag{6}$$

The reference value Bth is selected experimentally, based on whether the character information in the image screen can be visually recognized and on the quality of the binarized output of the image screen.
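The blurring decision of Equations 5 and 6 reduces to averaging the per-block ratios and comparing the result with Bth. A minimal sketch follows; the text gives no numeric value for Bth, so the caller must supply one, and `is_blurred` is a hypothetical name.

```python
def is_blurred(ratios, bth):
    """Average the character-block ratios R^k and compare with Bth."""
    if not ratios:
        raise ValueError("no character blocks to evaluate")
    mean_ratio = sum(ratios) / len(ratios)  # Equation 5, TCN = len(ratios)
    return mean_ratio >= bth                # Equation 6
```

A `True` result corresponds to requesting a new image from the input unit, and `False` to passing the screen on to the later preprocessing stages.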

FIG. 8 is a diagram illustrating a procedure for determining whether an input image screen is blurred according to an embodiment of the present invention.

Referring to FIG. 8, an image screen is input in step 1151. The input image screen is assumed to be 640×480 pixels. In step 1153, the image screen is divided into blocks of the set size; each block is 8×8 pixels, so 4800 blocks are generated. In step 1155, each block is DCT-transformed, and in step 1157 the sum S^k (k = BN = 0, ..., 4799) of the absolute values of the dominant DCT coefficients of each DCT-transformed block is calculated as in Equation 1 and output as the energy of that block. In step 1159, these sums are added and averaged over all blocks as in Equation 2 to obtain the block reference value Cth (= <S^k>). The block reference value Cth is thus the average, over the entire image screen, of the per-block sums of the absolute values of the dominant DCT coefficients, and it serves as the threshold for classifying blocks into character blocks and background blocks. In step 1161, the sums S^k are accessed sequentially and each is compared with the block reference value as in Equation 3. A block whose S^k is greater than or equal to the block reference value is classified as a character block; otherwise it is classified as a background block in step 1163. In step 1165, the energy ratio R^k of the DCT coefficients is calculated for the blocks classified as character blocks, as in Equation 4, and in step 1167 these energy ratios are added and averaged as in Equation 5 to obtain the average energy ratio <R^k> of the character blocks over the entire image. In step 1169, the average energy ratio <R^k> is compared with the reference value Bth as in Equation 6 to determine whether the image is blurred. If <R^k> is greater than or equal to Bth, the input image screen is judged to be a blurred image and the procedure returns to step 510. If <R^k> is less than Bth, the input image screen is judged to be a non-blurred image and the character recognition unit 123 is notified. The character recognition unit 123 then recognizes the characters in the preprocessed image screen output from the preprocessor 121.

FIG. 9 is a diagram illustrating the configuration of the subject tilt correction unit 920 of FIG. 4.

Referring to FIG. 9, a binarization unit 1210 divides the input image screen into blocks, examines the pixels in each block to classify the blocks into character blocks and background blocks, and then binarizes the pixels of each block. The binarization unit 1210 classifies the blocks so that the regions containing characters can be binarized and then used to identify character strings.

The horizontal pixel reduction unit 1220 subsamples the binarized image screen in the horizontal direction to reduce its horizontal pixels. The horizontal pixels are reduced so that, when candidate stripes are generated as described below, each character string merges cleanly into a stripe along the horizontal direction.

The candidate stripe generator 1230 dilates the regions of the binarized character blocks so that neighboring characters touch, thereby generating candidate stripes. It then performs erosion so that candidate stripes that are vertically adjacent do not remain stuck together as a result of the dilation.

The vertical pixel reduction unit 1240 subsamples the image screen containing the candidate stripes in the vertical direction, at the same ratio used for the horizontal pixel reduction, to reduce its vertical pixels. This restores the aspect ratio changed by the horizontal pixel reduction unit 1220 to that of the original image screen. The same effect can be obtained by increasing the horizontal pixels instead.

The stripe classifier 1250 selects, from the candidate stripes whose vertical pixels have been reduced, the stripes that have at least a predetermined length. To do so, it calculates the blob size and/or the eccentricity of each binarized candidate stripe using moments. The selected stripes are used as the target signal for calculating the direction angle of the subject in the image screen, which is tilted relative to the horizontal axis of the image. In other words, the stripe classifier 1250 selects, from the stripes formed by binarized characters that touch one another, the stripes used to obtain the direction angle.

The rotation angle determiner 1260 calculates the direction angle of each classified stripe, accumulates a count for each direction angle, and selects the direction angle with the largest count as the rotation angle of the tilted subject in the image screen.
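The voting step described above can be sketched as a histogram over quantized direction angles. The bin width of one degree, the use of degrees, and the name `rotation_angle` are assumptions for illustration; how each stripe's direction angle is computed is outside this sketch.

```python
from collections import Counter


def rotation_angle(direction_angles, bin_width=1.0):
    """Return the most frequent direction angle after quantization.

    ASSUMPTION: angles are in degrees and are binned to `bin_width`.
    """
    if not direction_angles:
        raise ValueError("no stripes were classified")
    # Accumulate a count per quantized direction angle.
    votes = Counter(round(angle / bin_width) for angle in direction_angles)
    best_bin, _count = votes.most_common(1)[0]
    return best_bin * bin_width
```

Choosing the most frequent angle rather than the mean keeps a few stray stripes, such as fragments of a picture or a table border, from pulling the estimate away from the text lines.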

The tilt correction unit 1270 receives the image signal output from the input unit 10 and rotates it by the rotation angle determined by the rotation angle determiner 1260 to correct the tilt of the subject in the image screen.

The image correction unit 1280 inserts image signals into the corners of the image screen in which the tilt of the subject has been corrected. When the tilt correction unit 1270 rotates the image screen, regions with no pixels appear; the image correction unit 1280 fills these empty regions with specific pixels. Because the filled pixels are unrelated to the characters, the output of the tilt correction unit 1270 can also be used as is without affecting recognition of the characters in the image screen.
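The rotation and corner filling can be illustrated together. The sketch below rotates a gray image about its center with nearest-neighbor sampling and writes a constant value wherever no source pixel exists. The constant fill, the sampling method, the sign convention of the angle, and the name `rotate_and_fill` are all assumptions; the text only says that the empty regions are filled with specific pixels.

```python
import math


def rotate_and_fill(image, angle_deg, fill=255):
    """Rotate a 2-D gray image about its center (nearest neighbor).

    Output pixels whose source lies outside the input are set to `fill`.
    ASSUMPTION: the direction of positive angles is a convention of this
    sketch, not something specified by the source.
    """
    height, width = len(image), len(image[0])
    cy, cx = (height - 1) / 2.0, (width - 1) / 2.0
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Inverse mapping: locate the source pixel for each output pixel.
            dx, dy = x - cx, y - cy
            sx = int(round(cos_a * dx + sin_a * dy + cx))
            sy = int(round(-sin_a * dx + cos_a * dy + cy))
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out
```

Mapping from output to input avoids holes inside the rotated image, so the only pixels left at the fill value are the corner regions the text refers to.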

The operation of the subject tilt correction unit 920 configured as in FIG. 9 is described in detail below.

The input image screen has a size of N×M pixels. The input image may be a color image or a gray image without color information; the embodiment of the present invention assumes a gray image.

The image screen is input to the binarization unit 1210, where it is divided into blocks, the blocks are classified into character blocks and background blocks, and the classified block images are binarized.

FIG. 10 is a diagram illustrating the configuration of the binarization unit 1210. The binarization unit 1210 divides the input image screen into blocks of a predetermined size, classifies each block as a character block or a background block, and then binarizes the pixels of the classified blocks into character pixels and background pixels. The blocks are classified and binarized so that, when the tilt of the subject is corrected, the direction angles of the character strings can be obtained and used to determine the rotation angle of the subject in the image screen. Referring to FIG. 10, the block classification unit 1211 divides the input image screen into blocks of the set size and classifies them into character blocks and background blocks. The block grouping unit 1213 groups each character block with its eight adjacent blocks, and the pixel reference value calculator 1215 generates a reference value from the grouped blocks. The pixel determiner 1217 converts all pixels of the background blocks output from the block classification unit 1211 into background pixels having a second brightness value, and uses the reference value to binarize the pixels of the character blocks into character pixels having a first brightness value and background pixels having the second brightness value.

FIG. 11 is a diagram illustrating the detailed configuration of the block classification unit 1211 of FIG. 10. The block classification unit 1211 may be configured in the same way as the block classification unit 1110 of the image blurring determiner 910. The block classification unit 1211 of FIG. 11 therefore has the same configuration as the block classification unit 1110 of FIG. 6 and classifies the blocks of the image screen in the same way.

The pixels of the character blocks classified by the block classification unit 1211 may have gray levels from 0 to 255. The character block images output from the block classification unit 1211 are input to the block grouping unit 1213 and the pixel determiner 1217.

The classified blocks output from the block classification unit 1211 are applied to the block grouping unit 1213. Because the binarization unit 1210 serves to identify the character strings of the image screen, background blocks are converted in bulk into background pixels having a predetermined brightness value. Block grouping and reference value calculation are therefore assumed not to be performed for background blocks.

The block grouping unit 1213 groups each character block output from the block classification unit 1211 with the eight blocks adjacent to it to generate a grouped block. A character block is only 8×8 pixels. If the reference value for separating background pixels from character pixels were determined from a single block of this size, the reference values of adjacent character blocks could differ widely, causing discontinuities between blocks in the binarized image. Generating a grouped block enlarges the region used for binarization and thereby improves its reliability.
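Grouping a character block with its eight neighbors yields a 3×3 arrangement of 8×8 blocks, that is, a 24×24-pixel region for a block in the interior of the image. The sketch below extracts that region by block index. Clipping the neighborhood at the image border is an assumption, since the text does not say how border blocks are handled, and `grouped_region` is a hypothetical name.

```python
def grouped_region(image, block_row, block_col, block=8):
    """Pixels of the 3x3 block neighborhood centered on one character block.

    ASSUMPTION: at the image border the neighborhood is simply clipped.
    """
    height, width = len(image), len(image[0])
    top = max(0, (block_row - 1) * block)
    left = max(0, (block_col - 1) * block)
    bottom = min(height, (block_row + 2) * block)
    right = min(width, (block_col + 2) * block)
    return [row[left:right] for row in image[top:bottom]]
```

The threshold is then computed from this larger region but applied only to the pixels of the central character block.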

The pixel reference value calculator 1215 calculates a pixel reference value Pth, which is used as the threshold for classifying each pixel of a character block as a character pixel or a background pixel during binarization. The pixel reference value Pth may be selected using Otsu's method, which chooses the gray value that maximizes the between-class variance of the two pixel classes, or another method such as Kapur's method. Here, Pth is assumed to be calculated using Otsu's method, as in Equation 7 below. The method is described in N. Otsu, "A Threshold Selection Method from Gray-Level Histograms" [IEEE Trans. on Systems, Man, and Cybernetics, Vol. SMC-9, No. 1, pp. 62-66, Jan. 1979].
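For reference, Otsu's method can be sketched directly from a gray-level histogram. The code below is a generic textbook formulation rather than the patent's own Equation 7. It treats pixels below the threshold as one class and pixels at or above it as the other, which is an assumption chosen to match a "greater than or equal to Pth" comparison during binarization. The name `otsu_threshold` is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold that maximizes the between-class variance.

    Class 1 holds pixels with value < t, class 2 holds pixels >= t.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    count_lo = 0   # number of pixels with value < t
    sum_lo = 0     # sum of those pixel values
    for t in range(1, levels):
        count_lo += hist[t - 1]
        sum_lo += (t - 1) * hist[t - 1]
        count_hi = total - count_lo
        if count_lo == 0 or count_hi == 0:
            continue  # one class is empty, variance is undefined
        mean_lo = sum_lo / count_lo
        mean_hi = (total_sum - sum_lo) / count_hi
        # Proportional to P1 * P2 * (mu1 - mu2)^2, the between-class variance.
        score = count_lo * count_hi * (mean_lo - mean_hi) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

In the arrangement described here, `pixels` would be the flattened pixels of a grouped block, so that each character block gets its own locally adapted threshold.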

The pixel determiner 1217 then uses the pixel reference value to binarize each pixel of the character blocks output from the block classification unit 1211 into a background pixel or a character pixel, and converts all pixels of the background blocks into background pixels. That is, when a character block image is input, the pixel determiner 1217 compares the pixels of the character block with the corresponding pixel reference value Pth; a pixel whose value is greater than or equal to Pth is classified as a character pixel, and a pixel whose value is less than Pth is classified as a background pixel. The pixel determiner 1217 then binarizes the block by converting character pixels to the brightness value α and background pixels to the brightness value β. The method by which the pixel determiner 1217 binarizes the pixels of a character block is shown in Equation 8 below.

$$y_B(m,n) = \begin{cases} \alpha, & \text{if } y(m,n) \ge Pth \\ \beta, & \text{otherwise} \end{cases} \tag{8}$$

In Equation 8, y(m, n) denotes the pixels of the character block output from the block classification unit 1211, Pth is the pixel reference value, and y_B(m, n) denotes the pixels of the binarized character block.

The pixel determiner 1217 also receives the pixels of the background blocks output from the block classification unit 1211 and converts them all to the brightness value β.
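The two binarization rules, thresholding for character blocks and bulk conversion for background blocks, fit in one small function. The concrete values 255 and 0 for α and β are assumptions, as the text leaves them unspecified, and `binarize_block` is a hypothetical name.

```python
ALPHA = 255  # ASSUMPTION: brightness value for character pixels
BETA = 0     # ASSUMPTION: brightness value for background pixels


def binarize_block(block, is_character_block, pth=None):
    """Binarize one block following the rules in the text."""
    if not is_character_block:
        # Background blocks are converted to BETA without thresholding.
        return [[BETA] * len(row) for row in block]
    # Character blocks: y >= Pth becomes ALPHA, otherwise BETA.
    return [[ALPHA if y >= pth else BETA for y in row] for row in block]
```

For example, `binarize_block(block, True, pth=128)` thresholds a character block at 128, while `binarize_block(block, False)` ignores the pixel values entirely.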

The image screen binarized by the binarization unit 1210 may be input to either the candidate stripe generator 1230 or the horizontal pixel reduction unit 1220. The following description assumes that it is input to the horizontal pixel reduction unit 1220.

The horizontal pixel reduction unit 1220 subsamples the binarized image in the horizontal direction at a set ratio. Assuming a subsampling ratio of 2:1, the horizontal pixel reduction unit 1220 subsamples the binarized image signal 2:1 in the horizontal direction, halving the number of pixels in that direction. The horizontal pixels are reduced so that the candidate stripe generator 1230 in the following stage can merge character strings cleanly into stripes.

The candidate stripe generator 1230 receives either the binarized image screen output from the binarization unit 1210 or the horizontally reduced binarized image screen output from the horizontal pixel reduction unit 1220, and turns each character string in the received image screen into a stripe. The candidate stripe generator 1230 can be implemented as a morphological filter consisting of a dilation part and an erosion part. The morphological filter dilates the character regions and then erodes them so that the characters touch one another. That is, the dilation part expands the binarized character regions until they touch neighboring characters, producing character strings in which the characters are connected. Such a connected character string is referred to here as a candidate stripe. The erosion part then erodes the generated candidate stripes, in order to separate candidate stripes that became attached to vertically adjacent candidate stripes during dilation. Morphological filters of this kind are described in R. C. Gonzalez and R. Woods, "Digital Image Processing" [2nd ed., Prentice Hall, pp. 519-560, 2002].
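Dilation followed by erosion is the morphological closing operation. The sketch below implements it for a binary image with a rectangular structuring element. The 1×3 default size, the odd-size requirement, the treatment of pixels outside the image as background, and the function names are assumptions; the text does not state the structuring elements used.

```python
def _morph(image, se_height, se_width, pick):
    """Apply `pick` (max or min) over a rectangular window at each pixel.

    ASSUMPTION: the window sizes are odd and pixels outside the image
    count as background (0).
    """
    rows, cols = len(image), len(image[0])
    rh, rw = se_height // 2, se_width // 2
    out = []
    for y in range(rows):
        out_row = []
        for x in range(cols):
            window = []
            for dy in range(-rh, rh + 1):
                for dx in range(-rw, rw + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < rows and 0 <= xx < cols
                    window.append(image[yy][xx] if inside else 0)
            out_row.append(pick(window))
        out.append(out_row)
    return out


def dilate(image, se_height=1, se_width=3):
    return _morph(image, se_height, se_width, max)


def erode(image, se_height=1, se_width=3):
    return _morph(image, se_height, se_width, min)


def candidate_stripes(binary, se_height=1, se_width=3):
    """Dilate so neighboring characters merge, then erode back."""
    return erode(dilate(binary, se_height, se_width), se_height, se_width)
```

With a wide, flat structuring element the gaps between characters on the same line close up, while lines above and below are less likely to merge. That is the behavior the stripe generation depends on.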

The vertical pixel reduction unit 1240 subsamples the image screen output from the candidate stripe generation unit 1230 in the vertical direction at a preset ratio. The subsampling ratio is assumed to be 2:1, the same as in the horizontal pixel reduction unit 1220. The vertical pixel reduction unit 1240 is used to restore the aspect ratio of the image screen, which was changed by the horizontal pixel reduction, to that of the original image screen. The image screen output from the vertical pixel reduction unit 1240 is therefore reduced to 1/2 of the original size both horizontally and vertically. A horizontal pixel expansion unit may be used instead of the vertical pixel reduction unit 1240, in which case the image screen is restored to the size of the original image screen.

The stripe classification unit 1250 may receive the binarized image screen output from the binarization unit 1210, the image screen generated by the candidate stripe generation unit 1230, or the image screen output from the vertical pixel reduction unit 1240. It is assumed here that it receives the image screen output from the vertical pixel reduction unit 1240.

The stripe classification unit 1250 labels the candidate stripes generated from the binarized image (labeling on candidate stripe). The labeled candidate stripes are the candidates from which the direction angle will be calculated. The stripe classification unit 1250 then examines the shape of each labeled candidate stripe and selects those that exceed a certain size and have an elongated shape. The candidate stripes are classified using the blob size and the eccentricity, both derived from moments. Equation 9 below defines the central moment used to obtain the blob size and the eccentricity; the blob size is obtained from Equation 9 with p = 0 and q = 0. Equation 10 below shows how the eccentricity is calculated from the central moments. This is described in Pitas, "Digital Image Processing Algorithms" [Prentice Hall, pp. 326-331, 1993].

Here, the eccentricity e indicates how elongated the candidate stripe is.

The blob size μ (= μ₀₀) and the eccentricity e obtained from Equation 9 and Equation 10 are compared with preset reference values μth and eth, respectively, to select candidate stripes as stripes. The reference values μth and eth are determined experimentally, and a candidate stripe satisfying μ ≥ μth and/or e ≥ eth is classified as a stripe. In the embodiment of the present invention, it is assumed that a candidate stripe is classified as a stripe only when both μ ≥ μth and e ≥ eth hold; if either the blob size μ or the eccentricity e, or both, is smaller than its reference value, the candidate stripe is not selected, because it is judged unsuitable for calculating the direction angle. Although the embodiment selects candidate stripes that satisfy both conditions, only one of the two conditions may be checked to decide whether a candidate stripe is a stripe.
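
The selection rule above can be illustrated with a short sketch. Equations 9 and 10 are not reproduced in this text, so the sketch assumes the commonly used moment definitions: the central moment μpq = Σ (x − x̄)^p (y − ȳ)^q over the blob pixels, and the eccentricity e = (4μ₁₁² + (μ₂₀ − μ₀₂)²) / (μ₂₀ + μ₀₂)². The function names, the pixel-list representation, and the threshold values used in any call are hypothetical.

```python
def central_moment(pixels, p, q):
    """Central moment of order (p, q); `pixels` is a list of (x, y) in one labeled blob."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    return sum((x - mean_x) ** p * (y - mean_y) ** q for x, y in pixels)


def eccentricity(pixels):
    """Assumed form: 0 for a symmetric blob, approaching 1 for a thin line."""
    mu20 = central_moment(pixels, 2, 0)
    mu02 = central_moment(pixels, 0, 2)
    mu11 = central_moment(pixels, 1, 1)
    denom = (mu20 + mu02) ** 2
    if denom == 0:  # single-pixel blob: no elongation can be measured
        return 0.0
    return (4 * mu11 ** 2 + (mu20 - mu02) ** 2) / denom


def is_stripe(pixels, mu_th, e_th):
    """Embodiment rule: both the blob size and the eccentricity must reach their thresholds."""
    blob_size = central_moment(pixels, 0, 0)  # with p = q = 0 this is the pixel count
    return blob_size >= mu_th and eccentricity(pixels) >= e_th
```

For example, a 10-pixel horizontal run passes `is_stripe(run, 5, 0.5)`, whereas a 3×3 square of pixels has zero eccentricity under this definition and is rejected.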

The stripes classified by the stripe classification unit 1250 are input to the rotation angle determination unit 1260, which calculates the direction angle θ of each classified stripe and accumulates and stores the number of occurrences of each calculated direction angle. The direction angle with the largest accumulated count is determined as the rotation angle. FIG. 12 is a diagram for explaining the procedure by which the rotation angle determination unit 1260 calculates the rotation angle of a stripe. In FIG. 12, SP is a stripe classified by the stripe classification unit 1250, and the x' and y' axes are the coordinate axes on which the stripe lies. For each stripe output from the stripe classification unit 1250, the direction angle θ between the x' axis of the stripe and the X axis (real X axis) is calculated, and the count of each direction angle θ is accumulated and stored. The direction angle θ of the stripe SP can be obtained as shown in Equation 11 below.

After the direction angle θ has been calculated for all the stripes, the rotation angle determination unit 1260 examines the accumulated counts and determines the direction angle θ with the largest count as the rotation angle. In other words, the direction angle shared by the largest number of stripes becomes the rotation angle.
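
The voting procedure can be sketched as below. Equation 11 is not reproduced in this text, so the sketch assumes the standard principal-axis formula θ = ½·atan2(2μ₁₁, μ₂₀ − μ₀₂); the 1-degree bin width and the function names are likewise assumptions. With image coordinates whose y axis points downward, the sign of the angle follows that convention.

```python
import math
from collections import Counter


def direction_angle(pixels):
    """Orientation of a stripe in degrees, from its second-order central moments."""
    n = len(pixels)
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    mu20 = sum((x - mean_x) ** 2 for x, _ in pixels)
    mu02 = sum((y - mean_y) ** 2 for _, y in pixels)
    mu11 = sum((x - mean_x) * (y - mean_y) for x, y in pixels)
    return math.degrees(0.5 * math.atan2(2 * mu11, mu20 - mu02))


def rotation_angle(stripes, bin_deg=1.0):
    """Accumulate one vote per stripe and return the most frequent (binned) angle."""
    votes = Counter(round(direction_angle(s) / bin_deg) for s in stripes)
    best_bin, _ = votes.most_common(1)[0]
    return best_bin * bin_deg
```

Taking the most frequent angle rather than the mean keeps a few badly segmented stripes from pulling the estimate away from the dominant text direction.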

Once the rotation angle is determined, the tilt correction unit 1270 rotates the image screen output from the input unit 10 by the rotation angle determined by the rotation angle determination unit 1260, thereby correcting the tilt of the image signal. That is, the tilt correction unit 1270 rotates the image screen using a rotation matrix. The rotation may be performed by an inverse mapping method. Inverse mapping and rotation are described in B. Jahne et al., "Handbook of Computer Vision and Applications" [Academic Press, vol. 2, pp. 94-95, 1999] and in L. G. Shapiro and G. C. Stockman, "Computer Vision" [Prentice Hall, pp. 415-418, 2001].
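
Rotation by inverse mapping can be sketched as follows: every output pixel is traced back through the inverse rotation to a source pixel, so no holes appear inside the image. The nearest-neighbor sampling, the rotation about the image center, and the use of `None` to mark pixels with no source are assumptions of this sketch, as is the function name.

```python
import math


def rotate_inverse_mapping(img, angle_deg, empty=None):
    """Rotate `img` (list of rows) about its center; unmapped pixels get `empty`."""
    h, w = len(img), len(img[0])
    cx, cy = (w - 1) / 2, (h - 1) / 2
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = [[empty] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            # Apply the inverse rotation matrix to locate the source pixel.
            src_x = round(cx + c * dx + s * dy)
            src_y = round(cy - s * dx + c * dy)
            if 0 <= src_x < w and 0 <= src_y < h:
                out[y][x] = img[src_y][src_x]
    return out
```

Output pixels whose source falls outside the input stay `empty`; these are the blank corners that the following correction step has to fill.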

When the tilt correction unit 1270 rotates the image screen in this way, blank areas containing no pixels appear at the corners of the image screen. These blanks can affect the subsequent recognition process. The image correction unit 1280 fills the blank areas at the corners of the tilt-corrected image screen with specific pixels (corner filling). When filling a blank area, the image correction unit 1280 may use the value of the pixel closest to the blank area in the horizontal direction. Alternatively, the blank areas may all be set to the brightness value of the background pixels used in binarization.
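
Both filling options can be combined in a small sketch. It assumes blank pixels are marked with `None` and that rows containing no valid pixel at all fall back to a background value; the function name and that fallback are hypothetical.

```python
def fill_corners(img, background=0):
    """Replace None pixels with the horizontally nearest valid pixel of the same row."""
    out = []
    for row in img:
        valid = [i for i, v in enumerate(row) if v is not None]
        if not valid:
            # No pixel to copy from in this row: use the background brightness.
            out.append([background] * len(row))
            continue
        out.append([v if v is not None
                    else row[min(valid, key=lambda i: abs(i - x))]
                    for x, v in enumerate(row)])
    return out
```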

As described above, when the characters of an image screen are to be recognized, stripes formed by the character strings of the input image screen are extracted, the direction angle corresponding to the slope of each stripe is calculated, the most frequent direction angle is determined as the rotation angle, and the image screen is rotated by that angle. An image screen in which the tilt of the subject has been corrected can thus be produced. In addition, the blank corners left without pixels by the tilt correction can be filled with a specific pixel brightness value, which reduces recognition errors.

The procedure for correcting the tilt of a subject in an image screen according to the embodiment of the present invention is described below with reference to FIG. 13.

First, an image screen is input in step 1310, and the input image screen is binarized in step 1315. The binarization procedure divides the received image screen into blocks of a preset size and classifies each block as a character block or a background block. Each character block is then grouped with its eight adjacent blocks to form a grouped block, from which a pixel reference value is generated for classifying the pixels of the block image into character pixels and background pixels. The pixels of each character block are compared with the pixel reference value and classified as character pixels or background pixels, while the pixels of the background blocks are all converted to background pixels. In step 1315, therefore, the pixels of the input image screen are binarized into character pixels and background pixels and output.

The binarized image screen is subsampled in the horizontal direction in step 1320. The pixels are subsampled horizontally so that the character strings clump together into stripes in the subsequent candidate stripe generation process. In steps 1325 and 1330, the horizontally reduced image screen is morphologically filtered to generate candidate stripes. That is, in step 1325 the binarized character regions of the image screen are dilated so that neighboring characters touch one another and form candidate stripes, and in step 1330 candidate stripes that became attached to vertically adjacent candidate stripes during dilation are separated. After the morphological filtering, the vertical pixels of the image screen are subsampled in step 1335 to restore the aspect ratio of the original image screen. The vertical pixels of the morphologically filtered binarized image are subsampled so that the tilt angle of the character strings can be obtained from the reduced image.

In step 1340, the candidate stripes in the image screen are labeled, and in step 1345 the eccentricity and blob size of each candidate stripe are calculated to select the stripes from which the direction angle will be calculated. In step 1350, the direction angles of the selected stripes are calculated and their counts are accumulated. When the direction angles of all selected stripes have been calculated, the direction angle with the highest accumulated count is determined in step 1355 as the rotation angle (skew angle) of the image screen.

Once the rotation angle is determined, the input image screen is rotated by the rotation angle in step 1360 to correct the tilt of the subject. The rotation leaves blank areas without pixels at the corners of the tilt-corrected image screen. To correct this, in step 1365 each corner area is filled with the value of the horizontally closest pixel. The image screen whose tilt and corner pixels have been corrected is then output in step 1370 to the image region expansion unit 930, the noise removal unit 940, or the image binarization unit 950.

FIG. 14 is a diagram illustrating the configuration of the image region expansion unit 930 of FIG. 4.

Referring to FIG. 14, the image screen supplied to the image region expansion unit is either the original input image screen or the image screen output from the subject tilt correction unit 920.

The mean filter 1410 mean-filters the input image screen to blur it. The blurring reduces the influence of the background region outside the character region when blocks are classified at the following stage.
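
A mean filter of this kind can be sketched as below. The 3×3 window (`radius=1`), the clipping of the window at the image borders, and the function name are assumptions; the text does not state the filter size.

```python
def mean_filter(img, radius=1):
    """Replace each pixel by the average of its (2*radius+1)^2 neighborhood."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out
```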

The block classification unit 1420 divides the image screen output from the mean filter 1410 into blocks, examines the pixels in each block to classify it as a character block or a background block, and then converts the pixels of the classified character blocks to a specific value. The blocks are classified in this way so that the regions containing characters can be marked with a specific pixel value and the character region extracted. The block is assumed here to have a size of 8×8 pixels, as described above.

The pixel reduction unit (subsampling part) 1430 subsamples the image screen output from the block classification unit 1420 to reduce the number of pixels. The pixels are reduced so that a smaller filter window can be used for the median filtering at the following stage, which speeds up the filtering. In the embodiment of the present invention, the pixel reduction ratio is assumed to be (2:1)². In this case, the pixel reduction unit 1430 subsamples the horizontal pixels 2:1 and the vertical pixels 2:1, so the number of pixels in the output image screen is reduced to 1/4.

The median filter 1440 median-filters the image screen output from the pixel reduction unit 1430 to remove misclassified character blocks. That is, the median filter 1440 removes isolated blocks that were wrongly classified as character blocks during block classification because of noise or similar causes.

The pixel restoration unit (interpolation part) 1450 interpolates the pixels of the image screen output from the median filter 1440 to expand it. In the embodiment of the present invention, the pixel interpolation ratio is assumed to be (2:1)². In this case, the pixel restoration unit 1450 interpolates the horizontal and vertical pixels of the image screen output from the median filter 1440 at 2:1 each, so the size of the output image screen is expanded fourfold. The pixels are restored so that the image screen reduced in the pixel reduction step returns to the size of the original image screen, allowing the exact position of the character region to be found.
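
The reduce, filter, and restore chain of units 1430 to 1450 can be sketched as follows. The sketch assumes a block map whose pixels are 255 in character blocks and 0 in background blocks, a 3×3 median window clipped at the borders, and pixel replication as the simplest form of 2:1 interpolation; these choices and all function names are hypothetical.

```python
from statistics import median_low


def subsample(img, step=2):
    """Keep every `step`-th row and column: (2:1)^2 reduction by default."""
    return [row[::step] for row in img[::step]]


def median_filter(img, radius=1):
    """Median over a window clipped at the borders; removes isolated pixels."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [img[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            row.append(median_low(window))
        out.append(row)
    return out


def upsample(img, factor=2):
    """Restore the size by repeating each pixel `factor` times in both directions."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def clean_block_map(img):
    """Reduce, median-filter, then restore to the original size."""
    return upsample(median_filter(subsample(img)))
```

Filtering the reduced map means a 3×3 window covers the same area that a window roughly twice as wide would cover at full resolution, which is the speed benefit the text describes.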

The position search unit 1460 scans the median-filtered image screen in the horizontal and vertical directions to find the position of the character region. It scans horizontally to find the position of the leftmost character block (x1) and of the rightmost character block (x2), and scans vertically to find the position of the uppermost character block (y1) and of the lowermost character block (y2), and then determines the position of the character region in the image screen from these results. The upper-left and lower-right corners of the character region are (x1, y1) and (x2, y2). These positions are determined so that the character region has the aspect ratio of the input image screen, in order to prevent distortion when the image expansion unit 1480 at the following stage expands the image.

The character region extraction unit 1470 extracts the image screen of the character region found by the position search unit 1460. That is, it receives the upper-left and lower-right positions (x1, y1) and (x2, y2) of the character region output from the position search unit 1460 and extracts the part of the image screen lying within those positions. The image screen output from the character region extraction unit 1470 is therefore the image screen of the character region, with the background region removed from the input image screen.
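
The position search and extraction can be sketched as below. The text does not say how the box is adjusted to the aspect ratio of the input image screen, so this sketch assumes the shorter side is grown (keeping the upper-left corner where possible and clamping to the image); the mask values and the function names are also hypothetical.

```python
def find_text_region(mask, white=255):
    """Return (x1, y1, x2, y2) of the character blocks, grown to the image aspect ratio."""
    h, w = len(mask), len(mask[0])
    xs = [x for row in mask for x, v in enumerate(row) if v == white]
    ys = [y for y, row in enumerate(mask) if white in row]
    if not xs:
        return None  # no character block found
    x1, y1 = min(xs), min(ys)
    box_w, box_h = max(xs) - x1 + 1, max(ys) - y1 + 1
    if box_w * h < box_h * w:
        # Box is too narrow for the w:h ratio: widen it (ceiling division).
        box_w = min(w, -(-box_h * w // h))
    else:
        # Box is too flat (or already exact): heighten it.
        box_h = min(h, -(-box_w * h // w))
    x1 = max(0, min(x1, w - box_w))
    y1 = max(0, min(y1, h - box_h))
    return x1, y1, x1 + box_w - 1, y1 + box_h - 1


def crop(img, box):
    """Extract the pixels between (x1, y1) and (x2, y2), inclusive."""
    x1, y1, x2, y2 = box
    return [row[x1:x2 + 1] for row in img[y1:y2 + 1]]
```

The mask and the image passed to `crop` are assumed to have the same dimensions, which is why the block map is restored to full size before the search.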

The image expansion unit 1480 expands the extracted image screen of the character region to the size of the input image screen. The expansion can be implemented by interpolation, and in the embodiment of the present invention it is assumed to be implemented by bilinear interpolation. The interpolation is performed so that the expanded image has the same size as the input image screen.

The operation of the image region expansion unit 930 is described in detail below with reference to FIG. 14.

The input image screen has a size of N×M. The input image may be a color image or a gray image without color information. In the embodiment of the present invention, the image screen is assumed to be a gray image.

The mean filter 1410 receives the image screen and mean-filters it to blur it, so that the block classification unit 1420 at the following stage is less affected by the background region outside the character region when it classifies the character region. Mean filters of this kind are described in R. C. Gonzalez and R. Woods, "Digital Image Processing" [2nd ed., Prentice Hall, pp. 119-123, 2002].

The mean-filtered image screen is applied to the block classification unit 1420. The block classification unit 1420 divides the image screen output from the mean filter 1410 into blocks, examines the pixels in each block to classify it as a character block or a background block, and then converts the pixels of the classified character blocks to a specific value.

FIG. 15 is a diagram illustrating the configuration of the block classification unit 1420 of FIG. 14. The block classification unit 1420 can be configured in the same way as the block classification unit 1110 of the image blurring determination unit 910. The block classification unit 1420 of FIG. 15 therefore has the same configuration as the block classification unit 1110 of FIG. 6, and it classifies the blocks of the image screen in the same way.

The pixels of the character blocks classified by the block decision unit 1119 of FIG. 15 can have gray levels from 0 to 255. The block pixel correction unit (block filling part) 1421 converts the pixels of the character blocks classified by the block decision unit 1119 to pixels with a first brightness value and the pixels of the background blocks to pixels with a second brightness value. In the embodiment of the present invention, it is assumed that the block pixel correction unit 1421 converts the pixels of character blocks to white pixels and the pixels of background blocks to black pixels. The blocks classified as character blocks are thus filled with white pixels and the blocks classified as background blocks with black pixels. The block classification unit 1420 fills the two kinds of blocks with different brightness values in order to mark the character regions of the image screen.
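
The block filling step can be sketched as follows. The classification itself is taken as given: `labels` is a hypothetical grid with one boolean per block (True for a character block), and the values 255 and 0 stand for the white and black pixels of the embodiment.

```python
def fill_blocks(labels, block=8, char_value=255, bg_value=0):
    """Expand a grid of block labels into a pixel map of filled blocks."""
    out = []
    for label_row in labels:
        pixel_row = []
        for is_char in label_row:
            pixel_row.extend([char_value if is_char else bg_value] * block)
        # Each block row becomes `block` identical pixel rows.
        out.extend(list(pixel_row) for _ in range(block))
    return out
```

For a 640×480 image screen and 8×8 blocks, `labels` would hold 60 rows of 80 entries.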

The pixel reduction unit 1430 then subsamples the image screen output from the block classification unit 1420 to reduce the number of horizontal and vertical pixels. The pixels are reduced so that the median filter 1440 at the following stage can use a smaller filter window, which speeds up the filtering. In the embodiment of the present invention, the pixel reduction ratio is assumed to be (2:1)². In this case, the number of pixels in the image screen output from the block classification unit 1420 is reduced to 1/4, and the reduced image screen has a size of 320×240 pixels.

The median filter 1440 then median-filters the image screen output from the pixel reduction unit 1430 to remove background blocks and misclassified character blocks. That is, the median filter 1440 removes isolated blocks that were wrongly classified as character blocks during block classification because of noise or similar causes. Median filters of this kind are described in A. K. Jain, "Fundamentals of Digital Image Processing" [Prentice Hall, pp. 246-249].

After the image screen has been median-filtered, the pixel restoration unit (interpolation part) 1450 interpolates the horizontal and vertical pixels of the image screen output from the median filter 1440, expanding it to the size of the input image screen. In the embodiment of the present invention, the pixel interpolation ratio is assumed to be (2:1)². The pixels are restored so that the image screen reduced in the pixel reduction step returns to the size of the original image screen, allowing the exact position of the character region to be found.

The position search unit 1460 scans the median-filtered image screen in the horizontal and vertical directions to find the position of the character region. It first scans horizontally to find the position of the leftmost character block (x1) and of the rightmost character block (x2) and stores the results. It then scans vertically to find the position of the uppermost character block (y1) and of the lowermost character block (y2) and stores the results. From these results it determines the upper-left and lower-right positions (x1, y1) and (x2, y2) of the character region in the image screen. The positions are determined so that the character region has the aspect ratio of the input image screen, in order to prevent distortion when the image expansion unit 1480 at the following stage expands the image. In the embodiment of the present invention, the input image screen has a width-to-height ratio of 4:3 (640 pixels : 480 pixels), so the position search unit 1460 determines (x1, y1) and (x2, y2) such that the character region also has a width-to-height ratio of 4:3.

The character region extraction unit 1470 extracts the image screen of the character region found by the position search unit 1460. That is, it receives the upper-left and lower-right positions (x1, y1) and (x2, y2) of the character region output from the position search unit 1460 and extracts, from the image screen output from the input unit 10, the part lying within those positions. Using (x1, y1) and (x2, y2), the character region extraction unit 1470 extracts as character-region pixels those pixels that lie between x1 and x2 in the horizontal direction and between y1 and y2 in the vertical direction. The image screen output from the character region extraction unit 1470 is the image screen of the character region, with the background region removed from the input image screen.

The image expansion unit 1480 enlarges the extracted character-region image to the size of the input image. The enlargement can be implemented by interpolation; this embodiment assumes bilinear interpolation, as given by Equation 12 below.

The interpolation is performed so that the enlarged image has the same size and aspect ratio as the input image. Bilinear interpolation is described in W. H. Press, S. A. Teukolsky, et al., "Numerical Recipes in C" [2nd ed., Cambridge, pp. 123-125, 1988].
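As an illustration of bilinear enlargement, the sketch below resizes a small grayscale image stored as a list of rows. It is the textbook form of bilinear interpolation rather than a reproduction of Equation 12, which is not legible in this text; the function name `bilinear_resize` and the corner-aligned coordinate mapping are assumptions.

```python
def bilinear_resize(img, out_w, out_h):
    """Resize a grayscale image (list of rows) by bilinear interpolation."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for j in range(out_h):
        # Map the output row onto the source grid (corners map to corners).
        fy = j * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        dy = fy - y0
        row = []
        for i in range(out_w):
            fx = i * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            dx = fx - x0
            # Interpolate along x on the two bracketing rows, then along y.
            top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
            bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out


small = [[0, 10], [20, 30]]
large = bilinear_resize(small, 3, 3)
```

Each output pixel is a weighted average of the four nearest source pixels, so the centre of the 3×3 result is the average of all four inputs.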

FIG. 16 is a diagram illustrating a procedure for expanding an image region according to an embodiment of the present invention.

Referring to FIG. 16, the image is input in step 1510. In step 1515 the image is mean-filtered to produce a blurred image. As described above, this reduces the influence of the background outside the character region on the block classification.

In step 1520 the mean-filtered image is divided into blocks of a predetermined size, the pixels in each block are examined to classify the block as a character block or a background block, and the pixels of the classified blocks are converted to specific values: character blocks are converted to white pixels and background blocks to black pixels. The image is thus filled with white or black pixels according to the block classification.

Once this image has been generated in step 1520, it is subsampled in step 1525 to produce an image with fewer horizontal and vertical pixels. The pixel count is reduced so that the following median filter can use a smaller filter window and therefore run faster. In step 1530 the reduced image is median-filtered, which removes isolated character blocks that remain because of misclassification caused by image borders or noise. After the misclassified character blocks have been removed, step 1535 interpolates the horizontal and vertical pixels of the median-filtered image to restore it to the size of the input image.
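The subsampling and median-filtering steps can be sketched as below. The subsampling factor, the 3×3 window, and the choice to leave border pixels unfiltered are assumptions made for the example; the names `subsample` and `median_filter3` are hypothetical.

```python
from statistics import median


def subsample(img, step=2):
    """Keep every `step`-th pixel in both directions (step is an assumed value)."""
    return [row[::step] for row in img[::step]]


def median_filter3(img):
    """3x3 median filter; border pixels are copied unchanged in this sketch."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


# A lone white pixel (an isolated, misclassified block) on a black background.
noisy = [[0] * 5 for _ in range(5)]
noisy[2][2] = 255
cleaned = median_filter3(noisy)
```

The lone white pixel has eight black neighbours, so the median of its window is black and the pixel is removed, which is the behaviour the step relies on.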

In step 1540 the image restored to its original size is scanned horizontally and vertically to locate the character region. The horizontal scan finds the position (x1) of the leftmost character block and the position (x2) of the rightmost character block; the vertical scan finds the position (y1) of the topmost character block and the position (y2) of the bottommost character block. In step 1545 the upper-left and lower-right corners (x1,y1) and (x2,y2) of the character region are determined from these results so that the region has the aspect ratio of the input image. This prevents distortion when the image is enlarged in the following step.

After the character region has been located, step 1550 extracts from the input image the part lying within the corners (x1,y1) and (x2,y2), that is, the pixels between x1 and x2 horizontally and between y1 and y2 vertically. The extracted image is the input image with the background region removed.

In step 1555 the image expansion unit 1480 enlarges the character-region image to the size of the input image. The enlargement can be implemented by interpolation, for example bilinear interpolation as in this embodiment. In step 1560 the enlarged image is output to the recognizer, or stored for other uses.

The operation of the noise reduction unit 940 of FIG. 4 is described next.

An image of a subject captured with a digital camera or similar device generally contains noise, typically Gaussian noise. Gaussian noise is commonly removed by filtering, and many noise-reduction filters exist for this purpose. In an image of a business card or similar document, however, much of the information lies at the edges of the character regions, so removing noise with a simple noise-reduction filter alone can seriously damage the character information. A special filter is needed that removes image noise while preserving character edges. This embodiment assumes a directional Lee filter, expressed by Equation 13 below.

Equation 13 adapts the filter parameters using the local mean and variance of the image signal. In background regions, as in Equation 14 below, the noise variance is much larger than the local signal variance, so the denoised output is simply the local mean. In edge regions the local signal variance is much larger than the noise variance, so the denoised output is the mean of pixels weighted according to edge direction; noise in the edge region is removed while the edge is preserved.

To remove noise in edge regions while preserving edge components, the output is computed, as shown in Equation 15 below and FIG. 17B, as the sum of the products of the output (y_θ) of a one-dimensional mean filter taken in the direction (0°, 45°, 90°, 135°) orthogonal to each principal edge direction (90°, 135°, 0°, 45°) and the directional weight (w_θ) for that edge direction.

Referring to FIGS. 17A and 17B, Equation 16 shows the one-dimensional mean filter applied within a 3×3 filter window, with n varying from 1 to 4, in the direction (0°, 45°, 90°, 135°) orthogonal to each edge direction. This removes the noise component in each edge region.

The weight that multiplies the output of the one-dimensional mean filter in the direction (0°, 45°, 90°, 135°) orthogonal to each edge direction (90°, 135°, 0°, 45°) is given by Equation 18 below. To obtain the weight (w_θ), the edge strength (D_θ) is first computed for each edge direction (90°, 135°, 0°, 45°) as in Equation 17 below, with n varying from 1 to 4 within the 3×3 filter window as shown in FIGS. 18A to 18D; the weights for the edge directions are then normalized as in Equation 18. Further details are given in N. C. Kim, "Adaptive Image Restoration Using Local Statistics and Directional Gradient Information" [IEE Electronics Letters, Vol. 23, no. 12, pp. 610-611, 4th June 1987].
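The adaptive behaviour described above can be illustrated with a small sketch for a single pixel. It follows the general Lee-filter form, in which the output moves from the local mean toward a direction-weighted estimate as the local signal variance grows relative to the noise variance. Equations 13 to 18 are not legible in this text, so the gain expression, the estimate of signal variance as local variance minus noise variance, and the normalization of weights as each strength divided by their sum are all assumptions; `lee_blend` and `normalize_weights` are hypothetical names.

```python
def normalize_weights(strengths):
    """Scale directional strengths so they sum to 1 (assumed normalization)."""
    total = sum(strengths)
    if total == 0:
        return [1.0 / len(strengths)] * len(strengths)
    return [d / total for d in strengths]


def lee_blend(local_mean, local_var, noise_var, dir_means, dir_weights):
    """Blend the local mean with a direction-weighted estimate for one pixel."""
    # Estimated signal variance: local variance with the noise part removed.
    signal_var = max(local_var - noise_var, 0.0)
    denom = signal_var + noise_var
    gain = signal_var / denom if denom > 0 else 0.0
    # Weighted sum of the four one-dimensional directional mean outputs.
    edge_estimate = sum(w * y for w, y in zip(dir_weights, dir_means))
    return local_mean + gain * (edge_estimate - local_mean)


weights = normalize_weights([1, 1, 2, 0])
flat = lee_blend(100.0, 4.0, 4.0, [90, 110, 95, 105], [0.25] * 4)
edge = lee_blend(100.0, 10004.0, 4.0, [120, 120, 120, 120], [0.25] * 4)
```

In the flat case the gain is zero and the output equals the local mean; in the edge case the gain is close to one and the output is close to the directional estimate.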

FIG. 19 is a diagram showing the configuration of the image binarization unit 950 of FIG. 4.

Referring to FIG. 19, the block classification unit 1610 divides the input image into blocks and examines the pixels in each block to classify it as a character block or a background block. Blocks are classified so that binarization is performed only on regions that contain characters. As before, each block is assumed to be 8×8 pixels.

The block growing unit 1620 extends the character blocks classified by the block classification unit 1610. During classification, a block containing character pixels may be classified as a background block because of the background between characters; growing the character blocks brings such misclassified blocks back into the character blocks.

The block grouping unit 1630 receives the character blocks output by the block growing unit 1620 and groups each character block with its eight adjacent blocks to form a grouped block. If the threshold separating background pixels from character pixels were determined from a single 8×8 character block, the block would be so small that its threshold could differ greatly from those of neighboring character blocks, producing discontinuities between blocks in the binarized image. Grouping enlarges the region around each character block and so improves the reliability of its binarization.

The edge enhancement unit 1640 uses the relationship between each character pixel of the grouped block output by the block grouping unit 1630 and its neighboring pixels to produce pixels with enhanced edges and reduced noise, and also generates the pixel threshold used to binarize those pixels. The edge enhancement unit 1640 may use a quadratic filter or an improved quadratic filter.

The block splitting unit 1650 receives the grouped block output by the edge enhancement unit 1640 and separates the character block from it. That is, from the block grouped by the block grouping unit 1630 it extracts only the character block to be binarized.

The binarization unit 1660 compares the pixels of the character blocks separated by the block splitting unit 1650 with the pixel threshold and binarizes them to a first brightness value for character pixels and a second brightness value for background pixels. It also binarizes the pixels of the background blocks output by the block classification unit 1610 to the second brightness value. A compression unit may be added that compresses the binarized image before it is sent to the character recognition unit 123 of FIG. 1, so that storage space is used more efficiently.

The image processed by the binarization unit 1660 is input to the character recognition unit 123 of FIG. 1, where the characters are recognized.

The input image is divided into blocks by the block classification unit 1610 and then classified into character blocks and background blocks.

FIG. 20 is a diagram showing the configuration of the block classification unit 1610. The block classification unit 1610 can be configured in the same way as the block classification unit 1110 of the image blur determination unit 910. The block classification unit 1610 of FIG. 20 therefore has the same configuration as the block classification unit 1110 of FIG. 6 and classifies the blocks of the image in the same way. The pixels of the character blocks classified by the block classification unit 1610 can have gray levels from 0 to 255.

The block growing unit 1620 grows the region of the classified character blocks. In the block classification unit 1610, a block containing character pixels may be classified as a background block because of the background between characters. Growing the character blocks extends them so that background blocks containing character pixels are changed into character blocks.

The block growing unit 1620 can be implemented with a morphological filter. The morphological filter grows the character blocks by a closing operation, that is, a dilation of the character blocks followed by an erosion. A closing fills holes inside a region: the dilation first extends the character blocks, so that isolated background blocks lying between character blocks are converted into character blocks, and the erosion then restores the original block size. Morphological filters are described in R. C. Gonzalez and R. Woods, "Digital Image Processing" [2nd ed., Prentice Hall, pp. 519-560, 2002]. During block growing, the block growing unit 1620 thus changes background blocks that contain character pixels into character blocks.
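A closing on the block map can be sketched as follows. The example assumes a grid of 0 and 1 values with one cell per block, where 1 marks a character block, and a 3×3 structuring element; neither the structuring element nor the padding strategy is specified in the text, and the function names are hypothetical.

```python
def _window(mask, y, x):
    """3x3 neighbourhood of (y, x); cells outside the grid count as background."""
    h, w = len(mask), len(mask[0])
    return [mask[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
            for yy in (y - 1, y, y + 1) for xx in (x - 1, x, x + 1)]


def dilate(mask):
    """A cell becomes 1 if any cell in its 3x3 neighbourhood is 1."""
    return [[int(any(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def erode(mask):
    """A cell stays 1 only if every cell in its 3x3 neighbourhood is 1."""
    return [[int(all(_window(mask, y, x))) for x in range(len(mask[0]))]
            for y in range(len(mask))]


def close_blocks(mask):
    """Closing = dilation followed by erosion, padded so borders are not lost."""
    w = len(mask[0])
    padded = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in mask] + [[0] * (w + 2)]
    closed = erode(dilate(padded))
    return [row[1:-1] for row in closed[1:-1]]


# A ring of character blocks around one misclassified background block.
blocks = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
grown = close_blocks(blocks)
```

The background block enclosed by character blocks becomes a character block, while the outer extent of the character region is unchanged.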

The block grouping unit 1630 groups each character block output by the block growing unit 1620 with its eight adjacent blocks to produce a grouped block of 24×24 pixels. A character block is 8×8 pixels; if the threshold separating background pixels from character pixels were determined from one such block alone, the block would be so small that its threshold could differ greatly from those of neighboring character blocks, producing discontinuities between blocks in the binarized image. Generating grouped blocks enlarges the region used for binarization and so improves its reliability. The grouped block containing the character block, output by the block grouping unit 1630, is applied to the edge enhancement unit 1640.
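Grouping a character block with its eight neighbours, and later recovering the centre block, amounts to slicing the image. The sketch below assumes the image is a list of pixel rows and that blocks are indexed by block row and block column. Clamping at the image border, which yields a smaller group for edge blocks, is an assumption, since the text does not describe border handling; `group_block` and `split_block` are hypothetical names.

```python
def group_block(image, by, bx, block=8):
    """Return the region covering block (by, bx) and its eight neighbours."""
    h, w = len(image), len(image[0])
    y0, y1 = max(0, (by - 1) * block), min(h, (by + 2) * block)
    x0, x1 = max(0, (bx - 1) * block), min(w, (bx + 2) * block)
    return [row[x0:x1] for row in image[y0:y1]]


def split_block(group, by, bx, block=8):
    """Recover the centre block from a group built by group_block."""
    # The centre block is offset by one block unless it lies on the border.
    oy = block if by > 0 else 0
    ox = block if bx > 0 else 0
    return [row[ox:ox + block] for row in group[oy:oy + block]]


image = [[y * 24 + x for x in range(24)] for y in range(24)]
grouped = group_block(image, 1, 1)      # 24 x 24 pixels
centre = split_block(grouped, 1, 1)     # 8 x 8 pixels
```

For an interior block the group is 24×24 pixels, matching the size given in the text, and splitting returns exactly the original 8×8 block.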

The edge enhancement unit 1640 may use a quadratic filter (QF) or an improved quadratic filter (IQF). The operation of enhancing edge components with the improved quadratic filter is described here. As shown in FIG. 21, the improved quadratic filter normalizes the character block and then enhances the edges of the normalized block; it also normalizes the threshold computed from the character block to generate the threshold BTH_N used to binarize the pixels of the character block.

First, the operation of enhancing the edges of a character block with the improved quadratic filter is described with reference to FIG. 21.

Referring to FIG. 21, the first threshold calculation unit 1621 calculates a first threshold Th1 for classifying each pixel of the character block as a character pixel or a background pixel. Th1 is used to separate character pixels from background pixels so that the two kinds of pixels can be normalized in the next stage. Th1 is chosen as the gray value that maximizes the between-class variance of the two kinds of pixels. It can be obtained by Otsu's method or Kapur's method. With Otsu's method, Th1 is calculated by Equation 19, as described in N. Otsu, "A Threshold Selection Method from Gray-Level Histogram" [IEEE Trans. on Systems, Man and Cybernetics, Vol. SMC-9, no. 1, pp. 62-66, Jan. 1979].
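Otsu's method can be sketched as an exhaustive search over gray levels for the threshold that maximizes the between-class variance. The example below is a generic illustration for 8-bit pixels, not a transcription of Equation 19; the function name `otsu_threshold` and the convention that the lower class holds gray values up to and including the threshold are assumptions.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0       # number of pixels with gray level <= t
    sum0 = 0     # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        # Between-class variance, up to the constant factor 1 / total**2.
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


block_pixels = [10] * 5 + [200] * 5
th1 = otsu_threshold(block_pixels)
```

For a block with two well-separated gray levels, any threshold between them gives the same variance; this sketch returns the lowest such level.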

The mean calculation unit 1623 classifies the pixels of the character block into character pixels and background pixels using the first threshold Th1 and then calculates the mean brightness of each class. In this mean computation for two classes, the pixels of the character block x(m,n) are first classified as character pixels (CP) or background pixels (BP) by comparison with Th1, as in Equation 20 below; the mean brightness μ₀ of the character pixels and the mean brightness μ₁ of the background pixels are then calculated as in Equation 21 below.

In Equation 20, x(m,n) denotes the character block and Th1 is the threshold for classifying its pixels as character pixels or background pixels.

In Equation 21, S_c is the sum of the brightness values of the character pixels, N_c is the number of character pixels, S_b is the sum of the brightness values of the background pixels, and N_b is the number of background pixels.

The normalization unit 1625 then normalizes the pixels of the input character block x(m,n) using the mean brightness μ₀ of the character pixels and the mean brightness μ₁ of the background pixels output by the mean calculation unit 1623, so that character pixels take values close to 1 and background pixels take values close to 0. Normalization reduces the dynamic range of the brightness values of the input character block. The normalization unit 1625 normalizes the pixels of x(m,n) according to Equation 22 below.

Here x_N(m,n) denotes the normalized character block, μ₀ the mean brightness of the character pixels, and μ₁ the mean brightness of the background pixels.

The normalized character block x_N(m,n) is then processed by the quadratic processing unit 1627, which uses the relationship between each normalized pixel and its neighboring pixels to enhance the edges of the character block and reduce noise. FIG. 22 shows the center pixel and the neighboring pixels processed by the quadratic processing unit 1627, and Equation 23 gives the quadratic operation that enhances edges and reduces noise. By widening the gray-level difference, the quadratic processing unit 1627 renders character pixels darker and background pixels brighter, sharpening character edges while removing noise.

To generate the threshold BTH_N with which the binarization unit 1660 binarizes the pixels of the character block, the threshold normalization unit 1631 normalizes the first threshold Th1 calculated by the first threshold calculation unit 1621 to produce a second threshold Th2. Th2 is used by the binarization unit 1660 as the pixel threshold BTH_N.

The threshold normalization unit 1631 normalizes the first threshold Th1 in the same way as the normalization unit 1625 normalizes pixels, as in Equation 24 below, to produce the second threshold Th2 (the threshold BTH_N).

In Equation 24, Th2 is the normalized threshold BTH_N with which the binarization unit 1660 separates character pixels from background pixels, x_N(m,n) denotes the normalized character block, μ₀ is the mean brightness of the character pixels, and μ₁ is the mean brightness of the background pixels.
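The two-class means and the two normalization steps (pixels in the normalization unit 1625, threshold in the threshold normalization unit 1631) can be sketched together. Equations 20 to 22 and 24 are not legible in this text, so the mapping used here, (value − μ₁) / (μ₀ − μ₁), is inferred only from the stated requirement that character pixels end up near 1 and background pixels near 0. Treating pixels at or above Th1 as character pixels is likewise an assumption, and `class_means` and `normalize` are hypothetical names.

```python
def class_means(block, th1):
    """Return (mu0, mu1): mean brightness of character and background pixels."""
    chars = [p for row in block for p in row if p >= th1]   # assumed polarity
    backs = [p for row in block for p in row if p < th1]
    if not chars or not backs:
        raise ValueError("block must contain both classes of pixels")
    return sum(chars) / len(chars), sum(backs) / len(backs)


def normalize(value, mu0, mu1):
    """Map mu1 to 0 and mu0 to 1 (inferred form of the normalization)."""
    return (value - mu1) / (mu0 - mu1)


block = [[200, 200], [50, 50]]
th1 = 100
mu0, mu1 = class_means(block, th1)
x_n = [[normalize(p, mu0, mu1) for p in row] for row in block]
th2 = normalize(th1, mu0, mu1)   # the threshold goes through the same mapping
```

Because the pixels and the threshold pass through the same mapping, comparing a normalized pixel with Th2 gives the same decision as comparing the original pixel with Th1.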

As described above, the edge enhancement unit 1640 configured as in FIG. 21 normalizes the character and background pixels of the received character block (or grouped block containing the character block) to reduce its dynamic range, and applies quadratic processing to the normalized pixels to enhance the edges of the block. Because the block output by the quadratic processing unit 1627 is normalized, the first threshold is normalized as well to produce the threshold BTH_N for binarizing the pixels of the character block.

The edge enhancement unit 1640 of the image binarization unit 950 can thus be implemented with the improved quadratic filter of FIG. 21. With the improved quadratic filter, the edge enhancement unit 1640 enhances edges while avoiding the black blocks that otherwise appear around characters in the image obtained by binarizing the character block (or grouped block containing the character block).

The output of the edge enhancement unit 1640 is a grouped block, which is applied to the block splitting unit 1650. The block splitting unit 1650 separates the image of the character block from the grouped block and outputs it, undoing the grouping of neighboring blocks performed by the block grouping unit 1630.

The character block output by the block splitting unit 1650 is input to the binarization unit 1660, which also receives the threshold output by the edge enhancement unit 1640 for binarizing the pixels of the character block. The character block input to the binarization unit 1660 is either y(m,n) (the character block output by the quadratic filter, as in FIG. 21) or y_N(m,n) (the character block output by the improved quadratic filter, as in FIG. 21). The pixel threshold is correspondingly BTH or BTH_N.

The binarization unit 1660 uses the threshold to classify each pixel of the received character block as a background pixel or a character pixel, and converts the classified pixels to one of two brightness values. That is, when a character block is input, the binarization unit 1660 compares its pixels with the corresponding threshold; a pixel whose value is greater than or equal to the threshold is classified as a character pixel, and a pixel whose value is smaller is classified as a background pixel. Character pixels are then converted to the brightness value α and background pixels to the brightness value β. This binarization is given by Equation 25 below.

In Equation 25, y(m,n) and BTH are the character block and the reference value output from the quadratic filter, y_N(m,n) and BTH_N are the character block and the reference value output from the improved quadratic filter, and y_B(m,n) is the binarized character block.

The binarization unit 1660 also receives the background block images output from the block classification unit 1610 or the block growing unit 1620, and converts all pixels of each background block to the brightness value β at once.
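The binarization rule described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of the binarization unit 1660: the concrete brightness values chosen for α and β (255 and 0) and the function names are assumptions made only for the example, since the description does not fix them.

```python
# Assumed brightness values; the description only names them α and β.
ALPHA = 255  # hypothetical value for character pixels
BETA = 0     # hypothetical value for background pixels


def binarize_character_block(block, threshold, alpha=ALPHA, beta=BETA):
    """Pixels >= threshold become alpha (character); all others become beta."""
    return [[alpha if pixel >= threshold else beta for pixel in row]
            for row in block]


def binarize_background_block(block, beta=BETA):
    """Every pixel of a background block is converted to beta."""
    return [[beta for _ in row] for row in block]


# Example with an edge-enhanced block and its reference value (BTH or BTH_N).
enhanced = [[0.9, 0.2], [0.5, 0.1]]
print(binarize_character_block(enhanced, threshold=0.5))
```

The same function serves both filter variants, because only the block and its matching reference value (y with BTH, or y_N with BTH_N) differ.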

FIG. 23 is a diagram illustrating the binarization procedure when the edge enhancement unit 1640 is implemented with the improved quadratic filter.

Referring to FIG. 23, an image screen is input in step 1711. In step 1713, the block classification unit 1610 divides the input image screen into blocks, examines the pixels contained in each block, and classifies the blocks into character blocks and background blocks.

In step 1715, the block growing unit 1620 expands the character blocks classified by the block classification unit 1610. During block classification, a block containing character pixels may be classified as a background block because of the background between characters; the character blocks are grown so that pixels wrongly placed in background blocks are brought into character blocks. In step 1717, the block growing unit 1620 sequentially outputs the character blocks of the grown image screen to the block grouping unit 1630. The images output to the block grouping unit 1630 at this point are character blocks. In step 1719, the block grouping unit 1630 receives the character blocks output from the block growing unit 1620 and groups each character block with its eight adjacent blocks to generate a grouped block.

The grouped block images are input to the edge enhancement unit 1640, which here is the improved quadratic filter described above. The improved quadratic filter operates as follows. In step 1721, a first reference value Th1 for classifying each pixel of the character block as a character pixel or a background pixel is calculated. The first reference value Th1 can be obtained as in Equation 19. In step 1723, the calculations of Equations 20 and 21 are performed: the pixels of the character block are classified into character pixels and background pixels using the first reference value Th1, and the mean brightness values of the character pixels and of the background pixels are calculated. In step 1725, the pixels of the input character block x(m,n) are normalized using the mean brightness value μ₀ of the character pixels and the mean brightness value μ₁ of the background pixels, so that character pixels take values close to 1 and background pixels take values close to 0. The normalization of the pixels of x(m,n) is performed according to Equation 22.

In step 1727, the normalized character block x_N(m,n) undergoes improved quadratic processing in the improved quadratic processing unit 1627, which enhances the edges of the character block and reduces noise. The quadratic processing performs the calculation of Equation 23. In step 1751, the first reference value Th1 is normalized as in Equation 25, in the same manner as the normalization above, to generate the second reference value Th2 (the pixel reference value BTH_N).
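The classification, mean calculation, and normalization of steps 1723 and 1725, together with the normalization of the reference value, can be sketched as below. Equations 19 to 22 are not reproduced in this passage, so the sketch makes two explicit assumptions: pixels greater than or equal to Th1 are treated as character pixels, and normalization is the linear mapping (x − μ₁)/(μ₀ − μ₁), which sends the character mean to 1 and the background mean to 0. The quadratic processing of Equation 23 is not modeled, and all function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def normalize_block(block, th1):
    """Return (normalized block, normalized threshold, mu0, mu1).

    Assumption: pixels >= th1 are character pixels, the rest background.
    Assumption: normalization is (x - mu1) / (mu0 - mu1).
    """
    pixels = [p for row in block for p in row]
    char_pixels = [p for p in pixels if p >= th1]
    back_pixels = [p for p in pixels if p < th1]
    if not char_pixels or not back_pixels:
        raise ValueError("block must contain both character and background pixels")

    mu0 = mean(char_pixels)   # mean brightness of character pixels
    mu1 = mean(back_pixels)   # mean brightness of background pixels
    span = mu0 - mu1          # positive, since every character pixel exceeds every background pixel

    normalized = [[(p - mu1) / span for p in row] for row in block]
    th2 = (th1 - mu1) / span  # the threshold is normalized the same way as the pixels
    return normalized, th2, mu0, mu1


block = [[200, 40], [180, 60]]
normalized, th2, mu0, mu1 = normalize_block(block, th1=120)
print(mu0, mu1, th2)
```

Normalizing the threshold with the same mapping keeps it comparable with the normalized pixel values when the block is later binarized.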

In step 1733, the grouped block that has undergone improved quadratic processing is received, and the character block is separated from the grouped block and output. That is, the block separation process extracts only the character block located at the center of the grouped block. In step 1735, the pixels of the separated character block are compared with the pixel reference value BTH_N and binarized to the first brightness value for character pixels and the second brightness value for background pixels. The pixels of the background blocks produced by the block classification process or the block growing process are binarized to the second brightness value.
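The grouping of step 1719 and the separation of step 1733 can be illustrated with the sketch below. It assumes square blocks of a caller-supplied size and fills neighbors that fall outside the image with a constant; neither the block size nor the border handling is stated in this passage, so both are assumptions, as are the function names.

```python
def group_block(image, block_row, block_col, size, fill=0):
    """Return the 3x3-block region centered on block (block_row, block_col).

    The result is (3 * size) x (3 * size) pixels. Pixels outside the image
    are set to `fill` (an assumed border policy).
    """
    height, width = len(image), len(image[0])
    top = (block_row - 1) * size
    left = (block_col - 1) * size
    grouped = []
    for y in range(top, top + 3 * size):
        row = []
        for x in range(left, left + 3 * size):
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else fill)
        grouped.append(row)
    return grouped


def separate_center_block(grouped, size):
    """Extract the central size x size block from a grouped block."""
    return [row[size:2 * size] for row in grouped[size:2 * size]]


# A 4x4 image split into 2x2 blocks; group around block (0, 0), then separate.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
grouped = group_block(image, 0, 0, size=2)
print(separate_center_block(grouped, size=2))  # [[0, 1], [4, 5]]
```

Grouping gives the edge enhancement filter context from the surrounding blocks, while separation ensures that only the original character block is passed on for binarization.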

The binarization of character blocks and background blocks is performed by repeating the above operations. When binarization of all blocks of the image screen is complete, this is detected in step 1737, and the binarized image screen is output in step 1739.

FIGS. 24A and 24B illustrate the procedures for preprocessing, recognition, and item selection of a document image performed in steps 210 to 230 of FIG. 2, and FIGS. 27A and 27B show the results produced while these procedures are performed. Here the document is assumed to be a business card, and the storage items are assumed to be items that can be stored in a phone book.

The document recognition keys, which generate a document recognition command, are preferably provided per type of frequently used document. For example, the character information printed on a business card is information that can be stored in the phone book of a portable terminal. A business card carries a company name, department, position, name, office telephone number, e-mail address, mobile phone number, and so on. When registering phone book information in a portable terminal, it is therefore very convenient to build the phone book by recognizing a person's information from a business card. Accordingly, when recognizing the character image of a document such as a business card, it is convenient to set up in advance a table that allocates the items of the business card and the areas for storing the information of those items, so that when the corresponding business card recognition key is input, the controller 101 detects that the document to be recognized is a business card and automatically displays the storage items of the business card, allowing items to be selected and registered. In the embodiment of the present invention, document recognition keys are provided per document type, tables are allocated in advance for the items of each document type, and when a document recognition key is received, the items of the table for the corresponding document can be displayed. To recognize a document of a type that has not been set up in advance, a document recognition key is selected and the items are set manually before processing. The embodiment described below assumes that the document is a business card.
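One way to picture the preset tables is a mapping from a document recognition key to the list of items for that document type. The sketch below is illustrative only: the key name `business_card`, the dictionary layout, and the empty-list fallback for document types without a preset table are assumptions. The item names follow the business card fields listed above.

```python
# Hypothetical preset tables, keyed by document recognition key.
ITEM_TABLES = {
    "business_card": [
        "company name",
        "department",
        "position",
        "name",
        "office telephone number",
        "e-mail address",
        "mobile phone number",
    ],
}


def items_for_key(document_key):
    """Return the preset items for a document key.

    An empty list signals that no table exists, so the items would have
    to be set manually by the user.
    """
    return list(ITEM_TABLES.get(document_key, []))


print(items_for_key("business_card"))
print(items_for_key("receipt"))  # no preset table: []
```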

Referring to FIG. 24A, until a business card recognition key is generated in step 413, the controller 101 displays the stored business card image on the display unit 115 in step 411, as shown in FIG. 26E. When the user generates the business card recognition key through the input unit 113 in this state, the controller 101 detects it in step 413 and, in step 415, drives the preprocessing unit 121 to preprocess the displayed document image. The preprocessing may be performed by the configuration shown in FIG. 4. If the image is determined to be a blurred image during preprocessing, the controller 101 may stop the subsequent procedure and request selection of a new document image.

If the image is determined not to be blurred, the controller 101 drives the character recognition unit 123 in step 417 to recognize the character images in the preprocessed document image. The character recognition unit 123 converts the displayed business card image of FIG. 26E into character data (text), and the controller 101 displays the converted character data on the display unit 115 as shown in FIG. 27A. For character recognition, the terminal must be equipped with multiple recognizers, because documents such as business cards may contain Korean characters, English letters, digits, special symbols, Chinese characters, or characters of other languages. The appropriate recognizer program must therefore be selected according to the type of characters to be recognized. In this embodiment, the characters to be recognized are assumed to be English letters, and the recognizer is assumed to be the FineReader 5.0 office trial version (company: ABBYY, mainly recognizes English language) mentioned above.

Once the business card image has been converted into character data, the controller 101, in step 419, displays the character data of the business card image in the first display area 71 of the display unit 115, displays the item selection in the third display area 73, and displays the items available for storage in the second display area 75, as shown in FIG. 27A. The items displayed in the second display area 75 may include name, company, position, office telephone number, mobile phone number, home telephone number, facsimile number, e-mail address, company address, other, and add item. In the state shown in FIG. 27A, when the user selects character data (a sentence) in the first display area 71 with the stylus pen and selects a storage item in the second display area 75, as shown in FIG. 27B, the controller 101 detects this in step 421 and, in step 423, displays the selected item and the corresponding character data in the third display area 73 of the display unit 115, as shown in FIG. 27B. If a confirm key is then generated from the input unit 113, the controller 101 detects it in step 425 and proceeds to step 427 to register the selected item and its character data. If a correction key is generated from the input unit 113, the controller 101 detects it in step 429 and proceeds to step 431 to perform the error correction process of FIG. 25A described below. The corrected data of the item is then registered through the confirmation process of steps 425 and 427. When a completion key is input from the input unit 113, the controller 101 detects it in step 433 and, in step 435, displays the selected items and the corresponding character data.
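The key handling of steps 421 to 435 behaves like a small event loop. The following sketch models it under stated assumptions: events are represented as tuples, the correction process of FIG. 25A is stood in for by a caller-supplied `correct` function, and the sample strings are invented example data. None of these names come from the original description.

```python
def run_item_selection(events, correct):
    """Process selection events and return the registered {item: text} map.

    events  -- sequence of tuples: ("select", item, text), ("confirm",),
               ("correct",) or ("done",)  (assumed representation)
    correct -- callable standing in for the error correction process
    """
    registered = {}
    pending = None  # (item, text) currently shown for confirmation
    for event in events:
        kind = event[0]
        if kind == "select":                  # steps 421 and 423
            pending = (event[1], event[2])
        elif kind == "confirm" and pending:   # steps 425 and 427
            item, text = pending
            registered[item] = text
            pending = None
        elif kind == "correct" and pending:   # steps 429 and 431
            item, text = pending
            pending = (item, correct(text))   # still needs a confirm to register
        elif kind == "done":                  # steps 433 and 435
            break
    return registered


events = [
    ("select", "name", "J0hn Smith"),   # invented misrecognition example
    ("correct",),
    ("confirm",),
    ("select", "company", "Acme"),
    ("confirm",),
    ("done",),
]
print(run_item_selection(events, correct=lambda text: text.replace("0", "o")))
```

As in the described flow, a corrected item is registered only after the confirm key follows the correction.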

FIG. 25A is a diagram illustrating a process of correcting misrecognized character data for each selected item according to an embodiment of the present invention.

The error correction procedure performed in step 431 of FIG. 24A is now described with reference to FIG. 25A. When the correction key is input, the controller 101, in step 511, displays the misrecognized item and its character data in the third display area 73 of the display unit 115, as shown in FIG. 28A. When the character data to be corrected in the first display area 71 of the display unit 115 is clicked with the stylus pen in the state shown in FIG. 28A, the controller 101 detects this in step 513 and, in step 515, displays the character data to be corrected as shown in FIG. 28B.

In the first embodiment of the present invention, misrecognized character data can be corrected in the following two ways. When a misrecognized character is designated as shown in FIG. 28B, the controller 101 displays candidate characters for correcting the misrecognized character in the third display area 73 of the display unit 115, displays in the second display area 75 a recognition window in which handwritten characters can be entered to correct the misrecognized character, and displays in the fourth display area 77 a soft keypad that can generate key data for correcting the character. The user can therefore correct the character either by selecting the desired character from the candidate characters displayed in the third display area 73 or by handwriting the desired character in the second display area 75. In addition to the recognition window for handwritten input, a soft keypad may be displayed, and the misrecognized character may be corrected by analyzing the key data generated through the soft keypad.

Accordingly, when one of the candidate characters displayed in the third display area 73 is selected with the stylus pen while the misrecognized character is displayed as shown in FIG. 28B, the controller 101 detects this in step 517 and, in step 519, replaces the misrecognized character displayed in the first display area with the selected candidate character. When a handwritten character is entered in the recognition window of the second display area 75 with the stylus pen while the misrecognized character is displayed as shown in FIG. 28B, the controller 101 detects this in step 521 and, in step 523, drives the handwritten character recognizer of the character recognition unit 123. In step 525, the controller 101 replaces the misrecognized character data with the character data recognized by the character recognition unit 123. Likewise, when key data is generated through the soft keypad of the fourth display area while the misrecognized character is displayed as shown in FIG. 28B, the controller 101 detects this in step 521 and, in step 523, drives the soft key recognition module of the character recognition unit 123. In step 525, the controller 101 replaces the misrecognized character data with the character data recognized by the character recognition unit 123.

When a delete key is input, the controller 101 detects it in step 527 and, in step 529, deletes the misrecognized character selected in step 513. When an add key is input, the controller 101 detects it in step 531 and, in step 533, determines the position at which character data is to be added (inserted). The insertion position may be before or after the character selected in step 513. In step 535, the controller 101 then adds (inserts) a character at the determined position through the candidate character selection or handwritten character input procedure.
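The three editing operations described above (replacing the selected character, deleting it, and inserting a character before or after it) reduce to simple string operations. The sketch below uses hypothetical function names and treats the selected character as an index into the recognized text; how the replacement character is obtained (candidate list, handwriting recognition, or soft keypad) is outside its scope.

```python
def replace_char(text, index, new_char):
    """Replace the misrecognized character at `index` with `new_char`."""
    return text[:index] + new_char + text[index + 1:]


def delete_char(text, index):
    """Delete the selected character at `index`."""
    return text[:index] + text[index + 1:]


def insert_char(text, index, new_char, after=False):
    """Insert `new_char` before (default) or after the selected character."""
    position = index + 1 if after else index
    return text[:position] + new_char + text[position:]


print(replace_char("c0m", 1, "o"))            # com
print(delete_char("comm", 3))                 # com
print(insert_char("cm", 1, "o"))              # com
print(insert_char("cm", 0, "o", after=True))  # com
```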

After a misrecognized character has been corrected by selecting a candidate character or entering a handwritten character, or after the selected character has been deleted or a character added, if the user selects another character of the selected item to correct, the controller 101 detects this in step 527 and returns to step 515 to repeat the above operations.

By repeating these operations, the controller 101 corrects the misrecognized character data of the selected item. When the correction completion key is then input, the controller 101 detects it in step 529, ends the error correction procedure for the selected item, and returns to step 421 of FIG. 24A.

FIGS. 28A and 28B illustrate correction of character data using both candidate characters and handwritten character recognition. However, misrecognized characters can also be corrected by handwritten character recognition alone, without candidate characters. FIG. 28D illustrates a method of correcting character data by entering handwritten characters or soft keys without using candidate characters.

In the method of FIGS. 24A and 25A, an item is selected; if the character data of the selected item contains no errors, the item and its character data are registered, and if it contains errors, the erroneous character data is corrected and the corrected character data is then registered together with the item.

In this method, as shown in FIG. 27B, a sentence in the first display area 71 is first selected with the stylus pen, and the item corresponding to the selected sentence is then selected with the stylus pen in the second display area 75. The selected item and the corresponding sentence are displayed in the third display area 73. If the item and sentence displayed in the third display area 73 are correct, the confirm key is clicked with the stylus pen as shown in FIG. 27B, and the item and sentence displayed in the third display area 73 are registered. If the sentence displayed in the third display area 73 contains an error, the correction key is clicked with the stylus pen as shown in FIG. 28A. Then, as shown in FIG. 28B, when the erroneous character displayed in the first display area 71 is clicked with the stylus pen, the clicked character is displayed enlarged, and candidate characters for the misrecognized character are displayed in the third display area 73. A recognition window for entering handwritten characters is displayed in the second display area 75, and a soft keypad is displayed in the fourth display area 77. In this state, the misrecognized character is corrected by selecting a candidate character displayed in the third display area 73, by entering the correct character by hand in the recognition window of the second display area 75, or by entering character key data through the soft keypad of the fourth display area 77. To delete or add a character, the delete key or the add key is input. If the selected item contains other misrecognized characters, the above process is repeated. When correction is complete, the correction completion key is clicked with the stylus pen; the display then returns to the state of FIG. 27A so that the next item can be selected.

The above description covers correcting erroneous characters through handwritten character input, candidate character selection, and the soft keypad together. However, correction may also be implemented with handwritten character input alone or with the soft keypad alone. It may also be implemented by combining candidate character selection with handwritten character recognition, or by combining candidate character selection with the soft keypad.

FIGS. 24B and 25B are diagrams illustrating another item selection and error correction procedure of the first embodiment of the present invention.

The character recognition and item selection operations are now described with reference to FIG. 24B. The character recognition and item selection process of FIG. 24B follows the same procedure as FIG. 24A, except that when the correction key is input, the error correction process is not performed immediately; the item is merely marked as containing an error. That is, when the correction key is generated during item selection, the controller 101 detects it in step 429, indicates in step 450 that an error has occurred in the recognized character data of the item, and returns to step 421. Apart from this handling of the correction key, the operation of FIG. 24B is the same as that of FIG. 24A. Thus, when character recognition and item selection are performed according to FIG. 24B, an item whose character data contains a recognition error is marked as erroneous and the procedure returns, while for an item without recognition errors the item and its corresponding character data are registered.

It is also possible to select only the items after character recognition, without performing the confirm and correction operations during item selection. That is, the desired items are selected from the recognized document without confirmation or correction, and the character data of the selected items is then checked during the error correction process so that errors are corrected in a single batch.

FIG. 25B is a diagram illustrating a process of correcting the character data of erroneous items after character recognition and item selection have been performed as in FIG. 24B.

Referring to FIG. 25B, when the correction key is input, the controller 101 detects it in step 551 and, in step 553, displays the misrecognized items in the second display area 75 of the display unit 115 and the character data of those items in the first display area 71. When the character data to be corrected in the first display area 71 of the display unit 115 is clicked with the stylus pen in the state shown in FIG. 28A, the controller 101 detects this in step 513 and, in step 515, displays the character data to be corrected as shown in FIG. 28B. When the user then clicks, with the stylus pen, the item whose misrecognized character data is to be corrected, the controller 101 detects this in step 555 and proceeds to step 557 to perform the operation of FIG. 25A.

In FIG. 25A, the misrecognized character data within the character data of the selected item is corrected. After this error correction has been performed and the correction is complete, the controller 101 stores the item and the corrected character data in step 559. If the user selects the next item after the character data of the current item has been corrected, the controller 101 detects this in step 561 and returns to step 557 to repeat the correction of misrecognized character data for the newly selected item. In this way, the items containing misrecognized character data are selected one after another and their character data is corrected. When the character data of all items has been corrected, the user generates the correction completion key through the input unit 113. The controller 101 detects this in step 561 and, in step 563, displays on the display unit 115 and stores the corrected items and the corrected character data corresponding to each item.
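The deferred variant of FIGS. 24B and 25B can be sketched as two passes: item selection that merely flags erroneous items, followed by a batch pass that corrects the flagged items and stores them. The data layout (tuples with an error flag, a dictionary of corrected texts) and the function names below are assumptions for illustration, and the sample strings are invented.

```python
def select_items(selections):
    """First pass: register clean items and flag erroneous ones.

    selections -- list of (item, text, has_error) tuples (assumed layout)
    Returns (registered, flagged) dictionaries mapping item -> text.
    """
    registered, flagged = {}, {}
    for item, text, has_error in selections:
        if has_error:
            flagged[item] = text      # only marked; not corrected yet
        else:
            registered[item] = text
    return registered, flagged


def correct_flagged(registered, flagged, corrections):
    """Second pass: apply corrections to flagged items and store them.

    corrections -- dict mapping item -> corrected text; a flagged item
    without an entry keeps its recognized text.
    """
    result = dict(registered)
    for item, text in flagged.items():
        result[item] = corrections.get(item, text)
    return result


registered, flagged = select_items([
    ("name", "John Smith", False),
    ("e-mail address", "john@examp1e.com", True),  # invented misrecognition
])
print(correct_flagged(registered, flagged,
                      {"e-mail address": "john@example.com"}))
```

Separating the two passes lets the user work through item selection without interruption and handle every misrecognized item in one correction session.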

When the character recognition, item selection, and error correction procedures have finished, all the desired information printed on the business card has been entered. The character data of the selected items is then displayed as shown in FIG. 29A. When the user clicks the completion key with the stylus pen in this state, the controller 101 detects that business card recognition has finished and displays the items and their corresponding character data on a single screen of the display unit 115, as shown in FIG. 29B. The data displayed as in FIG. 29B is stored in the database 131. That is, once input, recognition, and correction are all complete, the recognized data of the selected items is stored in the desired area of the database. The area of the database 131 may be any of various spaces such as a phone book, a memo pad, or another application. When all the desired data has been stored, the program is terminated.

In the second embodiment of the present invention, errors are corrected item by item during document recognition, and a speech recognition technique is used for document recognition and error correction.

FIG. 30 is a diagram illustrating the operation procedure according to the second embodiment of the present invention.

Referring to FIG. 30, the controller 101 performs a document image capture operation in step 200. The image captured by the camera 107 is processed by the image processor 109, converted into digital data, and displayed on the display unit 115. When a photograph command is issued while the captured image is displayed on the display unit 115, the controller 101 displays the image currently shown on the display unit 115 as a still image and stores it in the image memory area of the memory 103. The image displayed on the display unit 115 may be a moving image, or it may be character image data such as a business card. Step 200 may also use a stored image or an incoming image, as described with reference to FIGS. 24A and 24B.

In this state, the user of the portable terminal inputs, through the input unit 113, the document recognition key corresponding to the currently displayed document. The controller 101 then drives the preprocessor 121 in step 210 to preprocess the document image, and in step 220 the character images of the preprocessed document image are recognized. These operations can be performed in the same way as described with reference to FIGS. 24A and 24B. The character recognition unit 123 recognizes the character images in the image displayed on the display unit 115 and converts them into character data. The controller 101 then displays the character data recognized by the character recognition unit 123 in the first display area 71 of the display unit 115, and displays item information corresponding to the type of document key that was input in the second display area 75 of the display unit 115.

When the user then selects recognized character data displayed in the first display area 71 of the display unit 115 and an item displayed in the second display area 75, the controller 101 displays the selected character data and item in the third display area 73 of the display unit 115 in step 230. The item can be selected either by choosing a displayed item through the input unit 113 or by driving the voice recognition unit 129 and selecting the item by voice.

If the character data of the selected item contains an error after the item has been selected, the character data of that item is corrected. The error can be corrected either by selecting the erroneous character through the input unit 113 or by driving the voice recognition unit 129 and correcting the erroneous character data by voice. When error correction is requested in either way, the controller 101 detects the request in step 241 and proceeds to step 240 to correct the erroneous characters among the recognized character data.

FIG. 31 is a diagram illustrating the document capture procedure performed in step 200 of FIG. 30, and FIGS. 26A to 26E show the document images displayed on the display unit 115 during the document capture procedure. The operation of FIG. 31 can be implemented in the same way as that of FIG. 3.

In the document capture procedure, the user places the document to be recognized at a suitable position and starts capturing it with the camera 107 of the portable terminal. In step 651, the controller 101 displays the captured document image on the display unit 115 as an image preview, as shown in FIGS. 26A and 26B. When the user presses a camera adjustment key on the key input unit 105 (the input unit 113 can also be used), the controller 101 detects this in step 653 and controls the camera 107. The adjustment of the camera 107 may cover exposure time and distance. The document image shown on the display unit 115, as in FIG. 26A, is then displayed with its focus and distance finely adjusted according to the exposure time and distance adjustment of the camera 107. When the user presses the capture key of the input unit 113 with the stylus pen in this state, the controller 101 detects the photograph command in step 655 and displays the document image at the moment the capture key was pressed as a still image on the display unit 115.

The controller 101 then displays the captured document image in step 659. The document image displayed on the display unit 115 at this point is as shown in FIG. 26C. If the document image displayed as in FIG. 26C is satisfactory, the user presses the save key displayed on the input unit 113 with the stylus pen. When the save key is pressed, the controller 101 detects this in step 661 and stores the displayed document image, together with a name, in the image memory area of the memory 103. The document image displayed on the display unit 115 at this point is as shown in FIG. 26E.

When the user clicks the business card recognition key in this state, the controller 101 detects this in step 663 and proceeds to the document recognition process of step 220. Otherwise, it proceeds to step 665, stores the currently displayed document image, and ends the procedure.

As described above, in step 200 of capturing the document, the user inputs the desired image through the camera and captures a sharp document image by finely adjusting the camera. If the adjusted image is satisfactory, the terminal confirms whether characters should be extracted from the input image by character recognition and stored as character data (text), or whether the image should simply be stored as a photograph. If the user of the portable terminal requests character recognition, the preprocessing and document recognition procedures of steps 210 and 220 are performed.

FIG. 32 is a diagram illustrating the preprocessing and recognition of the document image and the item selection for the recognized character data performed in steps 210 to 230 of FIG. 30, and FIGS. 27A and 27B show the states of the document recognition and item selection processes.

Referring to FIG. 32, before the business card recognition key is pressed, the controller 101 displays the stored business card image on the display unit 115 as shown in FIG. 26E. When the user presses the business card recognition key of the input unit 113 in this state, the controller 101 detects this and, in step 751, drives the preprocessor 121 to preprocess the displayed document image. The preprocessing can be performed as described above. The character recognition unit 123 then converts the displayed business card image of FIG. 26E into character data (text), and the controller 101 displays the converted character data on the display unit 115 as shown in FIG. 27A. That is, once the business card image has been converted into character data, the controller 101 displays the character data of the business card image in the first display area 71 of the display unit 115, displays the item selection in the third display area 73, and displays the items available for storage in the second display area 75, as shown in FIG. 27A.

While the recognized character data is displayed as in FIG. 27A, the user selects character data (a line of text) in the first display area 71 with the stylus pen and selects a storage item in the second display area 75, as shown in FIG. 27B. The controller 101 detects this in step 755 and, in step 757, displays the selected item and its corresponding character data in the third display area 73 of the display unit 115, as shown in FIG. 27B. The storage item in step 757 can also be selected by voice. In that case, the user of the portable terminal selects the voice recognition mode through the input unit 113 or the key input unit 105 and then speaks the desired storage item.

If the correction key is pressed on the input unit 113 while the character data is displayed in this way, the controller 101 detects this in step 759 and proceeds to step 761 to perform the error correction process. If the correction key is not pressed, the controller 101 checks whether the selection key for the next item has been pressed. If an item selection is input, the controller 101 detects this in step 763 and proceeds to step 755 to select the next item. If instead the completion key is detected in step 763, the controller 101 proceeds to step 765, stores the character data of the selected items in the database 131, and ends the document recognition procedure.
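The branching among item selection, the correction key, and the completion key can be pictured as a small event loop. The following is only an illustrative sketch in Python: the `Key` enum, the `(key, payload)` event format, and the function name are assumptions made for this example and are not part of the described device. Step numbers in the comments refer to FIG. 32.

```python
from enum import Enum, auto


class Key(Enum):
    """Hypothetical key events; the names are assumptions for this sketch."""
    SELECT_ITEM = auto()   # user picked a storage item and its text
    CORRECT = auto()       # correction key, payload is the corrected text
    DONE = auto()          # completion key


def run_item_selection(events):
    """Consume (key, payload) events and return the item -> text mapping
    that would be handed to the database at the end."""
    selected = {}
    current = None
    for key, payload in events:
        if key is Key.SELECT_ITEM:                          # steps 755 and 757
            item, text = payload
            selected[item] = text
            current = item
        elif key is Key.CORRECT and current is not None:    # steps 759 and 761
            selected[current] = payload
        elif key is Key.DONE:                               # steps 763 and 765
            break
    return selected


if __name__ == "__main__":
    demo = [
        (Key.SELECT_ITEM, ("name", "H0NG")),
        (Key.CORRECT, "HONG"),
        (Key.SELECT_ITEM, ("company", "Samsung")),
        (Key.DONE, None),
    ]
    print(run_item_selection(demo))
```

In this sketch the correction simply overwrites the text of the most recently selected item; the actual correction modes (candidate list, voice, handwriting, soft keypad) are described with FIGS. 34A to 34D.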

As described above, the document recognition process according to the second embodiment of the present invention drives the character recognizer to convert the characters contained in the input document image into text. After the converted text is displayed on the display unit 115, the user selects the characters to be stored. The area in which the selected characters are to be stored (name, address, company, and so on) is then designated, and the characters are copied into that area. The designation of the storage area within the item selection process is described in detail below with reference to the item selection process of FIG. 33. If any of the recognized characters needs to be corrected, the procedure moves to the error correction process. Otherwise, the terminal asks whether there are further items to store and either repeats the item selection process or moves to the storage process, stores the data in the database, and ends the program.

FIG. 33 is a diagram illustrating the detailed flow of the item selection process performed in steps 755 and 757 of FIG. 32.

Referring to FIG. 33, after character recognition has been performed, the display unit 115 shows the screen of FIG. 27A. In this state, to select the desired storage item, the user can either select an item displayed in the second display area 75 with the stylus pen or select the voice recognition mode through the input unit 113 or the key input unit 105. When the voice recognition mode is selected, the controller 101 detects this in step 771, and the user presses the record button and speaks the desired storage item and data in step 773. In steps 773 and 775, the controller 101 applies the voice signal received through the audio processor 111 to the voice recognition unit 129 and drives the voice recognition unit 129 to recognize the received voice. In step 777, the controller 101 then displays the character data of the item corresponding to the recognized voice signal as shown in FIG. 27B and stores it.

If selection of a storage item with the stylus pen is detected in step 771, the controller 101 displays the storage items as shown in FIG. 27A in step 779. When the desired item is clicked with the stylus pen, the controller 101 displays the selected storage item and character data as shown in FIG. 27B in step 781, and in step 783 stores the character data (text) in the area designated by the selected storage item.

As described above, there are two main ways to select a storage item: by voice recognition and with the stylus pen. To select by voice recognition, the user presses the record button and then pronounces the desired item among the storage items displayed as in FIG. 27A, so that the item is designated through the voice recognition unit 129. If "add item" is selected at this point, the procedure does not move on to the next step; instead, the additional storage item desired by the user is received and added to the storage item table. To select with the stylus pen, the user clicks the desired item among the storage items displayed on the display unit 115 as in FIG. 27A. The two methods do not operate in sequence; the user chooses one of them.

FIG. 33 describes an example in which both the desired item and the character data are selected either by voice recognition or with the stylus pen. However, the storage item may be selected by voice recognition and the character data with the stylus pen, or the storage item may be selected with the stylus pen and the character data by voice recognition.

FIGS. 34A to 34D are diagrams illustrating the operation procedure of step 240 of FIG. 30, in which erroneous character data is corrected for each selected item according to the second embodiment of the present invention.

The error correction procedure of step 240 of FIG. 30 is now described with reference to FIG. 34A. When the desired item is selected, the controller 101 displays the selected item and the character data of that item in the third display area 73 of the display unit 115, as shown in FIG. 28A. If the character data of the selected item has been misrecognized as in FIG. 28A, the user commands a correction by clicking the correction key with the stylus pen or by selecting the voice recognition mode. The controller 101 detects this, obtains from the character recognition unit 123 the candidate characters most similar to the designated character in step 811, and displays them in the third display area 73 of the display unit 115 in step 813. At this point, as shown in FIG. 28B, the controller 101 displays the candidate characters for correcting the misrecognized character in the third display area 73 of the display unit 115, and additionally displays a recognition window for handwritten input in the second display area 75 or a soft keypad in the fourth display area, either of which can be used to correct the misrecognized character. If the desired character is among the candidate characters displayed in the third display area 73 of the display unit 115, the user of the portable terminal clicks that character with the stylus pen. When one of the candidate characters is selected with the stylus pen, the controller 101 detects this in step 815 and, in step 817, replaces the misrecognized character displayed in the first area with the selected candidate character.

If the desired character is not among the candidate characters displayed in the third display area 73, the user of the portable terminal can select the voice recognition mode, use the handwriting recognition window of the second display area 75, or use the soft keypad of the fourth display area. If the user selects the voice recognition mode through the input unit 113 or the key input unit 105, the controller 101 proceeds to step 820 and performs the operation of FIG. 34B. If the user writes the desired character in the handwriting recognition window of the second display area 75, the controller 101 proceeds to step 850 and performs the operation of FIG. 34C.

The correction process using the stylus pen relies on data extracted by the character recognition unit 123. When a character is recognized in the character recognition process handled as in FIG. 33, the character recognition unit 123 takes the character most similar to the input character as the character data of the corresponding item and retains the next most similar characters as candidate characters. In the error correction process of FIG. 34A, the candidate characters for the character the user wants to correct are retrieved from the character recognition unit 123 and displayed in the third display area 73 of the display unit 115. If the desired character is among these candidates, the user of the portable terminal selects it with the stylus pen to make the correction. If it is not, a new character is input through the voice recognition procedure of FIG. 34B, the handwriting recognition procedure of FIG. 34C, or the soft key recognition procedure of FIG. 34D. These transitions take place directly on a single screen: the handwriting recognition area and the soft keypad are always active at the bottom of the display unit 115 of the portable terminal, waiting for the user's choice, and the voice recognition unit 129 is driven when the record button is pressed. The character recognition unit 123 is therefore designed to recognize printed characters, handwritten characters, and soft keys.
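The candidate-character mechanism can be illustrated with a short Python sketch. It assumes, purely for illustration, that the recognizer exposes a similarity score per candidate character; the score dictionary, the function names, and the default of three candidates are hypothetical and are not taken from the patent.

```python
def recognize_with_candidates(scores, k=3):
    """Return the best-scoring character and the next k most similar ones.

    `scores` maps each candidate character to a similarity score
    (higher means more similar); this input format is an assumption.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], ranked[1:1 + k]


def apply_candidate(text, index, candidate):
    """Replace the misrecognized character at `index` with the chosen candidate."""
    return text[:index] + candidate + text[index + 1:]


if __name__ == "__main__":
    # The letter O was misread as the digit 0 in the second position.
    best, candidates = recognize_with_candidates(
        {"0": 0.9, "O": 0.5, "D": 0.3, "Q": 0.2}
    )
    print(best, candidates)                          # recognized character and its candidates
    print(apply_candidate("K0REA", 1, candidates[0]))
```

Keeping the runner-up characters at recognition time means a correction needs no second recognition pass; the user only picks from a list that already exists.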

Referring to FIG. 34B, the voice recognition unit 129 must be configured differently for each language. Input to the voice recognition unit 129 is made character by character rather than word by word. Depending on the language, individual characters may form a word directly, as in English, or several graphemes may combine into a single character, as in Hangul. For example, the English word "KOREA" consists of five characters, whereas the Korean word "한국" consists of two characters of three phonemes each, six phonemes in total. For a language such as Korean, therefore, unless the voice recognition unit 129 is an unlimited speech recognition engine, the desired character has to be input phoneme by phoneme. Accordingly, in the voice recognition mode the language mode is selected first and, for Hangul, it is then determined whether an unlimited engine is available before speech recognition is attempted.
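The difference in unit counts between "KOREA" and "한국" can be checked with a few lines of Python. This sketch uses Unicode canonical decomposition (NFD) as a stand-in for the split into phonemes; that choice, and the function name, are assumptions for illustration and not something the patent specifies.

```python
import unicodedata


def count_input_units(word):
    """Return (number of characters, number of elementary units).

    For precomposed Hangul syllables, NFD splits each syllable into its
    conjoining jamo; plain Latin letters are left unchanged.
    """
    decomposed = unicodedata.normalize("NFD", word)
    return len(word), len(decomposed)


if __name__ == "__main__":
    print(count_input_units("KOREA"))
    print(count_input_units("한국"))
```

For "KOREA" both counts are five, while "한국" has two characters but six elementary units, which is why a limited recognizer needs three spoken inputs per Hangul character.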

Accordingly, during voice recognition the controller 101 determines in step 821 whether the character to be corrected is Korean or English. In the case of English, the user of the portable terminal selects the English mode, presses the record button, and speaks the character data for the correction. The controller 101 detects in step 835 that English character data is being input by voice, and in step 837 drives the voice recognition unit 129, which recognizes the spoken English character data output from the audio processor 111 and passes it to the controller 101. The controller 101 then corrects the character data of the selected item with the recognized English character data and determines in step 839 whether further character data is to be corrected. If the character data of the selected item contains more characters to correct, the controller 101 returns to step 835 and repeats the above process; if not, it proceeds to step 251 of FIG. 30.

If the character to be corrected is determined in step 821 to be Korean, the controller 101 proceeds to step 823 and checks whether the voice recognition unit 129 is an unlimited speech recognition engine. If it is, the controller 101 proceeds to step 835 and recognizes the spoken Hangul characters by performing steps 835 to 839 as described above. In this case, Hangul speech recognition is performed character by character.
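The decisions of steps 821 and 823 reduce to a small dispatch on language and engine type. The Python sketch below is illustrative only: the function name, the string labels, and the boolean flag are assumptions, and the phoneme branch corresponds to step 825 of the same flow.

```python
def correction_unit(language, unlimited_engine):
    """Return the unit in which spoken corrections are entered.

    `language` is "english" or "korean"; `unlimited_engine` tells whether
    the recognizer is an unlimited speech recognition engine.
    """
    if language == "english":
        return "character"              # step 821 -> steps 835 to 839
    if language == "korean":
        if unlimited_engine:
            return "character"          # step 823 -> steps 835 to 839
        return "phoneme"                # step 823 -> step 825
    raise ValueError(f"unsupported language: {language}")


if __name__ == "__main__":
    for lang, unlimited in [("english", False), ("korean", True), ("korean", False)]:
        print(lang, unlimited, correction_unit(lang, unlimited))
```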

If the voice recognition unit 129 is not an unlimited speech recognition engine, the controller 101 proceeds to step 825 and performs Hangul speech recognition phoneme by phoneme. In this case, to correct the character data of the selected item, the user presses the record button, speaks the phonemes that make up the Hangul character one after another, and presses the completion button when all the phonemes of that character have been spoken. When the phonemes of the character are input by voice in this way, the controller 101 receives them in step 825 and, in step 827, drives the voice recognition unit 129 to recognize them. After the phonemes for the position to be corrected have been recognized, the controller 101 detects in step 829 that the phoneme input for the character is complete, combines the phonemes into a complete character in step 831, and corrects the character data of the selected item with it. The controller 101 then determines in step 833 whether further character data is to be corrected. If the character data of the selected item contains more characters to correct, the controller 101 returns to step 825 and repeats the above process; if not, it proceeds to step 251 of FIG. 30.
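Step 831, which combines the recognized phonemes into one character, can be sketched with the standard Unicode Hangul syllable composition formula, 0xAC00 + (initial × 21 + medial) × 28 + final. The patent does not say how the combination is implemented, so the use of Unicode, the jamo tables, and the function names below are assumptions for illustration.

```python
# Jamo tables in Unicode syllable-composition order (19 initials,
# 21 medials, 27 finals plus "no final").
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose_syllable(initial, medial, final=""):
    """Combine an initial, a medial, and an optional final into one syllable."""
    l = CHOSEONG.index(initial)
    v = JUNGSEONG.index(medial)
    t = JONGSEONG.index(final)
    return chr(0xAC00 + (l * 21 + v) * 28 + t)


def correct_character(text, position, phonemes):
    """Replace the character at `position` with the syllable built from `phonemes`."""
    return text[:position] + compose_syllable(*phonemes) + text[position + 1:]


if __name__ == "__main__":
    print(compose_syllable("ㅎ", "ㅏ", "ㄴ"))
    print(correct_character("한굴", 1, ("ㄱ", "ㅜ", "ㄱ")))
```

Here the three spoken phonemes for the second character replace a misrecognized syllable in one step, mirroring the "input phonemes, press complete, combine" sequence of steps 825 to 831.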

As described above, error correction by voice recognition can be performed when the error could not be corrected with the stylus pen. Voice-based error correction differs mainly according to whether the character to be corrected is composed of at least two constituent elements (for example Hangul, where a character consists of several phonemes: consonants and vowels arranged as initial, medial, and final sounds) or not (for example English, where alphabetic characters are simply placed one after another). Current portable terminals (for example PDAs) cannot accommodate an unlimited speech recognizer, which requires a very large capacity, so Hangul cannot be recognized unless the characters have been registered in advance. For Hangul, therefore, the range that can be corrected at one time is divided into phoneme units (initial, medial, and final), and the character is corrected one part at a time. If an unlimited speech recognizer is implemented in portable terminals in the future, this process will be unnecessary and each character can be corrected directly. English letters and special characters, on the other hand, can be corrected one character at a time: the user selects the letter or special character to be corrected, presses the record button, and pronounces the desired character, which is then substituted through the speech recognizer. When the correction process ends, the procedure returns to the storage item selection process of FIG. 30.

The error correction operation by handwritten character recognition is now described with reference to FIG. 34C. In step 851, the controller 101 displays the misrecognized character as shown in FIG. 28B. When a handwritten character is entered with the stylus pen in the recognition window of the second display area 75, the controller 101 detects this in step 853 and, in step 855, drives the character recognition unit 123 to recognize the handwritten character. The controller 101 then replaces the misrecognized character data of the selected item with the character data recognized by the character recognition unit 123. In step 857, the controller 101 determines whether further character data is to be corrected. If the character data of the selected item contains more characters to correct, the controller 101 returns to step 853 and repeats the above process; if not, it proceeds to step 251 of FIG. 30.

As described above, error correction by handwriting recognition is performed through the handwriting recognition window loaded in the second display area 75 of the display unit 115. If the user could not obtain the desired character in the error correction process of FIG. 34A, the user writes the desired character directly in the handwriting recognition window to correct it.
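The loop of steps 853 to 857 can be sketched roughly as follows. This is only an illustration under stated assumptions: the handwriting recognizer is passed in as a plain callable, each input is modeled as a pair of (character position, stroke data), and the name `correct_by_handwriting` is hypothetical rather than taken from the disclosure.

```python
from typing import Callable, Iterable, Tuple


def correct_by_handwriting(
    text: str,
    inputs: Iterable[Tuple[int, str]],
    recognize: Callable[[str], str],
) -> str:
    """Replace misrecognized characters of one item with recognized handwriting.

    `inputs` yields (position, strokes) pairs; `recognize` stands in for the
    handwriting recognizer of the character recognition unit (assumption).
    """
    chars = list(text)
    for position, strokes in inputs:            # step 853: input detected
        chars[position] = recognize(strokes)    # step 855: recognize, then replace
    # step 857: no further input means no more characters to correct
    return "".join(chars)
```

With a stub recognizer that returns its input unchanged, `correct_by_handwriting("5amsung", [(0, "S")], lambda strokes: strokes)` returns `"Samsung"`.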

Referring to FIG. 34D, the error correction operation by soft key recognition proceeds as follows. In step 871, the control unit 101 displays the misrecognized character as shown in 8b and displays a soft keypad in the fourth display area 77. When soft key data entered through the soft keypad is received, the control unit 101 detects this in step 873 and, in step 875, drives the soft key recognizer of the character recognition unit 123 to recognize the characters corresponding to the entered soft keys. The control unit 101 then replaces the misrecognized character data of the selected item with the character data recognized by the soft key recognizer of the character recognition unit 123. Next, in step 877, the control unit 101 determines whether further character data is to be corrected. If the character data of the selected item contains more characters to correct, the control unit 101 returns to step 853 and repeats the above process; if not, it proceeds to step 251 of FIG. 30.

As described above, error correction by soft key recognition is performed through the soft keypad loaded in the fourth display area 77 of the display unit 115. If the user could not obtain the desired character in the error correction process of FIG. 34A, the user enters the desired character directly with the soft keys of the soft keypad to correct it.

When the process of selecting items of the recognized characters and the error correction process for the character data of the selected items are finished, the user of the portable terminal device enters the completion key through the input unit 113. The control unit 101 detects this in step 251 and stores the document recognition result in the database 131. The database registers the items selected from the document and the character data of those items at the address designated by the user.

Once the input, recognition, and correction processes described above are all finished, the data is stored in the database of the desired area. The database area may be any of various spaces, such as a phone book, a notepad, or another application. When all the desired data has been stored, the program is terminated.

In the second embodiment of the present invention described above, storage items of the recognized document are selected after document recognition; if the character data of a selected storage item contains an error, the erroneous character is corrected before the next storage item is selected. Thus, in the course of storing the recognized document item by item, misrecognized characters are corrected and stored as well. In addition, in the second embodiment, a voice recognizer can be used to select the item to be stored or to correct a misrecognized character.

The second embodiment also describes an example in which, when correcting an error, a candidate character is first selected to correct the erroneous character, and when the error cannot be corrected using the candidate characters, it is corrected through voice recognition or through handwritten character and soft key recognition. However, it is also possible to implement only some of the available methods for correcting an erroneous character: candidate character selection, voice input by voice recognition, handwritten character input, and character input by soft keypad. That is, the error correction method may be implemented by directly entering voice, handwritten characters, or soft keys without selecting a candidate character. Furthermore, although the second embodiment describes correcting erroneous characters by candidate character selection, voice recognition, and handwriting recognition, it can also be implemented with only candidate character selection and voice recognition, only voice recognition and handwriting recognition, or only voice recognition and soft key recognition, and it can also be implemented using only voice recognition, only soft key recognition, or only handwritten character recognition.
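The ordering described here, in which candidate selection is tried first and other input methods are used only if it fails, and in which any subset of methods may be configured, can be sketched as a simple fallback chain. This is a minimal illustration, not the disclosed implementation: every name below (`correct_character`, `make_candidate_corrector`, `make_direct_input_corrector`) is hypothetical, and the recognizers are replaced by stubs that return a fixed character.

```python
from typing import Callable, Dict, List, Optional, Sequence

# A corrector takes the misrecognized character and returns a replacement,
# or None when it cannot supply one (assumed interface for this sketch).
Corrector = Callable[[str], Optional[str]]


def correct_character(wrong: str, correctors: Sequence[Corrector]) -> Optional[str]:
    """Try each configured correction method in order; the first result wins."""
    for corrector in correctors:
        result = corrector(wrong)
        if result is not None:
            return result
    return None  # no configured method produced a correction


def make_candidate_corrector(candidates: Dict[str, List[str]], wanted: str) -> Corrector:
    """Candidate selection: succeeds only if the wanted character is offered."""
    def corrector(wrong: str) -> Optional[str]:
        return wanted if wanted in candidates.get(wrong, []) else None
    return corrector


def make_direct_input_corrector(entered: str) -> Corrector:
    """Stub for voice, handwriting, or soft key input that yields `entered`."""
    def corrector(wrong: str) -> Optional[str]:
        return entered
    return corrector
```

Passing `[candidate, handwriting]` reproduces the order of the embodiment, while passing only `[handwriting]` corresponds to the variant that skips candidate selection.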

In addition, although the embodiments of the present invention are described on the assumption that the document is a business card, the invention is also applicable to the recognition of documents other than business cards.

As described above, when document information is registered in a device such as a portable terminal, the character data can be registered through character recognition and/or voice recognition after the image of the document is scanned. This minimizes manipulation of the input device of the portable terminal, and characters misrecognized during character or voice recognition can be corrected easily. Moreover, because document information can be entered through the character and voice recognition methods, a large amount of document information can be entered efficiently.

Claims

An apparatus for recognizing a character image of a document,

An input unit for generating commands for performing a recognition mode, a correction mode, and a storage mode;

A pre-processing unit configured to analyze pixels in the document image in the recognition mode, classify them into letter blocks and background blocks, and then binarize pixels of the letter blocks to generate preprocessed document images;

A character recognition unit for recognizing preprocessed document images and converting them into character data;

A recognition error processing unit for correcting misrecognized character data selected by the input unit in the correction mode to character data output from the input unit;

A database for storing the recognized text data in the storage mode;

And a display unit for displaying document images and text data generated during the mode execution.

The apparatus of claim 1, wherein the preprocessing unit comprises:

A subject tilt correction unit that classifies stripes having a length greater than or equal to a predetermined size in the document image, calculates a tilt angle of the classified stripes to determine a tilt of the subject, determines a rotation angle corresponding to the measured tilt, and corrects the tilt of the subject;

An image area expansion unit that classifies the tilt-corrected document image into letter blocks and background blocks, extracts a text area by searching the positions of the classified letter blocks, and expands the image of the extracted text area to the size of the input document image;

And an image binarization unit that compares pixels of the letter blocks of the document image with a pixel reference value to binarize them to the brightness values of character pixels and background pixels, and binarizes the pixels of the background blocks to the brightness value of the background pixels.

The apparatus of claim 2, wherein the preprocessing unit further comprises an image blurring determination unit that classifies the input document image into letter blocks and background blocks, calculates an average energy ratio of the classified letter blocks, and compares the average energy ratio with a predetermined reference value to determine whether the image is blurred.

The apparatus of claim 2, wherein the preprocessing unit further comprises a noise removing unit that removes noise from the image whose image region has been expanded and outputs the result to the image binarization unit.

The apparatus of claim 1, wherein the apparatus for recognizing the document image further comprises a camera that photographs a document and generates a document image.

The apparatus of claim 5, further comprising a voice recognition unit that generates an input signal for selecting the item in the storage mode and an input signal for selecting and correcting misrecognized character data in the correction mode, and that converts the input voice signal into character data.

The apparatus of claim 5, wherein the character recognition unit further comprises a handwriting recognition unit that recognizes a handwritten character image received in the correction mode and converts it into corrected character data for the misrecognized character data.

The apparatus of claim 5, wherein the camera is capable of adjusting distance and exposure.

An apparatus for storing document information using a camera,

An input unit for generating commands to perform a photographing mode, a recognition mode, a correction mode, and a storage mode;

A display unit including a first display area for displaying an input document image and the recognized text data, a second display area for displaying items, a third display area for displaying text data of a selected item, and an area for displaying a mode menu;

A camera driven in the photographing mode to photograph the document image;

A character recognition unit for recognizing a preprocessed document image and converting it into the character data;

A database for storing the recognized text data in the storage mode;

And a display unit for displaying document images and text data generated when the respective modes are performed.

The apparatus of claim 9, wherein the preprocessing unit comprises

A subject tilt correction unit that classifies stripes having a length greater than or equal to a predetermined size in the document image, calculates a tilt angle of the classified stripes to determine a tilt of the subject, determines a rotation angle corresponding to the measured tilt, and corrects the tilt of the subject,

The apparatus of claim 10, wherein the preprocessing unit further comprises an image blurring determination unit that classifies the input document image into letter blocks and background blocks, calculates an average energy ratio of the classified letter blocks, and compares the average energy ratio with a predetermined reference value to determine whether the image is blurred.

The apparatus of claim 11, wherein the preprocessing unit further comprises a noise removing unit that removes noise from the image whose image region has been expanded and outputs the result to the image binarization unit.

The apparatus of claim 12, further comprising a voice recognition unit that generates an input signal for selecting the item in the storage mode and an input signal for selecting and correcting misrecognized character data in the correction mode, and that converts the input voice signal into character data.

The apparatus of claim 12, wherein the character recognition unit further comprises a handwriting recognition unit that recognizes a handwritten character image received in the correction mode and converts it into corrected character data for the misrecognized character data.

A method for recognizing a text image included in a document image in a terminal device,

Specifying a mode for document recognition,

In the document recognition mode, the pixels in the document image are analyzed and classified into a letter block and a background block, and the pixels of the letter block are binarized to generate a preprocessed document image.

Recognizing the preprocessed document image and converting it into text data;

Selecting a character data that is misrecognized in the correction mode, and correcting the misrecognized character data with the input character data;

And storing the recognized text data in a storage mode.

The method of claim 15, wherein the preprocessing comprises:

A process of classifying stripes having a length greater than or equal to a predetermined size in the document image, calculating a tilt angle of the classified stripes to determine a tilt of the subject, determining a rotation angle corresponding to the measured tilt, and correcting the tilt of the subject;

A process of classifying the tilt-corrected document image into letter blocks and background blocks, extracting a text area by searching the positions of the classified letter blocks, and expanding the image of the extracted text area to the size of the input document image;

And a process of comparing the pixels of the letter blocks of the document image with a pixel reference value to binarize them to the brightness values of character pixels and background pixels, and binarizing the pixels of the background blocks to the brightness value of the background pixels.

The method of claim 16, wherein the preprocessing further comprises classifying the input document image into letter blocks and background blocks, calculating an average energy ratio of the classified letter blocks, comparing the average energy ratio with a predetermined reference value to determine whether the image is blurred, and performing the preprocessing if the image is not blurred.

The method of claim 17, wherein the preprocessing further comprises removing noise from the image whose image region has been expanded and outputting the result for image binarization.

The method of claim 18, wherein the error correction process,

Displaying candidate characters corresponding to character data misrecognized in the correction mode;

And modifying a character selected from the displayed candidate characters into character data of the misrecognized character.

The method of claim 18, wherein the error correction process,

Displaying a recognition window for inputting handwritten characters when the correction mode is requested;

Recognizing a handwritten character when the handwritten character is input to the handwriting recognition window;

And modifying the recognized character into character data of the misrecognized character.

The method of claim 18, wherein the error correction process,

Displaying candidate characters of a character recognized incorrectly in the correction mode;

Modifying the selected character among the displayed candidate characters into character data of the misrecognized character;

Displaying a handwriting recognition window when the character data to be used for correction is not among the displayed candidate characters;

The method of claim 18, wherein the error correction process,

Driving a voice recognition unit in the correction mode;

Converting the input voice signal into text data by recognizing the voice recognition unit;

And converting the converted character data into character data of the misrecognized character.

A method for recognizing a text image included in a document image in a device having a display unit with a first display area for displaying text images and text data, a second display area for displaying items, a third display area for displaying text data of a selected item, and a menu area for displaying a mode menu, the method comprising:

Displaying a document image taken by the camera,

Analyzing the pixels in the document image in the document recognition mode, dividing them into a letter block and a background block, and binarizing the pixels of the letter block to generate a preprocessed document image and display the same in the first display area;

Recognizing and converting the preprocessed document image into text data and displaying the text data on the first display area, and displaying the items of the document data on the second display area;

Selecting an item to be stored from among the displayed items, selecting and storing text data of the selected item, and

And repeating the process and storing the selected item and the corresponding text data.

The method of claim 23, further comprising an error correction process of correcting misrecognized text data after selecting the item and text data.

The error correction process,

Displaying candidate characters of characters that are incorrectly recognized in the third display area when an error correction request is made;

The error correction process,

Displaying a handwriting recognition window on a second display area when the error correction request is made;

The error correction process,

Displaying a handwriting recognition window on the second display area when the text data to be used for correction is not among the candidate characters;

The error correction process,

Driving the voice recognition unit upon the error correction request;

A business card image recognition method for a portable terminal device having a display unit with a first display area for displaying character data of a recognized business card, a second display area for displaying items, a third display area for displaying text data of a selected item, and an area for displaying a mode menu, the method comprising:

Displaying an image of the business card photographed through a camera;

Analyzing pixels in the displayed business card image in a recognition mode, dividing them into a letter block and a background block, and binarizing the pixels of the letter block to generate a preprocessed document image;

Converting the text images of the preprocessed business card into text data, displaying the converted text data in a first display area and displaying items of the business card in a second display area;

Selecting an item to be stored among the displayed items, selecting text data of the selected item and displaying the selected text data in a third display area;

And storing the recognized text data in a storage mode.

The method of claim 28, wherein the storage item,

Comprises a name, a mobile phone number, a company phone number, an e-mail address, a position, and the like.

The method of claim 29, wherein the performing of the modification mode comprises:

Displaying candidate characters of character data misrecognized in the third display area;

The method comprising the step of modifying the misrecognized character to the selected candidate character.

The method of claim 29, wherein the correcting the misrecognized character,

Displaying a handwriting recognition window on the second display area when the correction key is input;

And a process of modifying the recognized character into character data of the misrecognized character.

The method of claim 29, wherein the correcting the misrecognized character,

Driving a voice recognition unit when the correction key is input;

And modifying the converted character data into character data of the misrecognized character.

The method of claim 29, wherein the error correction process,