KR920006874A

KR920006874A - Recognition method of document with predetermined format

Info

Publication number: KR920006874A
Application number: KR1019910015755A
Authority: KR
Inventors: 다꾸야 오까모또; 마사또시 히노; 데쯔오 마찌다; 마사또 데라모또
Original assignee: 가나이 쯔또무; 가부시끼가이샤 히다찌세이사꾸쇼
Priority date: 1990-09-14
Filing date: 1991-09-10
Publication date: 1992-04-28
Also published as: JPH04123185A

Abstract

No content

Description

Recognition method of document with predetermined format

Because this publication discloses only the essential parts, the full text is not included.

FIG. 1 is an overall configuration diagram of a system for realizing the document recognition method of the present invention;

FIG. 12 is a control flowchart of the document recognition processing executed by the processor;

FIGS. 16A and 16B are diagrams showing in detail the recognition processing routine for recognizing the character string of the address field in FIG. 2.

Claims

A document recognition method for recognizing, by a recognition apparatus, a document comprising a plurality of field areas each having a character string written in it, wherein the apparatus includes: a processor for executing image processing; image reading means for reading an image of a document to be recognized; first memory means for storing image data read by the image reading means; second memory means for storing data indicating the position of each field area included in the document; third memory means for storing the order of the field areas on which recognition processing is to be performed; fourth memory means for storing, for the plurality of field areas, a correspondence with an identifier of the dictionary to be used for recognizing the character string in each field area; and fifth memory means for storing a plurality of dictionaries, the dictionaries including word dictionaries each made up of a plurality of word records, each word record consisting of word data and link data for specifying a part of the plurality of word records contained in one or more other word dictionaries related to the word data, the recognition method comprising:
a first step of storing the image data read by the image reading means in the first memory means and selecting one field according to the field order information stored in the third memory means;
a second step of extracting a feature of a character string in the partial image area in the first memory means specified according to the positional information, read from the second memory means, corresponding to the selected field;
a third step of recognizing the character string whose feature was extracted in the second step, using the dictionary in the fifth memory means that has the dictionary identifier, read from the fourth memory means, corresponding to the selected field;
a fourth step of specifying at least one word record corresponding to the recognized character string and adding restriction information to the word records, included in at least one other dictionary, that are designated by the link data defined in the specified word record;
a fifth step of repeating the foregoing steps through the fourth step; and
a sixth step of, when the dictionary used for character string recognition of any field contains word records to which restriction information was added in the fourth step during character string recognition of another field, recognizing the character string in that field using the word records having the restriction information.

A document recognition method for automatically recognizing, by a processing apparatus having a plurality of dictionaries, character strings in a document image having a predetermined format consisting of a plurality of fields in each of which a character string is written, wherein the dictionaries comprise at least two word dictionaries, each composed of a plurality of word data records for recognizing character strings on a word basis, and at least one character dictionary composed of a plurality of data records for recognizing characters on a character basis, and each word data record includes a word code and link information for specifying a part of the word data record group included in at least one other word dictionary having a predetermined relationship with the word indicated by the word code, the document recognition method comprising:
a first step of extracting a character string from one field area of the document image, the field being selected according to previously stored information on the order of the fields to undergo recognition processing;
a second step of recognizing the character string extracted from the document image using one of the plurality of dictionaries, that dictionary being selected, according to previously stored information indicating the correspondence between fields and dictionaries, as the one corresponding to the field processed in the first step; and
a third step of, if the dictionary selected in the second step is one of the word dictionaries, adding identification information for forming a limited dictionary to the part of the word data record group, included in at least one other word dictionary, that is specified by the link information included in the one or more word data records having a word code corresponding to the recognized character string,
wherein the first to third steps are repeated for each field in the document image, and, if the dictionary selected in the second step contains word data records to which the identification information was added in the third step during the recognition processing of a field already executed, the character string extracted from the document image is recognized using the limited dictionary consisting of the word data records to which the identification information was added.
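
The linked-dictionary mechanism recited in the two claims above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the application: the names `WordRecord`, `candidates`, `recognize`, `process_document` and `build_dictionaries`, the state/city sample data, and the position-wise character comparison that stands in for image reading, feature extraction and the actual recognizer. It only shows how recognizing one field can flag linked records in another dictionary so that a later field is matched against a limited dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class WordRecord:
    word: str
    # Link data: other dictionary name -> words in that dictionary related to this word.
    links: dict = field(default_factory=dict)
    # Restriction (identification) information added while recognizing another field.
    restricted: bool = False


def candidates(dictionary):
    """Return only the flagged records if any exist (limited dictionary), else all records."""
    limited = [rec for rec in dictionary.values() if rec.restricted]
    return limited or list(dictionary.values())


def recognize(text, dictionary):
    """Stand-in recognizer (assumption): count position-wise matching characters."""
    def score(rec):
        matches = sum(a == b for a, b in zip(text, rec.word))
        return matches - abs(len(text) - len(rec.word))
    return max(candidates(dictionary), key=score)


def process_document(fields, field_order, field_to_dict, dictionaries):
    """Recognize fields in the stored order, flagging linked records as each field is recognized."""
    results = {}
    for name in field_order:
        dictionary = dictionaries[field_to_dict[name]]
        rec = recognize(fields[name], dictionary)
        results[name] = rec.word
        # Follow the link data and mark the related records in the other dictionaries.
        for other_name, words in rec.links.items():
            for w in words:
                dictionaries[other_name][w].restricted = True
    return results


def build_dictionaries():
    """Hypothetical sample data: each state links to the cities it contains."""
    states = {
        "Texas": WordRecord("Texas", links={"cities": {"Austin", "Dallas"}}),
        "Maine": WordRecord("Maine", links={"cities": {"Augusta", "Bangor"}}),
    }
    cities = {w: WordRecord(w) for w in ("Austin", "Dallas", "Augusta", "Bangor")}
    return {"states": states, "cities": cities}
```

In this toy setup, calling `process_document` with a misread state field `"Texos"` and city field `"Augustn"`, processed in the order state then city, first settles on the state record and flags its two linked city records, so the city field is then compared only against those two records rather than all four.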

The method of claim 2, wherein the second step comprises:
a fourth step of sequentially reading a word code registered in the word dictionary if the selected dictionary is one of the word dictionaries;
a fifth step of sequentially converting the plurality of character codes constituting each word code read in the fourth step into a feature amount for each character, using a feature dictionary for converting an input character code into a feature amount of a character, and thereby obtaining the feature amount of the character string corresponding to the word code;
a sixth step of obtaining the degree of correspondence between the feature amount of the character string extracted from the document image in the first step and the feature amount of the character string obtained from the word code in the fifth step, and storing the obtained degree of correspondence together with the word code; and
a seventh step of repeating the fourth to sixth steps for the plurality of word codes registered in the word dictionary and determining at least one word code to be the recognition result of the character string extracted from the document image according to the magnitude of the degree of correspondence stored for each word in the sixth step.
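
The word-level matching of this claim can be sketched as follows. This is a hedged illustration only: the three-dimensional feature vectors in `FEATURES`, the use of a negative mean Euclidean distance as the "degree of correspondence", the rejection of length mismatches, and the function names are all assumptions, since the published record does not state how features or correspondence are computed.

```python
import math

# Hypothetical feature dictionary: character code -> feature amount (toy 3-D vectors).
FEATURES = {
    "a": (1.0, 0.0, 0.0),
    "b": (0.0, 1.0, 0.0),
    "c": (0.0, 0.0, 1.0),
    "d": (1.0, 1.0, 0.0),
}


def word_features(word):
    """Fifth step: convert each character code of a word into its feature amount."""
    return [FEATURES[ch] for ch in word]


def correspondence(extracted, synthesized):
    """Sixth step (assumed measure): negative mean Euclidean distance per character."""
    if len(extracted) != len(synthesized):
        return -math.inf  # assumption: words of a different length never match
    total = sum(math.dist(e, s) for e, s in zip(extracted, synthesized))
    return -total / len(extracted)


def recognize_word(extracted, word_dictionary, top_n=1):
    """Score every registered (non-empty) word and return the best-matching ones."""
    scores = []
    for word in word_dictionary:                                  # fourth step
        feats = word_features(word)                               # fifth step
        scores.append((correspondence(extracted, feats), word))   # sixth step
    scores.sort(reverse=True)                                     # seventh step
    return [word for _, word in scores[:top_n]]
```

A call such as `recognize_word(extracted, ["abc", "cab", "bad", "dab"], top_n=2)` returns the two registered words whose synthesized feature sequences lie closest to the features extracted from the field, which corresponds to determining "at least one word code" as the recognition result.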

※ Note: This publication is disclosed on the basis of the contents of the original application.