JP2008310531A - Business form identification method, business form identification program and optical character reading system using the business form identification method

Publication number
JP2008310531A
Authority
JP
Japan
Prior art keywords
type
information
size
identification
character string
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2007156844A
Other languages
Japanese (ja)
Other versions
JP5051756B2 (en)
Inventor
Kenji Shibata
憲志 柴田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Information and Telecommunication Engineering Ltd
Original Assignee
Hitachi Computer Peripherals Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Computer Peripherals Co Ltd
Priority to JP2007156844A
Publication of JP2008310531A
Application granted
Publication of JP5051756B2
Legal status: Active

Abstract

PROBLEM TO BE SOLVED: To provide an optical character reading system that identifies business forms at high speed.

SOLUTION: This optical character reading system stores size information, ruled line arrangement pattern information, and title character information for a plurality of business forms in a memory 25, reads image information of the business forms, and identifies the kind of each business form. The system selects and performs as desired: size identification processing, which compares a size 27 of the read image with the size information in the memory 25 to identify the kind of the business form; pattern identification processing, which compares the ruled line arrangement pattern of the read image with the arrangement pattern information in the memory 25 to identify the kind of the business form; and character string identification processing, which compares a character string included in the read image information with the character string information in the memory 25 to identify the kind of the business form.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a form identification method and a form identification program that can easily identify the types of a plurality of kinds of forms, and to an optical character reading system that uses the method and program to easily identify the form to be read and to read characters efficiently.

In general, an optical character reading system optically reads characters written on a sheet of paper (hereinafter called a form). In financial institutions such as banks, for example, such systems are used to read characters from many kinds of forms that differ in paper size and character entry position, such as withdrawal slips, deposit slips, transfer slips, and utility payment transfer slips, and to enter them into a computer.

An optical character reading system used in such a financial institution is fed a mixture of the wide variety of forms described above and reads characters from a character string entry area provided at a predetermined position on each form. Before reading characters, it must therefore identify the type of the form and then read the characters from the character string entry area that corresponds to the identified form type.

The following patent document describes a technique for automatically identifying the form type. In this technique, an ID (identification information) indicating the form type is written in advance at a fixed position on the form, and the form type is identified by reading this ID.
Japanese Patent Laid-Open No. 7-152856

However, the technique described in the above patent document requires that IDs be assigned in advance to a large number of widely varying forms and that the ID be printed at the fixed position on every form, so the work of preparing the forms is cumbersome.

An object of the present invention is to provide a form identification method and a form identification program that can identify the form type at high speed without providing the form in advance with identification information for identifying the form type, and to provide an optical character reading system that can read characters efficiently by using the form identification method and program.

To achieve the above object, a first feature of the present invention is an optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of types of forms, reads an image of a form, and identifies the type of the form by a computer, wherein the computer selects and performs as desired: size identification processing that identifies the type of the form by comparing the size of the read image with the stored size information; pattern identification processing that identifies the type of the form by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and character string identification processing that identifies the type of the form by comparing a character string contained in the read image with the stored title character string information.

A second feature of the present invention is that, in the above optical character reading system, the computer limits the form types based on the form types whose size was identified by the size identification, or limits the form types whose ruled line arrangement pattern was identified by the ruled line pattern identification, and then identifies the type of the form from among the limited form types by the character string identification.

A third feature of the present invention is a form identification method for an optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of types of forms, reads an image of a form, and identifies the type of the form by a computer, wherein the computer is caused to select and execute as desired: a size identification function that identifies the type of the form by comparing the size of the read image with the stored size information; a pattern identification function that identifies the type of the form by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and a character string identification function that identifies the type of the form by comparing a character string contained in the read image with the stored title character string information.

A fourth feature of the present invention is that, in the above form identification method, the computer limits the form types based on the form types whose size was identified by the size identification function, or limits the form types whose ruled line arrangement pattern was identified by the pattern identification function, and then executes a function that identifies the type of the form from among the limited form types by the character string identification function.

A fifth feature of the present invention is a form identification program for an optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of types of forms, reads an image of a form, and identifies the type of the form by a computer, wherein the program causes the computer to select and execute as desired: a size identification function that identifies the type of the form by comparing the size of the read image with the stored size information; a pattern identification function that identifies the type of the form by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and a character string identification function that identifies the type of the form by comparing a character string contained in the read image with the stored title character string information.

A sixth feature of the present invention is that the above form identification program causes the computer to limit the form types based on the form types whose size was identified by the size identification function, or to limit the form types whose ruled line arrangement pattern was identified by the pattern identification function, and then to execute a function that identifies the type of the form from among the limited form types by the character string identification function.

According to the form identification method and form identification program of the present invention, the types of a plurality of forms are identified by selecting and performing as desired the size identification processing, the pattern identification processing, and the character string identification processing, so the form type can be identified at high speed without providing the form in advance with identification information for identifying the form type. An optical character reading system that reads characters efficiently by using this method is also proposed.

Further, according to the form identification method and form identification program of the present invention, the form types are first limited by the size identification or the pattern identification, and the type of the form is then identified from among the limited form types by character string identification, so the form type can be recognized at high speed. An optical character reading system that reads characters efficiently and at high speed by using this method is also proposed.

An embodiment of an optical character reading system to which the form identification method according to the present invention is applied is described in detail below with reference to the drawings. FIG. 1 shows examples of forms to be identified according to the present invention. FIG. 2 shows the optical character reading system according to this embodiment. FIG. 3 is an operation flow diagram of the optical character reading system according to this embodiment. FIGS. 4, 5, and 6 are each diagrams for explaining a correction screen according to this embodiment.

First, the characteristics of the form types handled in this embodiment are described with reference to FIG. 1. FIG. 1 shows the identification targets 11 of this embodiment. The forms handled in this embodiment are broadly classified into three groups: size identification target forms 12, consisting of forms A and B of different sizes; arrangement pattern identification target forms 13, consisting of forms C and D, which have the same size but differ in form type and in the positions of their ruled lines; and character string identification target forms 14, consisting of forms E and F, which have the same size and the same title character string but differ in form type (for example, types such as the withdrawal slip, deposit slip, transfer slip, and utility payment transfer slip mentioned above, whose paper size and character entry positions differ according to the purpose for which the slip was created).

For the size identification target forms 12, identifying the form type with size as the factor is fast because the processing is simple, but forms of the same size with different contents cannot be distinguished. For the arrangement pattern identification target forms 13, forms can be identified using factors such as the position and size of the ruled line frames within the form, but similar forms that differ only in part are difficult to distinguish. For the character string identification target forms 14, forms can be identified using a predetermined preprinted title character string as the factor, but processing is slow because character reading is required.

The optical character reading system to which the form identification method and program of this embodiment are applied can identify the form type at high speed for any of the kinds of forms described above. As shown in FIG. 2, the system includes a computer 23 that reads input forms 21, in which many types of forms are mixed, with a scanner device 22, stores the read image information (image data) together with the identification data for form identification described below, and performs everything from form identification to character recognition based on the image information of the form. The computer 23 comprises a memory 25, a CPU 24, and a keyboard, display, and the like (not shown). The memory 25 stores: the read image information 26; size identification data 27, which holds form type information by size corresponding to the size information of the image information 26 (for example, B5 size is form A, and A4 size is forms B, C, D, E, and F); arrangement pattern identification data 28, which holds form type information by arrangement pattern corresponding to the arrangement pattern information of the ruled lines on the form (for example, arrangement pattern 1 is form A, arrangement pattern 2 is form B, and so on); and character string identification data 29, which holds form type information by character string corresponding to the written character string information (for example, the character string "General Transfer Request Slip" (総合振替依頼票) is form C, "Membership Application Form" (会員申込書) is form D, and so on). The CPU 24 controls processing such as identifying the form type using the various data stored in the memory 25, in accordance with a series of programs including the form identification program of this embodiment.
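The three lookup tables held in the memory 25 can be pictured with a minimal sketch such as the following. The table contents mirror the examples given in the text; the class name `IdentificationData` and its field names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field

# Hypothetical in-memory layout for the identification data held in memory 25.
# Only the example entries come from the text; all names are assumptions.

@dataclass
class IdentificationData:
    # size identification data 27: paper size -> form types of that size
    by_size: dict[str, set[str]] = field(default_factory=dict)
    # arrangement pattern identification data 28: pattern number -> form types
    by_pattern: dict[int, set[str]] = field(default_factory=dict)
    # character string identification data 29: title string -> form types
    by_title: dict[str, set[str]] = field(default_factory=dict)

memory = IdentificationData(
    by_size={"B5": {"A"}, "A4": {"B", "C", "D", "E", "F"}},
    by_pattern={1: {"A"}, 2: {"B"}},
    by_title={
        "General Transfer Request Slip": {"C"},
        "Membership Application Form": {"D"},
    },
)

# A B5 image maps to a single form type; an A4 image leaves five candidates.
print(sorted(memory.by_size["B5"]), sorted(memory.by_size["A4"]))
```

Mapping each key to a set of form types, rather than to a single type, reflects the fact that one size can correspond to several forms that later stages must separate.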

Next, the operation of the optical character reading system is described with reference to FIG. 3 and the subsequent figures. Whether to perform the size, arrangement pattern, and character string identification in steps 32, 36, and 38 described below is set freely by the operator, who visually checks the characteristics of the forms to be read (the form characteristics shown in FIG. 1) and decides according to, for example, which characteristic has more or fewer forms.

The optical character reading system of this embodiment uses three factors for form identification: size, ruled line arrangement pattern, and character string. First, the CPU 24 of the computer 23 performs the following steps: step 31, in which the input forms 21 shown in FIG. 2, in which various forms are mixed, are read by the scanner device 22 and image information is acquired for each form read; step 32, which determines whether the system is set to identify the size of the image information (vertical and horizontal dimensions, number of dots, and so on); step 33, in which, when step 32 determines that size is to be identified, the size of the image, given by its vertical and horizontal dimensions (in inches or mm), its number of dots (pixels), or the like, is recognized by comparison with the size information of the size identification data 27; step 34, which determines whether the form can be specified from the recognized size (for example, the form is specified when only one form type has the identified size) and, when it can, treats the form identification as successful and reads the character data; and step 35, in which, when step 34 determines that the form type cannot be specified because multiple different form types of the same size exist, the candidates are narrowed to the form types of the identified size.
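Steps 33 to 35 can be sketched roughly as follows. The millimetre values are the JIS B5 and ISO A4 sheet sizes, and the matching tolerance is an assumption made for illustration; the source does not state how closely sizes must agree. The function and table names are hypothetical.

```python
# size identification data 27 (illustrative): (width_mm, height_mm) -> form types
SIZE_TABLE = {
    (182, 257): {"A"},                      # JIS B5
    (210, 297): {"B", "C", "D", "E", "F"},  # A4
}
TOLERANCE_MM = 3  # assumed allowance for scanning error; not given in the source

def identify_by_size(width_mm, height_mm):
    """Step 33: collect the form types whose stored size matches the image size."""
    candidates = set()
    for (w, h), forms in SIZE_TABLE.items():
        if abs(w - width_mm) <= TOLERANCE_MM and abs(h - height_mm) <= TOLERANCE_MM:
            candidates |= forms
    return candidates

def specify_or_narrow(candidates):
    """Steps 34 and 35: one candidate specifies the form; otherwise keep the narrowed set."""
    if len(candidates) == 1:
        return next(iter(candidates)), candidates
    return None, candidates

form, remaining = specify_or_narrow(identify_by_size(183, 256))
print(form, sorted(remaining))
```

An A4 image leaves several candidates, so `specify_or_narrow` returns no form and the narrowed set is handed to the later identification stages.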

Next, following step 35, or when step 32 determines that size identification is not to be performed, the optical character reading system performs: step 36, which determines whether the system is set to identify the ruled line arrangement pattern of the form; step 37, in which, when step 36 determines that the arrangement pattern is to be identified, the arrangement pattern contained in the image information, such as the positions and sizes of the ruled line frames, is identified by comparison with the arrangement pattern information of the arrangement pattern identification data 28; step 34, which determines whether the form could be specified by step 37 (for example, the form is specified when only one form type has the identified arrangement pattern) and, when it can, treats the form identification as successful and reads the character data; and step 35, in which, when step 34 determines that the form cannot be specified, the candidates are narrowed to the form types with the identified arrangement pattern.
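The comparison in step 37 is not specified in detail. One simple way to picture it, purely as an assumption, is to describe each arrangement pattern as a list of ruled line frames `(x, y, width, height)` and to accept a match when every stored frame has a counterpart in the read image within a small tolerance. All names, coordinates, and the tolerance below are hypothetical.

```python
# Illustrative stored arrangement patterns: form type -> list of (x, y, w, h) frames.
STORED_FRAMES = {
    "C": [(10, 10, 100, 20), (10, 40, 100, 60)],
    "D": [(10, 10, 60, 20), (80, 10, 30, 20)],
}

def frames_match(read_frames, stored_frames, tol=2):
    """Return True when each stored frame pairs with a distinct read frame within tol."""
    if len(read_frames) != len(stored_frames):
        return False
    unused = list(read_frames)
    for stored in stored_frames:
        hit = next(
            (r for r in unused if all(abs(a - b) <= tol for a, b in zip(r, stored))),
            None,
        )
        if hit is None:
            return False
        unused.remove(hit)
    return True

def identify_by_pattern(read_frames, candidates):
    """Step 37 (sketch): keep the candidate form types whose stored frames match."""
    return {
        form for form in candidates
        if form in STORED_FRAMES and frames_match(read_frames, STORED_FRAMES[form])
    }

scanned = [(11, 9, 100, 21), (10, 41, 99, 60)]
print(identify_by_pattern(scanned, {"C", "D"}))
```

The greedy pairing is a simplification; a production matcher would also need to handle skew, scaling, and missing or spurious lines.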

Furthermore, following step 35, or when step 36 determines that arrangement pattern identification is not to be performed, the system sequentially performs: step 38, which determines whether the system is set to identify the character string of the form; step 39, in which, when step 38 determines that the character string is to be identified, the character string information in the image information is recognized and identified by comparison with the character string information of the character string identification data 29; and step 34, which, following step 39 or when step 38 determines that character string identification is not to be performed, determines whether the form has been specified, reads the data according to the specified form type when it has, and moves to the identification failure correction screen described below when it has not. Each of these steps is executed by the computer processing functions of the form identification program.
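The overall flow of steps 32 to 39 can be sketched as a cascade in which each stage is optional and each enabled stage narrows the candidate set. This is a simplified reading of the flow, not the patented program itself: the image is represented by a plain dictionary of already-extracted features, and all names are hypothetical.

```python
# Illustrative lookup tables following the examples in the text.
SIZE = {"B5": {"A"}, "A4": {"B", "C", "D", "E", "F"}}
PATTERN = {1: {"A"}, 2: {"B"}}
TITLE = {"General Transfer Request Slip": {"C"}, "Membership Application Form": {"D"}}

def by_size(image, candidates):
    return candidates & SIZE.get(image["size"], set())

def by_pattern(image, candidates):
    return candidates & PATTERN.get(image["pattern"], set())

def by_title(image, candidates):
    return candidates & TITLE.get(image["title"], set())

STAGES = [("size", by_size), ("pattern", by_pattern), ("title", by_title)]
ALL_FORMS = {"A", "B", "C", "D", "E", "F"}

def identify_form(image, enabled):
    """Run the enabled stages in order, narrowing the candidates at each stage."""
    candidates = set(ALL_FORMS)
    for name, stage in STAGES:
        if not enabled.get(name, False):       # steps 32 / 36 / 38: stage switched on?
            continue
        candidates = stage(image, candidates)  # steps 33 / 37 / 39
        if len(candidates) == 1:               # step 34: form specified
            return next(iter(candidates))
        # step 35: carry the narrowed candidates forward to the next stage
    return None  # not specified: go to the identification failure correction screen

image = {"size": "A4", "pattern": None, "title": "Membership Application Form"}
print(identify_form(image, {"size": True, "pattern": False, "title": True}))
```

With only the size stage enabled, the same A4 image returns `None`, because five form types share that size and nothing further narrows them.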

In this way, in this embodiment the form types narrowed down and limited in step 35 are used for the narrowing in the subsequent form-specifying step 34, and the form is specified within that narrowed range. That is, the optical character reading system of this embodiment limits the form types based on the form types whose size was identified by the size identification, or limits the form types whose ruled line arrangement pattern was identified by the pattern identification, and then identifies the type of the form from among the limited form types using the character string identification.

Because the optical character reading system employing the form identification method and form identification program of this embodiment identifies the form type while narrowing it down with three identification conditions (the size of the image read from the form, the ruled line arrangement pattern, and the written character string), it can identify the form type at high speed without providing the form in advance with identification information for identifying the form type. The arrangement pattern is not limited to the arrangement of ruled lines; any arrangement that allows forms to be distinguished, such as differing arrangements of predetermined symbols, marks, or logos on the form, may be used.

Next, the processing performed by the operator on the identification failure correction screens when the form type is determined to be unspecifiable is described with reference to FIG. 4 and the subsequent figures. These screens are displayed on a display connected to the computer and are operated with a keyboard, a mouse, or the like. FIG. 4 shows the identification failure correction screen based on form size, FIG. 5 shows the identification failure correction screen based on the arrangement pattern of the form's ruled lines, and FIG. 6 shows the identification failure correction screen based on character strings.

First, the size identification failure correction screen 41 shown in FIG. 4 illustrates an example in which the operator displays the image information of the read form (in this example, the small, square "Transfer Request Application Form") at the upper right of the screen, selects "Form 1" from the ID formats at the lower left of the screen with a mouse or the like, and displays the image of the "Form 1" format overlaid on the image of the "Transfer Request Application Form" with the upper left corner as the reference point. As the figure makes clear, the operator can easily see that the form sizes differ, and by overlaying the forms of the ID formats one after another, the operator can easily identify (select) the form by size.

Next, the arrangement pattern identification failure correction screen 42 shown in FIG. 5 illustrates an example in which the operator displays the image information of the "Transfer Request Application Form", selects "Form 3" from the ID formats at the lower left of the screen with a mouse or the like, and displays the image of the "Form 3" format overlaid on the image of the "Transfer Request Application Form" with the upper left corner as the reference point. As the figure makes clear, the operator can easily see that although the form sizes are the same, the sizes and positions of the ruled lines differ.

Furthermore, the character string identification failure correction screen 43 shown in FIG. 6 displays the (input) image information of the "Transfer Request Application Form" displayed by the operator, recognizes the characters contained in the image information, and displays the recognized character string alongside the character string (keyword) set in advance for each form. In this example, the characters recognized for the keyword "振替依頼申込書" (Transfer Request Application Form) are "振?依頼??書" ("?" indicates an unrecognizable character), which shows that the form could not be identified because character recognition failed. The operator can identify the form by entering "振替依頼申込書" with the keyboard or the like.
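The failure in this example can be illustrated with a short sketch, assuming that the character string comparison is an exact match against the stored keyword (the source does not say whether partial matches are accepted). The function names are hypothetical; the two strings are the ones quoted in the text.

```python
UNRECOGNIZED = "?"  # marker the text uses for a character that could not be read

def title_matches(recognized, keyword):
    """Assumed exact comparison between the recognized title and the stored keyword."""
    return recognized == keyword

def unrecognized_positions(recognized):
    """Zero-based positions of the characters that recognition could not read."""
    return [i for i, ch in enumerate(recognized) if ch == UNRECOGNIZED]

keyword = "振替依頼申込書"
recognized = "振?依頼??書"

print(title_matches(recognized, keyword))  # comparison fails, so the form is not identified
print(unrecognized_positions(recognized))  # characters the operator must supply
```

Once the operator types the full title, the comparison succeeds and the form type can be resolved from the character string identification data.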

In this way, in the form identification method of this embodiment, the computer displays on the screen the image information of a form that could not be identified, and the form identification result can be corrected by overlaying a format on this image information or by displaying the character recognition result.

Thus, according to the form identification method, form identification program, and optical character reading system of the present invention, the types of a plurality of forms can be identified at high speed by selecting and performing as desired the size identification processing, pattern identification processing, and character string identification processing for identifying the form type.

FIG. 1: A diagram showing examples of forms to be identified according to the present invention.
FIG. 2: A diagram showing an optical character reading system according to an embodiment of the present invention.
FIG. 3: An operation flow diagram of the optical character reading system according to the present invention.
FIG. 4: A diagram for explaining the size identification failure correction screen according to this embodiment.
FIG. 5: A diagram for explaining the arrangement pattern identification failure correction screen according to this embodiment.
FIG. 6: A diagram for explaining the character string identification failure correction screen according to this embodiment.

Explanation of symbols

11: identification target, 12: size identification target form, 13: arrangement pattern identification target form, 14: character string identification target form, 21: input form, 22: scanner device, 23: computer, 24: CPU, 25: memory, 26: image information, 27: size identification data, 28: arrangement pattern identification data, 29: character string identification data, 41: size identification failure correction screen, 42: arrangement pattern identification failure correction screen, 43: character string identification failure correction screen.

Claims (6)

An optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of types of forms, reads an image of a form, and identifies the type of the form by a computer, wherein
the computer selects and performs as desired:
size identification processing that identifies the type of the form by comparing the size of the read image with the stored size information;
pattern identification processing that identifies the type of the form by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and
character string identification processing that identifies the type of the form by comparing a character string contained in the read image with the stored title character string information.
2. The optical character reading system according to claim 1, wherein the computer, after limiting the form types based on the form types whose size was identified by the size identification, or after limiting the form types to those whose ruled line arrangement pattern was identified by the ruled line pattern identification, identifies the form type from among the limited form types by the character string identification.
3. A form identification method for an optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of form types, reads an image of a form, and identifies the type of the form by a computer,
wherein the computer arbitrarily selects and executes:
a size identification function that identifies the form type by comparing the size of the read image with the stored size information;
a pattern identification function that identifies the form type by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and
a character string identification function that identifies the form type by comparing a character string included in the read image with the stored title character string information.
4. The form identification method according to claim 3, wherein the computer, after limiting the form types based on the form types whose size was identified by the size identification function, or after limiting the form types to those whose ruled line arrangement pattern was identified by the pattern identification function, executes a function of identifying the form type from among the limited form types by the character string identification function.
5. A form identification program for an optical character reading system that stores size information, ruled line arrangement pattern information, and title character information for a plurality of form types, reads an image of a form, and identifies the type of the form by a computer,
the program causing the computer to arbitrarily select and execute:
a size identification function that identifies the form type by comparing the size of the read image with the stored size information;
a pattern identification function that identifies the form type by comparing the ruled line arrangement pattern of the read image with the stored ruled line arrangement pattern information; and
a character string identification function that identifies the form type by comparing a character string included in the read image with the stored title character string information.
6. The form identification program according to claim 5, causing the computer, after limiting the form types based on the form types whose size was identified by the size identification function, or after limiting the form types to those whose ruled line arrangement pattern was identified by the pattern identification function, to execute a function of identifying the form type from among the limited form types by the character string identification function.
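The narrowing described in the dependent claims can be sketched as a two-step cascade: one inexpensive step (size or ruled line pattern) limits the candidate form types, and the title character string then decides among the remaining candidates. This is only a sketch under stated assumptions. The dictionary layout, the function names, the exact-match comparison of ruled line coordinates, the substring test for titles, and the sample data are all hypothetical and are not taken from the patent.

```python
def narrow_by_size(scan, candidates, tolerance_mm=2.0):
    """Keep candidates whose registered (width, height) is close to the scan's."""
    width, height = scan["size_mm"]
    return [
        c for c in candidates
        if abs(c["size_mm"][0] - width) <= tolerance_mm
        and abs(c["size_mm"][1] - height) <= tolerance_mm
    ]


def narrow_by_pattern(scan, candidates):
    """Keep candidates whose ruled line signature equals the scan's (toy comparison)."""
    return [c for c in candidates if c["ruled_lines"] == scan["ruled_lines"]]


def identify_by_title(scan, candidates):
    """Keep candidates whose registered title occurs in the recognized text."""
    return [c for c in candidates if c["title"] in scan["text"]]


NARROWING = {"size": narrow_by_size, "pattern": narrow_by_pattern}


def identify_form(scan, templates, narrow_with="size"):
    """Limit the candidates with one stage, then decide by title character string."""
    candidates = NARROWING[narrow_with](scan, templates)
    matches = identify_by_title(scan, candidates)
    # Exactly one match counts as success; any other outcome returns None
    # so that the caller can hand the form over for manual correction.
    return matches[0]["name"] if len(matches) == 1 else None


# Illustrative registrations (hypothetical values).
templates = [
    {"name": "invoice", "size_mm": (210.0, 297.0),
     "ruled_lines": ((20, 50), (20, 120)), "title": "INVOICE"},
    {"name": "purchase_order", "size_mm": (210.0, 297.0),
     "ruled_lines": ((20, 60), (20, 200)), "title": "PURCHASE ORDER"},
    {"name": "receipt", "size_mm": (148.0, 210.0),
     "ruled_lines": ((10, 40),), "title": "RECEIPT"},
]

scan = {"size_mm": (209.5, 296.8),
        "ruled_lines": ((20, 60), (20, 200)),
        "text": "PURCHASE ORDER No. 1234"}

result_via_size = identify_form(scan, templates, narrow_with="size")
result_via_pattern = identify_form(scan, templates, narrow_with="pattern")
```

In this sketch the title comparison runs only against the candidates that survive the first step, which is where the speed benefit would come from when many form types are registered.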
JP2007156844A 2007-06-13 2007-06-13 Form identification method, form identification program, and optical character reading system using the form identification method Active JP5051756B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007156844A JP5051756B2 (en) 2007-06-13 2007-06-13 Form identification method, form identification program, and optical character reading system using the form identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2007156844A JP5051756B2 (en) 2007-06-13 2007-06-13 Form identification method, form identification program, and optical character reading system using the form identification method

Publications (2)

Publication Number Publication Date
JP2008310531A (en) 2008-12-25
JP5051756B2 (en) 2012-10-17

Family

ID=40238083

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007156844A Active JP5051756B2 (en) 2007-06-13 2007-06-13 Form identification method, form identification program, and optical character reading system using the form identification method

Country Status (1)

Country Link
JP (1) JP5051756B2 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011013587A1 (en) * 2009-07-27 2011-02-03 株式会社日立ソリューションズ Document data processing device
JP2017090974A (en) * 2015-11-02 2017-05-25 富士ゼロックス株式会社 Image processing device and program
JP2019207735A (en) * 2019-08-29 2019-12-05 株式会社Pfu Mobile terminal, image processing method, and program
US10885375B2 (en) 2016-03-17 2021-01-05 Pfu Limited Mobile terminal, image processing method, and computer-readable recording medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH09204492A (en) * 1996-01-26 1997-08-05 Toshiba Corp Slip processor
JPH11296676A (en) * 1998-04-08 1999-10-29 Oki Electric Ind Co Ltd Image data classification method and image data registration method
JP2001312694A (en) * 2000-05-01 2001-11-09 Hitachi Ltd Method and device for recognizing many kinds of slips
JP2003331216A (en) * 2002-05-16 2003-11-21 Oki Electric Ind Co Ltd Business form reading method
JP2004005268A (en) * 2002-05-31 2004-01-08 Toshiba Corp Business form identifying device, business form defining method and business form identifying method

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011013587A1 (en) * 2009-07-27 2011-02-03 株式会社日立ソリューションズ Document data processing device
JP2011028568A (en) * 2009-07-27 2011-02-10 Hitachi Solutions Ltd Document data processing device
CN102473176A (en) * 2009-07-27 2012-05-23 株式会社日立解决方案 Document data processing device
US8768941B2 (en) 2009-07-27 2014-07-01 Hitachi Solutions, Ltd. Document data processing device
JP2017090974A (en) * 2015-11-02 2017-05-25 富士ゼロックス株式会社 Image processing device and program
US10885375B2 (en) 2016-03-17 2021-01-05 Pfu Limited Mobile terminal, image processing method, and computer-readable recording medium
JP2019207735A (en) * 2019-08-29 2019-12-05 株式会社Pfu Mobile terminal, image processing method, and program

Also Published As

Publication number Publication date
JP5051756B2 (en) 2012-10-17

Legal Events

Date Code Title Description
RD02 Notification of acceptance of power of attorney; Free format text: JAPANESE INTERMEDIATE CODE: A7422; Effective date: 20091211
A621 Written request for application examination; Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20100525
A521 Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20100617
A977 Report on retrieval; Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20110722
A131 Notification of reasons for refusal; Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20110801
A521 Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20110909
A131 Notification of reasons for refusal; Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20120306
A521 Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20120412
TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20120718
A01 Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01
A61 First payment of annual fees (during grant procedure); Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20120719
R150 Certificate of patent or registration of utility model; Ref document number: 5051756; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150
FPAY Renewal fee payment (event date is renewal date of database); Free format text: PAYMENT UNTIL: 20150803; Year of fee payment: 3
S111 Request for change of ownership or part of ownership; Free format text: JAPANESE INTERMEDIATE CODE: R313111
R350 Written notification of registration of transfer; Free format text: JAPANESE INTERMEDIATE CODE: R350
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250 Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250