JPH0498363A

JPH0498363A - Knowledge processing system for continuous field

Info

Publication number: JPH0498363A
Application number: JP2211831A
Authority: JP
Inventors: Koji Inami; 康治井波
Original assignee: PFU Ltd
Current assignee: PFU Ltd
Priority date: 1990-08-10
Filing date: 1990-08-10
Publication date: 1992-03-31

Abstract

PURPOSE: To allow a field on a form to be divided, and to allow knowledge processing of the divided field, by joining the continuation fields using the joining information for continuation fields held in an OCR definition body. CONSTITUTION: When a combination processing unit 11 receives the reading results from a reading unit 6, which reads the character strings in continuation fields 52 and 53 separately, it uses the information in the optical character reader (OCR) definition body 9 to identify them as strings read from continuation fields 52 and 53. It then joins them in a prescribed order, namely with the string read from continuation field 53 immediately after the string read from continuation field 52. A knowledge processing program 3 then performs address knowledge processing based on the resulting kana (Japanese syllabary) address string 51' and outputs address data as the prescribed data. Thus a field on a form can be divided, and knowledge processing can still be performed on the divided field.

Description

[Detailed description of the invention]

[Summary]

The invention relates to a knowledge processing method for continuation fields, which performs knowledge processing after joining continuation fields that have been divided into multiple parts on a form. Its purpose is to allow a field to be divided on a form and to allow knowledge processing of the divided field. In an OCR processing system comprising a reading unit that reads the character strings entered in continuation fields divided into multiple parts on a form, and a knowledge processing program that performs knowledge processing on the character strings read by the reading unit and outputs predetermined data, a combination processing unit is provided that joins the character strings read from the continuation fields. The knowledge processing program is configured to perform knowledge processing based on the character string joined by the combination processing unit and to output the predetermined data.

[Industrial application field]

The present invention relates to a knowledge processing method for continuation fields, and more particularly to a knowledge processing method for continuation fields that performs knowledge processing after merging continuation fields that have been divided into multiple parts on a form.

In recent years, the work of encoding addresses written on forms, such as application forms for life insurance, has increasingly been performed by OCR (optical character reader) processing systems. For example, a character string written in kana (katakana) in a kana address field at a predetermined position on the form is read by OCR and then subjected to knowledge processing to improve the recognition rate.

[Conventional technology]

The position of a field (entry column) on a form, for example a kana address field, is determined in advance. Its contents are also predetermined and usually consist of the "prefecture name", the "county name or city/ward name", the "town or village name, etc.", and the "street number". Of these, the "prefecture name" is called the first level, the "county name or city/ward name" the second level, and the "town or village name, etc." the third level.

For example:
First level: カナガワケン (Kanagawa-ken)
Second level: カワサキシナカハラク (Kawasaki-shi Nakahara-ku)
Third level: カミオダナカ (Kamiodanaka)
Street number: 1-1-1

These items are entered in the kana address field, for example as handwritten characters.

Conventionally, as shown in FIG. 6, after such a handwritten character string is read by OCR, corresponding candidate characters are generated and used to access an address dictionary for knowledge processing. That is, the address dictionary is accessed using the postal code or (the candidate characters of) the first level. For example, when the postal code is used, the corresponding first to third levels are read from the dictionary and compared with the candidate characters, and they are output if they match. Kana-kanji conversion is performed at the same time.
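The conventional postal-code lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layout, the function name `conventional_lookup`, and the sample entry are hypothetical and are not taken from the patent.

```python
# Hypothetical, tiny address dictionary keyed by postal code.
# Each entry holds the first to third levels in kana and in kanji
# (illustrative data, not the patent's dictionary format).
ADDRESS_DICTIONARY = {
    "211": {
        "kana": ("カナガワケン", "カワサキシナカハラク", "カミオダナカ"),
        "kanji": ("神奈川県", "川崎市中原区", "上小田中"),
    },
}


def conventional_lookup(postal_code, candidates):
    """Return the kanji levels if any OCR candidate matches the dictionary entry."""
    entry = ADDRESS_DICTIONARY.get(postal_code)
    if entry is None:
        return None
    # The three kana levels read from the dictionary, separated by blanks.
    expected = " ".join(entry["kana"])
    for candidate in candidates:
        if candidate == expected:
            # A match doubles as kana-kanji conversion: return the kanji levels.
            return entry["kanji"]
    return None


# Two candidate strings for one single-line field; the first contains a misread character.
candidates = [
    "カナガワケン カワサキンナカハラク カミオダナカ",
    "カナガワケン カワサキシナカハラク カミオダナカ",
]
result = conventional_lookup("211", candidates)
```

The sketch depends on the whole address arriving as one string, which is why the conventional approach requires a single undivided field.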

The postal code is the one entered in the postal code field on the form; it is read by OCR and used for this access.

Such a field, however, cannot be divided, because of the needs of knowledge processing. The kana address field, for example, must therefore be provided as a single (one-line) field. Here, dividing a field means that one field extends over multiple lines on the form, or that one field is separated into multiple parts within the same line.

[Problem to be solved by the invention]

According to the conventional technology described above, a field cannot be divided, so the shape of the form and the arrangement of fields on the form are heavily constrained.

The kana address field, for example, becomes extremely wide because, as described above, the first to third levels and the street number must all be entered in it.

In this case, the form must either be landscape-oriented paper so that the kana address field fits on one line, or paper whose width is at least the length of the kana address field. Moreover, placing this wide kana address field on one line constrains the arrangement of the other fields on the form.

If a field such as the kana address field is nevertheless divided because of the shape of the form or the arrangement of fields on it, the following problem arises.

Dividing the kana address field splits the second- or third-level character string at an unspecified position. As it stands, knowledge processing then cannot be performed adequately (so in this case the operator ends up entering all of the data from the keyboard).
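A small sketch of why the split defeats dictionary lookup, assuming a hypothetical index `THIRD_LEVEL_INDEX` from kana town names to kanji. The field contents and helper name are illustrative only.

```python
# Hypothetical third-level index: kana town name -> kanji.
THIRD_LEVEL_INDEX = {"カミオダナカ": "上小田中"}

# Strings as written in two separate fields; the third level is cut after "カミ".
field_52 = "カワサキシナカハラク カミ"
field_53 = "オダナカ 1-1-1"


def third_level_hits(text):
    """Return dictionary hits among the blank-separated tokens of one string."""
    return [THIRD_LEVEL_INDEX[token] for token in text.split() if token in THIRD_LEVEL_INDEX]


# Looked up field by field, neither fragment ("カミ", "オダナカ") is in the index.
hits_separate = third_level_hits(field_52) + third_level_hits(field_53)

# Shown only for contrast: once the two strings are concatenated, the town name is whole again.
hits_joined = third_level_hits(field_52 + field_53)
```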

An object of the present invention is to provide a knowledge processing method for continuation fields that allows a field to be divided on a form and allows knowledge processing of the divided field.

[Means for solving the problem]

FIG. 1 is a diagram of the principle configuration of the present invention and shows an OCR processing system according to the present invention.

In FIG. 1, 1 is a processing device, 3 is a knowledge processing program, 5 is a form, 51 is a kana address field, 52 and 53 are continuation fields, 6 is a reading unit, 9 is an OCR definition body, 10 is an address dictionary, and 11 is a combination processing unit.

Form 5 is the object processed by the OCR processing system and has continuation fields 52 and 53 at predetermined positions. Continuation fields 52 and 53 are obtained by dividing what is originally a single field, for example kana address field 51, into multiple parts. Consequently, a character string that should originally be entered as one string is entered in continuation fields 52 and 53 split into multiple parts at an unspecified position.

Reading unit 6 reads the information entered on form 5, in particular the character strings entered in continuation fields 52 and 53.

Combination processing unit 11 joins the character strings read from continuation fields 52 and 53 by reading unit 6 in a predetermined order, restoring the original single character string.

Knowledge processing program 3 performs predetermined knowledge processing based on the character string joined by combination processing unit 11 and outputs predetermined data.

[Operation]

FIG. 2 is a diagram explaining the operation of the present invention.

For example, as illustrated, suppose that continuation fields 52 and 53 are obtained by dividing kana address field 51 into two. In this case, the person filling in the form enters the kana (katakana) character string representing the address in continuation fields 52 and 53. The third level "カミオダナカ" (Kamiodanaka) is then split between continuation fields 52 and 53 into "カミ" (Kami) and "オダナカ" (Odanaka).

When combination processing unit 11 is passed the reading results from reading unit 6, which has read the character strings in continuation fields 52 and 53 separately, it identifies them as strings read from continuation fields 52 and 53, for example by using the information in OCR definition body 9. It then joins them in a predetermined order; that is, it appends the string read from continuation field 53 immediately after the string read from continuation field 52. This yields the single kana address string 51' that should originally have been entered in a single kana address field 51.

At this point, the third level "Kamiodanaka", which had been split, becomes a single third-level character string.

Knowledge processing program 3 performs, as its knowledge processing, address knowledge processing based on kana address string 51' and outputs address data as the predetermined data. That is, it accesses address dictionary 10 based on the third level (town or village name, etc.) contained in kana address string 51' and outputs the address string corresponding to kana address string 51'.

Thus, according to the present invention, even if one field is divided into multiple continuation fields 52 and 53, knowledge processing can be performed after joining them. This removes the constraints on the shape of the form and on the arrangement of fields on it, making form design easier. Furthermore, even when a field is divided, the resulting data is the same as if it had not been divided, so no extra burden is placed on the operators or user programs that use the data.

[Embodiment]

FIG. 3 is a configuration diagram of an embodiment and shows an OCR processing system.

In FIG. 3, 2 is a form processing unit, 4 is an access table, 7 is a display device, and 8 is a data file.

Form processing unit 2 is provided in processing device 1, which consists of a central processing unit (CPU) and memory, and performs various kinds of processing on form 5. To this end, reading unit 6 sends the information read from form 5 to form processing unit 2. To process the information from reading unit 6, form processing unit 2 reads in OCR definition body 9.

OCR definition body 9 stores various information for processing forms 5 of the corresponding predetermined format.

This information includes, for example, the paper size of form 5 and the position, size, and content type (for example, that it is an address) of each field (entry column) on form 5.

Form processing unit 2 interprets the information optically read by reading unit 6 (the read character strings) using OCR definition body 9. For example, from OCR definition body 9 it learns the positions of continuation fields 52 and 53 and that these continuation fields are the two halves of kana address field 51. From this it recognizes that the character strings read at those positions represent an address (kana address strings).

For this purpose, OCR definition body 9 stores, for each field on form 5, its continuation relationship with other fields. The format of OCR definition body 9 shown in FIG. 3 is one example. For each field, the following are stored against its field name: information such as the field's address on form 5, the field's attribute information, and, as information on joining the field, a continuation field name. For example, if the field names of continuation fields 52 and 53 are A and B respectively, continuation field name B is stored for field name A, and information indicating that there is no continuation field (END) is stored for field name B. The attribute information records that the field holds an address string.
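The per-field entries described above (position, attribute, continuation field name, END marker) could be modeled roughly as below. The dictionary layout, the coordinate values, the `ZIP` field, and the function `continuation_chain` are assumptions made for illustration; the patent only shows the format in FIG. 3.

```python
END = "END"  # marker meaning "no further continuation field"

# Hypothetical in-memory form of the OCR definition; keys and values are illustrative.
OCR_DEFINITION = {
    "A": {"position": (10, 40), "attribute": "address", "continuation": "B"},
    "B": {"position": (10, 50), "attribute": "address", "continuation": END},
    "ZIP": {"position": (10, 30), "attribute": "postal_code", "continuation": END},
}


def continuation_chain(definition, start):
    """Follow continuation names from `start` until END; return field names in order."""
    chain = []
    seen = set()
    name = start
    while name != END:
        if name in seen:
            # Guard against a malformed definition that loops back on itself.
            raise ValueError(f"continuation cycle at field {name!r}")
        seen.add(name)
        chain.append(name)
        name = definition[name]["continuation"]
    return chain


address_chain = continuation_chain(OCR_DEFINITION, "A")
zip_chain = continuation_chain(OCR_DEFINITION, "ZIP")
```

Storing only the name of the next field keeps each entry small and lets a field be divided into more than two parts without changing the format.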

When combination processing unit 11 is passed the strings read from continuation fields 52 and 53 by form processing unit 2, it identifies them as belonging to continuation fields 52 and 53 and therefore as strings to be joined. It also obtains from form processing unit 2 the joining information for continuation fields 52 and 53 held in OCR definition body 9, and joins the strings accordingly. That is, it appends the string read from continuation field 53 immediately after the string read from continuation field 52, without inserting a blank or the like.
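The joining step can be sketched as a concatenation that follows the continuation names and inserts no separator. The table `CONTINUATION`, the function name, and the sample strings are hypothetical.

```python
END = "END"

# Hypothetical continuation table: field name -> next field name (or END).
CONTINUATION = {"A": "B", "B": END}


def join_continuation_fields(read_strings, first_field, continuation=CONTINUATION):
    """Concatenate the read strings of a continuation chain with no separator."""
    parts = []
    name = first_field
    while name != END:
        parts.append(read_strings[name])
        name = continuation[name]
    # No blank is inserted between fields, so a word split across fields is restored.
    return "".join(parts)


# Strings read separately from the two continuation fields (52 -> "A", 53 -> "B").
read_strings = {"A": "カワサキシナカハラク カミ", "B": "オダナカ 1-1-1"}
kana_address = join_continuation_fields(read_strings, "A")
```

Joining without a blank matters here because blanks are what separate the levels in the address string; an inserted blank would turn "カミ" and "オダナカ" into two spurious levels.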

Form processing unit 2 processes the read character strings for the various fields on form 5. For continuation fields 52 and 53, it generates candidate character strings from the read string joined by combination processing unit 11, passes the read string and the candidate strings to knowledge processing program 3, and requests knowledge processing.

On receiving the read string and candidate strings, knowledge processing program 3 performs knowledge processing using various dictionaries.

When continuation fields 52 and 53 are a division of kana address field 51, (address) knowledge processing program 3 performs address knowledge processing on the read string (kana address string) using address dictionary 10.

To do this, knowledge processing program 3 creates access table 4. That is, it selectively extracts only the third level from the read string and from each candidate string, and stores the third level of each candidate string against the third level of the read string. If there are two or more candidate strings for one read string, the third levels of the candidate strings are arranged in descending order of priority. Following the access table 4 it has created, knowledge processing program 3 accesses address dictionary 10 using the third levels of the candidate strings in priority order (third-level access). If an address string corresponding to the third level exists, it is read out.
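The access table and the priority-ordered third-level access might look like the sketch below. It assumes that candidate strings arrive already sorted by priority, that levels are separated by blanks, and that the dictionary is a plain mapping from third-level kana to a full kana address string; all names and data are hypothetical.

```python
def third_level(text):
    """Third level = second blank-separated token of 'level2 level3 street-number'."""
    return text.split()[1]


def build_access_table(read_string, candidates):
    """Map the read string's third level to the candidates' third levels, best first.

    `candidates` is assumed to be sorted by priority, highest first.
    """
    ordered = []
    for candidate in candidates:
        level = third_level(candidate)
        if level not in ordered:  # keep only the highest-priority occurrence
            ordered.append(level)
    return {third_level(read_string): ordered}


def third_level_access(access_table, dictionary):
    """Try candidate third levels in priority order; return the first dictionary hit."""
    for levels in access_table.values():
        for level in levels:
            if level in dictionary:
                return dictionary[level]
    return None


# Hypothetical dictionary searchable by third level.
ADDRESS_DICTIONARY = {"カミオダナカ": "カナガワケン カワサキシナカハラク カミオダナカ"}

read_string = "カワサキシナカハラク カミオタナカ 1-1-1"  # one character misread
candidates = [
    "カワサキシナカハラク カミオクナカ 1-1-1",
    "カワサキシナカハラク カミオダナカ 1-1-1",
]
access_table = build_access_table(read_string, candidates)
hit = third_level_access(access_table, ADDRESS_DICTIONARY)
```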

For this purpose, address dictionary 10 stores address strings in a format that can be searched by third level.

Knowledge processing program 3 creates address data using the address string obtained from address dictionary 10 by third-level access. First, it compares the second level of that address string with the second level of the candidate string. If they match, it creates address data from the address string. For example, it creates kanji data, associates it with the kana data, creates the kanji and kana data for the first level, and combines these into a single item of address data.
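One way to picture this step: given a dictionary entry that carries both kana and kanji levels, check the second level against the candidate and, on a match, assemble one address item. The `AddressData` class, the entry layout, and the way the street number is appended are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class AddressData:
    kana: str
    kanji: str


def make_address_data(entry, candidate_second_level, street_number):
    """Build address data only if the entry's second level matches the candidate's."""
    kana_levels = entry["kana"]
    kanji_levels = entry["kanji"]
    if kana_levels[1] != candidate_second_level:
        return None
    # The first level comes from the dictionary entry, since the form omits it.
    return AddressData(
        kana=" ".join(kana_levels) + " " + street_number,
        kanji="".join(kanji_levels) + street_number,
    )


# Hypothetical dictionary entry found by third-level access.
entry = {
    "kana": ("カナガワケン", "カワサキシナカハラク", "カミオダナカ"),
    "kanji": ("神奈川県", "川崎市中原区", "上小田中"),
}
address = make_address_data(entry, "カワサキシナカハラク", "1-1-1")
```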

Display device 7 displays predetermined data to the user. That is, knowledge processing program 3 displays predetermined data, for example address data, on display device 7. The user viewing it corrects the data by input from an input device such as a keyboard (not shown).

Data file 8 is a file for storing the predetermined data and belongs to the user program that processes form 5. That is, knowledge processing program 3 outputs the (corrected) predetermined data to data file 8.

FIG. 4 shows the knowledge processing flow for continuation fields, and FIG. 5 shows an example of knowledge processing for continuation fields.

Address knowledge processing is described below following FIG. 4, with reference to FIG. 5.

(1) Reading unit 6 reads continuation fields 52 and 53 and the other fields on form 5 and sends the results to form processing unit 2.

Continuation fields 52 and 53 are a division of kana address field 51. They therefore contain, as shown in FIG. 5(A), the second level, third level, and street number of the address.

The third level "Kamiodanaka" is split across the two continuation fields 52 and 53. A blank is inserted between the second and third levels and between the third level and the street number, and is used to distinguish them.

(2) Form processing unit 2 reads in OCR definition body 9 and interprets the read strings sent from the reading unit. It thereby recognizes that the strings corresponding to continuation fields 52 and 53 are to be joined (that fields 52 and 53 are continuation fields) and that they contain address information.

(3) Following instructions from form processing unit 2, combination processing unit 11 appends the read string corresponding to continuation field 53 immediately after the read string corresponding to continuation field 52. This yields kana address string 51', which contains the third level "Kamiodanaka" as a single string, as shown in FIG. 5(D).

Form processing unit 2 generates candidate character strings corresponding to this kana address string (read string) 51'.

(4) Knowledge processing program 3 receives the read string and candidate strings from form processing unit 2 and uses them to perform predetermined processing. First, to perform third-level access, knowledge processing program 3 creates access table 4.

That is, it scans the read string and candidate strings and extracts the string between the first blank and the second blank, "Kamiodanaka", as the third level. It then arranges the third levels of the candidate strings in priority order. This processing is possible because the third level is no longer split.
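The blank-delimited extraction described above can be sketched as a small parser for the joined string. The type `KanaAddress` and the function `parse_kana_address` are hypothetical names, and the sketch assumes exactly three blank-separated parts (second level, third level, street number).

```python
from typing import NamedTuple, Optional


class KanaAddress(NamedTuple):
    second_level: str
    third_level: str
    street_number: str


def parse_kana_address(text: str) -> Optional[KanaAddress]:
    """Split 'level2 level3 street-number' on blanks; return None if the shape is wrong."""
    parts = text.split()
    if len(parts) != 3:
        return None
    return KanaAddress(*parts)


# The joined string has the expected three parts.
joined = parse_kana_address("カワサキシナカハラク カミオダナカ 1-1-1")

# A single unjoined field does not: the text between its blanks is not a whole level.
fragment = parse_kana_address("オダナカ 1-1-1")
```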

Knowledge processing program 3 accesses address dictionary 10 using the third level of the candidate string ("Kamiodanaka") and checks whether an address string matching that third level is stored.

(5) If address dictionary 10 contains a matching address string (knowledge processing succeeds), knowledge processing program 3 first performs kana-kanji conversion. For example, the matching address string is a kana (katakana) string, but address dictionary 10 is arranged to store the corresponding kanji data or string as well, and the conversion is performed by reading this out together with the matching address string.

Knowledge processing program 3 compares the second level of the address string read from address dictionary 10 with the second level of the candidate string. The address string read out is, for example, "カナガワケン カワサキシナカハラク カミオダナカ" (Kanagawa-ken Kawasaki-shi Nakahara-ku Kamiodanaka), as shown in FIG. 5(E); that is, it consists of the first to third levels. Its second level is recognized as the string between the first blank and the second blank, "カワサキシナカハラク". The second level of the candidate string, on the other hand, is recognized as the string from the beginning up to the first blank, "カワサキシナカハラク".

There are just under 30 (27, to be exact) cases in which second-level kana readings coincide, and even in those cases the second and third levels never both coincide. Therefore, if the second levels match in this comparison, the address string is adopted as the one used to create the output (address data). At this point the address string has been corrected, as shown in FIG. 5(E), so that it consists of a kana string and a kanji string, each comprising the first to third levels and the street number.
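The second-level check can be sketched as a filter over the address strings returned by third-level access. Note the different token positions: the dictionary string starts with the first level, while the candidate string starts with the second. The function name and the sample entries are illustrative assumptions.

```python
def select_by_second_level(dictionary_hits, candidate):
    """Return the dictionary string whose second level equals the candidate's.

    Dictionary strings are 'level1 level2 level3'; candidates are
    'level2 level3 street-number', so the second level sits at index 1 and 0.
    """
    wanted = candidate.split()[0]
    for hit in dictionary_hits:
        if hit.split()[1] == wanted:
            return hit
    return None


# Two illustrative entries that share a third-level reading but differ at the second level.
hits = [
    "トウキョウト ナカノク ホンチョウ",
    "カナガワケン カワサキシカワサキク ホンチョウ",
]
chosen = select_by_second_level(hits, "カワサキシカワサキク ホンチョウ 1-1")
```

Because the third level selects the entries and the second level then disambiguates them, the first level never has to be written on the form.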

The processing above is likewise made possible by joining (the strings corresponding to) the divided continuation fields 52 and 53.

(6) From the kana string and kanji string, knowledge processing program 3 generates the address data, that is, kana data and kanji data.

(7) Knowledge processing program 3 draws a frame for displaying the kanji data and the like at a predetermined position on the screen of display device 7, as shown in FIG. 5(F), and displays the kanji data in this frame. This screen is a correction screen on which the user corrects the displayed kanji data and the like by input from a keyboard or similar device. On the correction screen, the frame presents the data on one line so that the operator can read it easily (the kana string and the kanji string each on one line). Seen from form 5, it displays the strings entered in the two fields (two lines) 52 and 53.

(8) Knowledge processing program 3 displays the ANK (alphanumeric and kana) data within the frame of the correction screen. The correction screen then appears as shown in FIG. 5(F).

Even if address dictionary 10 contains no matching address string, the frame is still opened on the correction screen and, for example, the read string or candidate strings are displayed.

(9) After correction by the user, knowledge processing program 3 outputs the address data as the predetermined data, for example to data file 8. The address data is output as a record consisting of record information, kana data, and kanji data, as shown in FIG. 5(E). Within data file 8, this data is handled as a single item (record) so that user programs can handle it easily. Seen from form 5, it is the strings of the two fields (two lines) 52 and 53 merged into one.

[Effect of the invention]

As described above, according to the present invention, the combination processing unit joins the continuation fields using the joining information for continuation fields held in the OCR definition body. This makes knowledge processing possible for continuation fields as well, which in turn allows continuation fields to be placed on a form and removes the constraints on the shape of the form and on the arrangement of fields on it.

[Brief explanation of drawings]

FIG. 1 is a diagram of the principle configuration of the present invention. FIG. 2 is a diagram explaining the operation of the present invention. FIG. 3 is a configuration diagram of an embodiment. FIG. 4 shows the knowledge processing flow for continuation fields. FIG. 5 shows an example of knowledge processing for continuation fields. FIG. 6 is a diagram explaining the prior art.

1 is a processing device, 2 is a form processing unit, 3 is a knowledge processing program, 4 is an access table, 5 is a form, 51 is a kana address field, 52 and 53 are continuation fields, 6 is a reading unit, 7 is a display device, 8 is a data file, 9 is an OCR definition body, and 10 is an address dictionary.

Patent applicant: PFU Ltd. Agent: Patent attorney Hiroshi Morita (and 2 others)

Claims

[Claims]

(1) In an OCR processing system comprising a reading unit (6) that reads character strings entered in continuation fields (52, 53) divided into multiple parts on a form (5), and a knowledge processing program (3) that performs knowledge processing on the character strings read by the reading unit (6) and outputs predetermined data, a knowledge processing method for continuation fields, characterized in that a combination processing unit (11) for joining the character strings read from the continuation fields (52, 53) is provided, and the knowledge processing program (3) performs knowledge processing based on the character string joined by the combination processing unit (11) and outputs the predetermined data.

(2) The knowledge processing method for continuation fields according to claim 1, characterized in that an OCR definition body (9) storing information for processing the form (5) is provided and stores information on joining the continuation fields (52, 53), and the combination processing unit (11) obtains the information on joining the continuation fields (52, 53) held in the OCR definition body (9) and uses it to join the character strings read from the continuation fields (52, 53).

(3) The knowledge processing method for continuation fields, characterized in that an address dictionary (10) for storing address strings is provided; the reading unit (6) reads the character strings entered in the continuation fields (52, 53), which are a division of a kana address field (51); the combination processing unit (11) joins the character strings read from the continuation fields (52, 53) to form a kana address string; and the knowledge processing program (3) accesses the address dictionary (10) based on the third level in the kana address string, reads the address string corresponding to that third level from the address dictionary (10), and outputs address data as the predetermined data based on it.