JP2005004395A

JP2005004395A - Business form, method and program for form processing, recording medium where form processing program is recorded, and form processing apparatus

Info

Publication number: JP2005004395A
Application number: JP2003165876A
Authority: JP
Inventors: Masaki Nakagawa; 正樹中川; Hekiran Shu; 碧蘭朱
Original assignee: Tokyo University of Agriculture and Technology NUC
Current assignee: Tokyo University of Agriculture and Technology NUC
Priority date: 2003-06-11
Filing date: 2003-06-11
Publication date: 2005-01-06
Anticipated expiration: 2023-06-11
Also published as: JP4117648B2

Abstract

PROBLEM TO BE SOLVED: To stably read information that is superposed using dot shapes designed to increase the amount of information that can be superposed.

SOLUTION: An active form processing system inputs a form image. From the input form image, entry frame data and handwritten data are separated, and each entry frame, together with the information bar inside it on which specified information is superposed as codes, is detected based on horizontal and vertical histograms of the form image. The superposed information on the detected information bar is then recognized; it includes an attribute indicating the kind of information to be entered by handwriting, the kind of characters to be entered, and a processing instruction to apply after reading. Character recognition of the handwritten data is carried out according to the recognized attribute and/or character kind. From the recognized character information and the superposed information, instruction form data for making another application system perform specified processing are generated and/or stored.

COPYRIGHT: (C)2005, JPO&NCIPI

Description

【０００１】
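【Technical Field of the Invention】
The present invention relates to a form, a form processing method, a form processing program, a recording medium on which the form processing program is recorded, and a form processing apparatus. In particular, it relates to a form that can be used for form reading at service counters, offices, and the like and that carries a description of its own processing method on the form itself, and to a form processing method, a form processing program, a recording medium recording the program, and a form processing apparatus for processing that form.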
【０００２】
【従来の技術】
従来、紙とコンピュータ処理の橋渡しとして、光学帳票読取り装置や、紙上の筆記から動的に運筆情報を読み取るタブレット、ｅ−ｐｅｎ、Ａｎｏｔｏｐｅｎなどの装置が知られている。しかし、後者は、筆記時に特別なデバイスが必要となる。
また、帳票の記入枠から手書きを抽出するために、記入枠にドロップアウトカラーが用いられることがある。しかし、カラーの使用はコストを高め、現在広く使われているモノクロコピー、モノクロファックスが利用できなくなるという課題がある。印刷コストや読取り装置のコスト、モノクロファックスの利用などの理由から、ドロップアウトを用いた帳票は減少し、モノクロ帳票が主流になりつつある。また、レーザビームプリンタの高精度化や、ＰｏｓｔＳｃｒｉｐｔなどのページ記述言語の一般化により、微細なドットテクスチャからなる帳票を安価なプリンタで印刷することが可能になっている。さらに、帳票作成エディタを作成することにより、誰でも簡単に帳票をデザインすることができる。手書きの帳票は入力が容易で、かつ、証拠として残ることから、電子化が進んでもなくならず、その手軽な電子化のニーズは高まっている。
【０００３】
なお、ドットで構成された記入枠を有する帳票に対する処理方法が提案されている（例えば、特許文献１、非特許文献３参照）。また、帳票に記入された文字を認識し、修正する方法が提案されている（例えば、特許文献２、３参照）。帳票側に処理を記載したアクティブ帳票システムが提案されている（例えば、特許文献１、非特許文献１、２参照）。
また、線や交点の検出から帳票構造を認識する方法が提案されている（例えば、非特許文献４、５参照）。
【０００４】
【特許文献１】
国際公開第０１／９３１８８号パンフレット
【特許文献２】
特開２００１−０５２１１０号公報
【特許文献３】
特開２００１−０５２１１１号公報
【非特許文献１】
島村太郎、朱碧蘭、増田厚司、小沼元輝、櫻田武嗣、黒沼靖、中川正樹、「アクティブ帳票システムの設計と試作」、信学技報、２００２年１２月、ＰＲＭＵ２００２−１３２、Ｖｏｌ．１０２、Ｎｏ．５３１、ｐｐ．１９−２４．
【非特許文献２】
朱碧蘭、島村太郎、中川正樹、「アクティブ帳票処理技術の試作」、電子情報通信学会、信学技報、２００２年１２月、ＰＲＭＵ２００２、Ｖｏｌ．１０２、Ｎｏ．５３１、ｐｐ．２５−３０．
【非特許文献３】
前田陽二、中川正樹、「ペーパインタフェースによる文書編集方式」、情報処理学会論文誌、２０００年、Ｖｏｌ．４１、Ｎｏ．５、ｐｐ．１３０８−１３１６．
【非特許文献４】
T. Watanabe, Q. Luo, and N. Sugie, "Layout recognition of multi-kinds of table-form documents," IEEE Trans. on Pattern Analysis and Machine Intelligence, 1995, Vol. 17, No. 4, pp. 432-445.
【非特許文献５】
L. Y. Tseng and R. C. Chen, "Recognition and data extraction of form documents based on three types of line segments," Pattern Recognition, 1998, Vol. 31, No. 10, pp. 1525-1540.
【０００５】
【発明が解決しようとする課題】
しかしながら、モノクロ帳票では、帳票記入枠と手書きの分離が困難な場合があった。例えば、手書き文字が記入枠と重なった場合や、帳票の汚れ及び読み取りノイズがある場合には、正確に分離できない場合があった。また、記入された文字は、文字認識することが可能となっているが、なお誤読が生じる場合がある。特に、メールアドレスの認識には文脈処理が使えず、また、その後の処理に大きな影響を与えることが想定される。
本発明は、以上の点に鑑み、窓口やオフィス等で利用される帳票読取りに利用可能で、帳票記入枠と記入文字の分離が容易な帳票及び帳票処理方法、帳票処理プログラム、そのプログラムを記録した記録媒体及び帳票処理装置を提供することを目的とする。また、本発明は、記入文字の認識率を高めることを目的とする。さらに、本発明は、装置側でなく帳票側に帳票それ自身の処理方法を記述し、読取装置として汎用機を用いることができ、及び、多種の帳票を一台で処理することを目的とする。
【０００６】
また、本発明は、重畳する情報量を高めるためのドット形状を有する帳票及び帳票処理方法を提供することを目的とする。本発明は、手書きとの重なりや汚損から重畳情報を安定に読み取るためのコーディング方式及びデコーディング方式を提供することを目的とする。また、本発明は、特に誤読しやすい英数字記号などの補助記入方式を提供することを目的とする。さらに、本発明は、エディタで簡易に作成し、プリンタで印刷できる帳票を提供することも目的のひとつである。また、本発明は、記入枠の行間の画像をスキャンする手間を省略し、処理時間を短くすることも目的とする。
【０００７】
【課題を解決するための手段】
本発明は、記入枠に微細なドットで構成されたドットテクスチャを用いることでモノクロ環境下でも手書きとの分離を容易にし、また、そこに手書き記入の属性や読取り後の指示情報を重畳することで完全に帳票側が処理を主導するアクティブ帳票において、次のことを提供する。
記入枠内に情報を重畳するためのコード形状として、ドットテクスチャにより安定かつ大量に情報を重畳できるように、ドット形状を提案する。
【０００８】
また、コード図形の認識方法として、四方向のヒストグラムによる認識方法及び部分的な走査による認識方法を提案する。四方向のヒストグラムによる認識方法では、コードの画像に対し四方向のヒストグラムをとることによって、コードの形状特徴を判断しコードを認識する。部分的な走査による認識方法では、部分的にコード画像を走査することにより、コード図形中の線分があるべき位置ごとに順番に走査し、線分があれば対応の情報ビットを１とし、線分がなければ０とする。
【０００９】
また、記入枠の安定な検出方法として、ドットテクスチャによる記入枠の検出を容易にするために、各ドットに膨張処理をしてから縦および横方向にヒストグラムを取ることで、安定に記入枠を検出する。
帳票内の記入枠の間には空白があるので、空白の処理時間を省いて高速化する。記入枠と手書きの分離をしなくても手書きを含んだままで記入枠の行検出は行えるので（行内の位置検出には手書きが邪魔になるが、行の検出には邪魔にならない）、手書きを分離する前に行検出を行い、一行ずつの画像だけに対してラベリングをすれば、行間でのラベリング処理が省ける。
【００１０】
また、記入枠の境目を検出するときにノイズの影響を未然に防ぐ方法を提案する。縦あるいは横方向へのヒストグラムにより記入枠の位置を検出するとき、ノイズの妨げを避けるために、ヒストグラムの結果に対してある閾値以下の値を０と見なして、ノイズのある場所でも空白として無視する。こうすることにより、ノイズの影響を受けず、記入枠の位置を検出できる。しかし、閾値以下の値を０とみなしているため、記入枠の境のところで閾値以下の情報も無視されてしまい、記入枠の中に重畳した情報を読み取るには悪影響を与える。記入枠の位置を検出する時に、このような情報損失を防止するために、ヒストグラムで閾値以下のところで記入枠の位置を検出してから、さらに、この位置から記入枠の外側の方向にヒストグラムの値を辿って行き、ヒストグラムの値が０より大きい値から０になった位置、あるいは、始めの位置ｆから辿った長さがある閾値を超えてもヒストグラムの値が０にならない場合には始めの位置ｆから外側に閾値分だけずらした位置を記入枠の開始あるいは終了の位置とする。
【００１１】
また、ドットテクスチャの文書が多少傾いてスキャンされてもコード図形を切り出す方法として、ドットテクスチャにおけるドットの行数はドットの列数にくらべて少なく、縦方向へのヒストグラムの重なりは小さいことを利用して、ドットテクスチャの画像に対して縦方向のヒストグラムにとり、情報バーの左から右へ列ごとに切り出す。次に、列ごとの画像に対して横方向のヒストグラムをとり、コードごとの図形を切り出す。
【００１２】
手書きと重なって消された情報レコードを検出して読み飛ばす方法として、情報バーの画像において、左から右へ列ごとのコード図形を切り出していき、前の列と次の列の間隔が想定される幅より大きいときにコード図形と手書きが重なって手書きの分離処理によりコード図形が欠落したと判断して、この後のコード図形に対して、区切り記号が見つかるまで無視する。そうではない場合、列のコードを切り出して認識し処理を行う。
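Errors in which a minute speck of noise attaches to a code figure and joins it to a neighbor, or in which a speck of noise is itself processed as a code figure, are also prevented. Specifically, a cut-out figure that is smaller than the expected code figure is ignored as noise, and a figure that is too large is split at the expected size before recognition.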
【００１３】
また、情報バーの位置検出誤差への対応方法を提案する。記入枠の情報バーとそれ以外の部分との境界線は正しい情報バーの境界線から誤差を含んで検出される場合がある。この検出された情報バー境界線に従い、コード図形を切り出してデコーディングを行うと、情報バーからはみ出したコード図形の一部を除外して認識することになり、誤認識を招く危険がある。このことを避けるために、以下の（１）から（４）の処理を実行する。（１）検出した情報バー境界線に従い、情報バーの画像に対して縦方向のヒストグラムにより、左から右へコードの列ごとの位置を検出する。（２）列ごとのコード画像に対して、横方向のヒストグラムを走査し、コードごとを切り出す。記入枠の上側情報バーの切出し方向は上から下へ、下側情報バーの切出し方向は下から上へである。この段階から以後、境界線ｂの情報を利用しない。（３）一つのコードを切り出した後、この列において切り出したコード数と走査した長さＬを調べる。情報バーのコード行数をＮとする。もし、この列のコード数がすでにＮに達しているか、あるいは、走査した長さＬがＮに応じた長さまで達していれば、コードの切出しを終了する。そうでなければ、続いてヒストグラムを走査し、コードの切出しを行う。ここで、走査した長さで終了を検出する目的は、コード図形が手書きと重なったりして欠落した場合に、情報バーの領域を越えてコードを探しに行かないようにするためである。（４）列ごとに対して、コードの切り出しが終了した時点で、そのコード数をチェックする。もしＮに達してなければ、コード数のエラーと見なし、この後のコード図形に対して、区切り記号が見つかるまで無視する。
【００１４】
また、メールアドレスの認識率を高める方法を提案する。メールアドレスに使われる文字は情報量の少ない英数字記号であり文字認識率が低く、また、意味のない文字列のために文脈処理が使えない。メールアドレス誤認識において、主な原因は類似文字への誤認識である。もし、この誤認識を避けることができれば、認識率を高めることができる。そこで、メールアドレスの文字種を限定するために、アクティブ帳票の中でメールアドレス記入枠の近くにもう一つの記入枠を設け、メールアドレスに対して対応するマスの文字種を簡単な記号で記入する。認識のしやすさと人の記入しやすさを考慮し、枠内の上や下、または両方に横線を枠に重なって記入できるようにする。
帳票は、複数の微小なドット（黒領域）又は線分を二次元に配置してなるドットテクスチャで文字入力枠が形成された文字入力用の帳票であって、２次元に配列されたドットの集まりで構成され、複数個のドットを含む幅のある線によるドットテクスチャで描かれた見出し文字と、２次元に配列されたドット又は線分の集まりで構成され、複数個のドット又は線分を含む幅のある線によるドットテクスチャで画成された記入枠とを有する。
【００１５】
処理部が、文字が入力された前記帳票から、各ドットが手書きの黒領域より小さく設定されることから、ドットテクスチャを構成する各々のドットが収縮処理により文字画像より先に消えること、又は、定められた閾値より小さい連結成分を除去すること、のいずれかによって、ドットテクスチャによる記入枠画像を除去して、記入された文字の文字画像を検出するための前記帳票を対象とする。
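In this scheme, entry frames made of a dot texture composed of minute dots allow handwriting to be separated easily in a monochrome environment even when it overlaps the entry frame. Embedding the attribute and character type of the handwriting in the entry frame raises the recognition rate for the extracted handwritten characters. Associating the content of the handwritten characters with the attribute and heading information removes the need for per-form configuration. Furthermore, by embedding the method for processing the handwriting after reading, a system can be realized in which simply scanning a form causes the form itself to direct the processing appropriate to it. Embedding in the form the program that has until now been required on the processing system side makes the form active. This is the origin of the name "active form".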
【００１６】
本発明の第１の解決手段によると、
線分の傾斜又は複数の線分の組み合わせで情報が表現されるコードで構成される記入枠データと、記入された手書きデータとを含む帳票画像データを入力する帳票入力ステップと、
前記帳票入力ステップで入力された帳票画像データに基づき、記入枠データと手書きデータを分離し、及び、各記入枠データと、コードにより所定の情報が重畳された当該記入枠データ内の情報バーとを検出する記入枠検出ステップと、
検出された情報バーの各コードを切り出すステップと、
切り出されたコードの、縦方向と、横方向と、右斜め方向と、左斜め方向との四方向のヒストグラムを求めるステップと、
求められた四方向のヒストグラムに基づき、コードの形状特徴を判断し、コードを認識するコード認識ステップと
を含む帳票処理方法、これら各処理をコンピュータに実行させるための帳票処理プログラム及びそのプログラムを記録したコンピュータ読み取り可能な記録媒体が提供される。
【００１７】
本発明の第２の解決手段によると、
ドットの幅又は線分の傾斜又は複数の線分の組み合わせで情報が表現されるコードで構成される記入枠データと、記入された手書きデータとを含む帳票画像データを入力する帳票入力ステップと、
前記帳票入力ステップで入力された帳票画像データに基づき、記入枠データと手書きデータを分離し、及び、各記入枠データと、コードにより所定の情報が重畳された当該記入枠データ内の情報バーとを検出する記入枠検出ステップと、
前記記入枠検出ステップで検出された情報バーの各コードを認識し重畳情報を抽出する重畳情報認識ステップと、
前記記入枠検出ステップで分離された手書きデータに対して、抽出された重畳情報が示す手書きデータの属性及び／又は文字種類に従い文字認識を行う文字認識ステップと、
前記文字認識ステップで認識された文字情報と、前記重畳情報認識ステップで抽出された重畳情報に基づき、他のアプリケーションシステムに所定の処理をさせるための命令書データを作成する命令書作成ステップと、
前記命令書作成ステップで作成された命令書データを出力する、及び／又は、前記帳票入力ステップで入力された帳票画像データと、当該命令書データとを対応させて記憶するステップと
を含む帳票処理方法、これら各処理をコンピュータに実行させるための帳票処理プログラム及びそのプログラムを記録したコンピュータ読み取り可能な記録媒体が提供される。
【００１８】
また、本発明の第３の解決手段によると、
線分の傾斜又は複数の線分の組み合わせで情報が表現されるコードで構成される記入枠データと、記入された手書きデータとを含む帳票画像データを入力する帳票入力部と、
入力された帳票画像データの記入枠を検出し、及び、記入枠内のコードを認識する処理部と
を備え、
前記処理部は、
前記帳票入力部から帳票画像データを入力する手段と、
前記帳票入力部から入力された帳票画像データに基づき、記入枠データと手書きデータを分離し、及び、各記入枠データと、コードにより所定の情報が重畳された当該記入枠データ内の情報バーとを検出する記入枠検出手段と、
検出された情報バーの各コードを切り出す手段と、
切り出されたコードの、縦方向と、横方向と、右斜め方向と、左斜め方向との四方向のヒストグラムを求める手段と、
求められた四方向のヒストグラムに基づき、コードの形状特徴を判断してコードを認識するコード認識手段と
を有する帳票処理装置が提供される。
【００１９】
本発明の第４の解決手段によると、
ドットの幅又は線分の傾斜又は複数の線分の組み合わせで情報が表現されるコードで構成される記入枠データと、記入された手書きデータとを含む帳票画像データを入力する帳票入力部と、
帳票画像データ及び／又は帳票画像データに基づき作成される命令書データが記憶される記憶部と、
入力された帳票画像データに記入された手書きの情報と、記入枠に重畳された情報を認識し、所定の処理を実行する処理部と、
を備え、
前記処理部は、
前記帳票入力部から帳票画像データを入力する手段と、
入力された帳票画像データに基づき、記入枠データと手書きデータを分離し、及び、各記入枠データと、コードにより所定の情報が重畳された当該記入枠データ内の情報バーとを検出する記入枠検出手段と、
前記記入枠検出手段で検出された情報バーの各コードを認識し重畳情報を抽出する重畳情報認識手段と、
前記記入枠検出手段で分離された手書きデータに対して、抽出された重畳情報が示す手書きデータの属性及び／又は文字種類に従い文字認識を行う文字認識手段と、
前記文字認識手段で認識された文字情報と、前記重畳情報認識手段で抽出された重畳情報に基づき、他のアプリケーションシステムに所定の処理をさせるための命令書データを作成する命令書作成手段と、
前記命令書作成手段で作成された命令書データを出力する、及び／又は、前記入力する手段で入力された帳票画像データと、当該命令書データとを対応させて前記記憶部に記憶する手段と
を有する帳票処理装置が提供される。
【００２０】
また、本発明の解決手段のひとつでは、前記記入枠検出ステップ及び記入枠検出手段は、帳票画像データに対する横方向及び縦方向のヒストグラム又は線分検出並びに／若しくは交点検出に基づき、各記入枠データと、コードにより所定の情報が重畳された当該記入枠データ内の情報バーとを検出する。
【００２１】
さらに、本発明の第５の解決手段によると、
ドットの幅又は線分の傾斜又は複数の線分の組み合わせで情報が表現されるコードにより、所定の情報が重畳された情報レコードが複数の行に渡り配置される情報バーを有し、該コードにより構成される記入枠と、
前記コードより構成され、帳票又は前記情報バーに関する情報を含む水平バーと
を備える帳票が提供される。
【００２２】
【発明の実施の形態】
１．アクティブ帳票とその処理システムの概略
１．１アクティブ帳票
図１は、アクティブ帳票の構成図である。
アクティブ帳票とは、記入枠に微細なドットで構成されたドットテクスチャを用いることで、モノクロ環境下でも記入枠と手書き文字の分離を可能にし、さらにドットを一様でないように分布させることで、記入文字の属性や読取り後の指示情報を記入枠に埋め込むことができる技術に基づいた帳票のことである。
帳票は、水平バー１０と、情報バー３０を有する記入枠２０とを備える。帳票標題の下などに印刷する横線は水平バー１０と呼ぶ。記入枠２０と水平バー１０はドットテクスチャで印刷される。記入枠のドットテクスチャには、記入される手書きの属性（例えば、人名、住所、メールアドレス、自由筆記など）、文字種（例えば、数字、漢字、英文など）、見出しやその処理方法（例えば、手書き文字を認識する、記載内容をメールで送信する、あるいはデータベースに保存するなど）を指示する情報（重畳情報）を記入枠上下側に重畳する。この情報が重畳される部分を情報バー３０という。また、帳票の全体に対する情報を水平バー１０に重畳する。また、記入枠の中で文字を一文字ごとに区切って記入してもらうための分割をマス４０と呼ぶ。
【００２３】
ドットテクスチャによる記入枠では、ドットを水平（行）方向および垂直（列）方向に並べてテクスチャを形成する。記入枠におけるドット行数は、水平バー１０も含めて、すべて同一とすることにより、水平バー１０のドット行数を求めれば、すべての記入枠のドット行数を容易に知ることができる。また、本実施の形態では、ドット以外に線分の傾斜又は線分の組み合わせで情報が表現されるコード図形に重畳情報を重畳することができる。この線分は、ドットを用いて構成してもよい。なお、以下の説明において、コード図形を単にドットと呼ぶ事もある。
【００２４】
１．２アクティブ帳票処理システム
図２は、アクティブ帳票処理システムのハード構成図である。
アクティブ帳票処理システムは、例えば、処理部１と、帳票入力部２と、出力部３と、記憶部５とを備える。また、アクティブ帳票処理システムは、適宜の入力部、表示部４を備えても良い。アクティブ帳票処理システムは、例えば、読み取ったアクティブ帳票の画像データから記載内容や重畳情報を抽出し、命令書に出力する。なお、処理部の処理の詳細については後述する。記憶部５は、帳票画像データ、帳票画像データに基づき作成される命令書データが記憶される。また、記憶部５には、例えば、コードが示す値と情報が対応したテーブルが予め記憶されている。なお、記憶部５は、例えば標準パタンの特徴値と、その標準パタンが示す情報が対応したテーブルを含むことができる。
【００２５】
２．アクティブ帳票の詳細
２．１情報重畳のためのコード形状
アクティブ帳票では、記入枠ごとにその中に記入される手書きの属性、文字種、見出し、処理方法を指示する情報をそのドットテクスチャの中に重畳する。ドットテクスチャに情報を埋め込む方法の一つとしてドットの形状を変化させる方法が考えられる。そこで、情報を表すいくつかのコード形状について述べる。
【００２６】
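Fig. 3 shows the structure of a code that uses two figures of different size (the two-size code). Two dots of different size represent binary 1 and 0, and information is superposed using them. Both codes have the same height h, and the width of the large dot is L. To leave gaps between codes and place them evenly on the form, each code is assigned a region called its placement area. All placement areas have the same size, and each code is placed at the center of its area. Tiling the placement areas vertically and horizontally like a grid, with no gaps between them, yields a dot texture in which the codes themselves are separated by gaps. The other code types described later are likewise placed at the centers of their placement areas and tiled to generate the dot texture of an entry frame.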
【００２７】
図４は、８方向線分を用いるコード（８方向線分コード）の構成図である。８方向の線分により、０から７までの図４に示す８種類のコードを表すことができる。図４では、各コードの下に示した「０００」、「００１」などの番号は、これらのコードに対応する２進数の番号である。なお、各コードには、図に示す以外にも適宜の番号を対応させることができる。また、８方向以外にも、所定の角度傾斜した線分を用いた多方向とすることもできる。コードを描画するための同じ大きさの正方形領域を描画エリアと呼ぶ。
【００２８】
図５に、描画エリアと配置エリアの関係を示す。描画エリアは配置エリアよりも小さく、配置エリアの中心に位置する。上述と同様に、配置エリアは、碁盤の目のように縦横に隙間を空けずに配置することで、各コードは隙間を確保して配置される。
図６は、２^１２種類の図形を用いるコード（２^１２種類コード）の構成図である。２^１２種類コードは、図６に示すように、１２種類の線分の組合せで構成される。大きさａ×ａの描画エリアを基準とし、これら１２種類（１２本）の線分の有無により２^１２種類、すなわち、４０９６種類のコードを作る。
【００２９】
図７は、２^８種類の図形を用いるコード（２^８種類コード）の構成図である。２^８種類コードは、図７に示すように、８種類の線分の組合せで構成される。大きさａ×ａの描画エリアを基準とし、これら８種類（８本）の線分の有無により２^８種類、すなわち、２５６種類のコードを作る。なお、２^８種類コード、２^１２種類コード以外にも、適宜の線分の組み合わせにより２^ｎ種類コードを作成することができる。
【００３０】
２．２重畳情報のコーディング
ドットテクスチャの微小なドット形状を保証するため、ＰｏｓｔＳｃｒｉｐｔ言語で帳票を作り、ＰｏｓｔＳｃｒｉｐｔプリンタに帳票画像データを送り、帳票を印刷する。大小２種類コードと８方向線分コードのそれぞれのコーディングについて述べる。その他のコードは、基本的に後者のコーディングを同様に利用できる。
【００３１】
２．２．１大小２種類コードにおけるコーディング
図８は、重畳する情報レコードを示す図である。ドットテクスチャの記入枠ごとに図８に示す情報を重畳する。これを情報レコードと呼ぶ。例えば、情報レコードを８＋８＋８＋１６×４＋８＝９６ビットの固定長で表す。情報レコードは、例えば、記入枠の上側に重畳することができる。
図９は、属性とコードが示す値の対応を示すテーブルである。属性とは、その記入枠の中に記入される手書きの属性で、記入枠の見出し文字をカテゴリごとに分類したものである。属性には、帳票の種類、用途に応じてさまざまな種類がある。図９は、実際に使用されている帳票３６種類から見出し文字を抽出した属性の例である。図９に示すように、例えば、１４種類の属性に分類しコーディングすることができる。ただし、属性は適宜のものを用いることができ、これらには限定されない。
【００３２】
図１０は、文字種類及び命令と、コードが示す値との対応を示すテーブルである。文字種類とは、その記入枠の中に記入された手書き文字の字種のことである。命令とは、その記入枠の中に記入された手書きに対する読取り後の処理方法のことである。文字種と命令は８ビットのフラグで表現することができる。例えば、図１０に示すように、文字種類及び命令のそれぞれに対応したビット番号を立てる（例えば、１にする）。なお、文字種類及び命令は、これ以外にも適宜のものを用いることができる。例えば、「認識せよ」との命令は、「枠あり認識せよ」、「枠なし認識せよ」などとしてもよい。見出し文字には、例えば、その記入枠見出しの始り４文字のＳｈｉｆｔ−ＪＩＳコード（合計６４ビット）を埋め込む。終端記号は、８ビットの０を埋め込む。なお、アクティブ帳票処理システムの記憶部５には、例えば、図９及び図１０に示すようなテーブルが予め記憶されている。
【００３３】
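Fig. 11 shows how an information record is coded into dot rows.
Two-size codes are arranged horizontally (in rows), and the information above is superposed, for example, on the upper side of the entry frame. Because superposed information may be lost where handwriting overlaps the entry frame, the same superposed information is repeated as many times as fits within one dot row, and the remainder is filled with dots representing, for example, 0. From the second row onward, dots representing, for example, 0 are inserted at the head so that each row is shifted by two dots, and the row above is repeated. In other words, as many 96-bit information records as fit are concatenated and the rest is filled with dots representing 0. Lost information can be restored by taking a majority vote over these copies. The per-row shift may be any suitable number of dots other than two. The coding method of Fig. 11 can also be applied to codes other than the two-size code, such as the eight-direction line segment code.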
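The repetition, row shift, and majority vote described above can be sketched as follows. This is a minimal illustration, not the implementation of the specification: the function names `encode_rows` and `decode_rows` are hypothetical, the shift is assumed to accumulate by two dots per row, and a short 8-bit record stands in for the 96-bit record to keep the example small.

```python
from collections import Counter

def encode_rows(record, row_len, n_rows, shift=2):
    """Repeat `record` along each row; row r starts with shift*r zero dots (assumed)."""
    rows = []
    for r in range(n_rows):
        row = [0] * (shift * r)                      # leading 0 dots that shift the row
        copies = (row_len - len(row)) // len(record)
        row += record * copies                       # as many full copies as fit
        row += [0] * (row_len - len(row))            # pad the remainder with 0 dots
        rows.append(row)
    return rows

def decode_rows(rows, record_len, shift=2):
    """Undo the shift, collect every full copy, and take a bitwise majority vote."""
    copies = []
    for r, row in enumerate(rows):
        body = row[shift * r:]
        for i in range(0, len(body) - record_len + 1, record_len):
            copies.append(body[i:i + record_len])
    return [Counter(bits).most_common(1)[0][0] for bits in zip(*copies)]
```

For example, encoding a record into four rows of 30 dots, flipping a few dots to mimic loss under handwriting, and decoding again returns the original record as long as each bit position is still correct in most copies.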
【００３４】
２．２．２８方向線分におけるコーディング
図１２は、重畳する情報レコードを示す図である。８方向線分コードは、大小２種類コードの情報の重畳方法を発展して、もっと多くの情報を重畳でき、かつエラーの検出などの効率的な重畳方法を使用することができる。記入枠ごとに図１２に示す情報レコードを重畳する。
属性は、大小２種類コードの重畳と同様とすることができる。なお、８方向線分コードの１符号の情報量が３ビットであるため、表現の都合から、例えば、図９に示した大小２種類コードの属性の末尾（又は先頭）に０をつけ、９ビットとする。文字種と命令は、９ビットのフラグで表現することができる。表現方法は、上述の図１０と同様とすることができる。また、ビット番号９の文字種類として、メールアドレスの認識率を高めるために、そこに用いられる英数字、「＠」、「．」などの字種を示す情報とすることもできる。見出し文字は、全角１文字から１０文字までのＳｈｉｆｔ−ＪＩＳコードで表す。メールアドレスの認識率向上については後述する。図１２の情報レコード（４５〜２０７ビットの不定長）を３ビットごとに区切り、それを表現するコードに変換する。各コードを含む情報レコードを記入枠の上側と下側の情報バーに重畳する。
【００３５】
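Fig. 13 shows an example of superposition using the eight-direction line segment code. Taking a four-row dot texture as an example, four codes are placed from top to bottom in each column, and coding proceeds from left to right. In the example of Fig. 13, line segments in directions 1, 2, 3, 4, 5, and 6 are placed. A delimiter symbol is attached at the start and end of each information record, and a parity code is placed before the end delimiter; second and third copies of the information record then follow, and this is repeated as far as space allows. The second and subsequent information records are identical to the first. The parity code is first used to check whether an error has crept in, and a majority vote over the multiple information records then restores lost information.
【００３６】
Fig. 14 illustrates how the parity code is set. A parity code is added so that the exclusive OR of the bits at each corresponding position between delimiters becomes 0. For example, for data "000", "001", and "010", a code representing "011" is added as the parity code so that the exclusive OR of the corresponding bits of all symbols is 0.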
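The parity rule of Fig. 14 amounts to a bitwise exclusive OR over the 3-bit symbols of a record. The following is a small sketch of that rule; the function names are illustrative and do not appear in the specification, and each symbol is assumed to be held as an integer from 0 to 7.

```python
from functools import reduce
from operator import xor

def parity_symbol(symbols):
    """Return the symbol that makes the XOR of all symbols in the record equal 0."""
    return reduce(xor, symbols, 0)

def record_is_consistent(symbols_with_parity):
    """A record (data symbols followed by its parity symbol) passes if the XOR is 0."""
    return reduce(xor, symbols_with_parity, 0) == 0
```

With the data of the example, `parity_symbol([0b000, 0b001, 0b010])` gives `0b011`, matching the parity code stated in the text.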
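Fig. 15 shows the figure that represents the delimiter symbol. The delimiter marks the boundary of each information record and, for example, is given number 8, continuing the numbering of the eight-direction line segment codes of Fig. 4. The delimiter may take any suitable shape other than that of Fig. 15 as long as it can be distinguished as a delimiter.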
【００３７】
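2.3 Code figure recognition methods
Any of the following methods can be used to recognize a code.
2.3.1 Recognition by four-direction histograms
Fig. 16 illustrates recognition by four-direction histograms. Taking histograms of the code image (data) in four directions (for example, vertical, horizontal, right diagonal, and left diagonal) makes it possible to determine the shape features of the code and recognize it. As Fig. 16 shows, the four-direction histograms differ clearly from code to code.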
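As a rough sketch of what the four projections look like in practice, the code below computes them for a binary image given as a list of rows of 0/1 values. The names `four_direction_histograms` and `extent`, and the choice of which diagonal is called which, are assumptions made for illustration; the actual decision rules of the specification are those of Fig. 48 and are not reproduced here.

```python
def four_direction_histograms(img):
    """Project black pixels (1s) onto vertical, horizontal and two diagonal axes."""
    h, w = len(img), len(img[0])
    cols = [sum(img[y][x] for y in range(h)) for x in range(w)]  # vertical projection
    rows = [sum(row) for row in img]                             # horizontal projection
    diag = [0] * (h + w - 1)   # bins share x - y: constant along "\" lines
    anti = [0] * (h + w - 1)   # bins share x + y: constant along "/" lines
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                diag[x - y + h - 1] += 1
                anti[x + y] += 1
    return cols, rows, diag, anti

def extent(hist):
    """Width of the histogram: span between the first and last non-empty bin."""
    nz = [i for i, v in enumerate(hist) if v]
    return nz[-1] - nz[0] + 1 if nz else 0
```

A diagonal stroke, for instance, collapses into a single bin in one diagonal projection while spreading widely in the other, which is the kind of difference between codes that the histograms expose.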
【００３８】
２．３．２マッチングによる認識方法
全ての種類のコードに対して、大量の画像サンプルを作り、各サンプルの特徴値を抽出する。コード種類ごとに平均特徴値を求め、各コードの標準パタンを作ることができる。標準パタンサンプルの特徴値の抽出と同じ方法で入力コードの特徴値を抽出し、各コードの標準パタンとマッチングし、一番特徴の近い標準パタンを求め、この標準パタンを認識の結果とする。
図１７に、標準パタンサンプルと入力コードの特徴値の抽出の説明図を示す。特徴値の抽出方法としては、文字記号認識において一般的に用いられている方法を用いることができる。まず、線密度を計算し、線密度が均等になるように非線形正規化をし、文字画像のエッジのギザギザを消すための線の連結性を保つ平滑化を行う（図１７の（１））。次に、正規化されたパタンに対して、横、縦と斜めの４つ方向の成分を抽出する（図１７の（２））。それぞれを８×８の区画に分割し、各区画に含まれる方向成分の量を特徴として、それを６４個並べ、さらに４つの方向成分を並べて、６４×４＝２５６次元の特徴ベクトルを得る（図１７の（３））。
【００３９】
図１８に、標準パタンとマッチングすることによるコードの認識の説明図を示す。例えば、図１８の右側に示すような標準パタンの特徴値と、その標準パタンが示す情報が予め記憶部に記憶され、図１８の左側に示す読み取られたコードの特徴値を、標準パタンの特徴値と比較し、コードを認識することができる。
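2.3.3 Recognition by partial scanning
Fig. 19 illustrates recognition by partial scanning. A code figure is recognized by scanning only parts of the code image. Each position where a line segment of the code figure may exist is scanned in turn; if a segment is present the corresponding information bit is set to 1, and if not, to 0. The figure in Fig. 19 is an example of the 2^8-type code. White portions are where no segment exists and black portions are where a segment exists. In Fig. 19 the background is gray to make the absent segments visible, but on an actual form the background and the absent segments can be the same color. Each line making up the code corresponds to one digit of a binary number. Provided the drawing area or placement area of the code can be cut out, the scan can be made at predetermined positions within that area. Scanning in order yields, in the example of Fig. 19, the binary number "10110110", which is 182 in decimal. This is the value of the code.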
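The bit assembly described above can be sketched as follows. The probe layout is hypothetical: each expected segment is represented by a short list of pixel coordinates inside the drawing area, and a segment is assumed to be present when more than half of its probe pixels are black. Neither the function name nor this threshold comes from the specification.

```python
def decode_partial_scan(img, probes):
    """Scan only the probe pixels of each expected segment, most significant bit first.

    img    : binary image as a list of rows of 0/1 values
    probes : one list of (row, col) pixel coordinates per expected segment
    """
    value = 0
    for pixels in probes:
        black = sum(img[y][x] for y, x in pixels)
        bit = 1 if 2 * black > len(pixels) else 0   # present if most probe pixels are black
        value = (value << 1) | bit
    return value
```

If the eight segments are found to be present, absent, present, present, absent, present, present, absent in that order, the function returns 182, the value given for the example of Fig. 19.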
【００４０】
２．４各コード形状についての検討
上述の各コードの大きさ及び各コードの認識実験結果について説明する。
【００４１】
２．４．１実験環境条件
各コードの作成及び認識に用いた処理装置は、Ｐｅｎｔｉｕｍ（登録商標）４の２．２６ＧＨｚのＣＰＵと、１．００ＧＢのメモリを備え、Ｗｉｎｄｏｗｓ（登録商標）ＸＰをＯＳとして搭載した機種である。
プリント精度として、安価なレーザビームプリンタでも印刷できる１２００ｄｐｉを採用した。上述のように、アクティブ帳票は、ＰｏｓｔＳｃｒｉｐｔ言語で帳票コードを作成し、ＰｏｓｔＳｃｒｉｐｔプリンタに帳票コードを送ることにより印刷することができる。プリント精度は高ければ高いほどよい。しかし、あまり高い精度を前提にすると装置コストが高くなる。実験の結果、１２００ｄｐｉの精度でプリントできる丸いドットの最小直径は０．１ｍｍである。
【００４２】
スキャンの精度は、６００ｄｐｉを採用した。帳票を読み取るスキャン精度は、高ければ高いほどいい。しかし、精度が高いと、読み込んだ帳票のサイズが大きくなり、帳票処理の時間が大きくなる。例えば、６００ｄｐｉの精度でスキャンした場合、Ａ４帳票のサイズは４９２４×６８８３の画素数となり、このサイズでの大小２種類コードの普通帳票に対しては、コンピュータの処理時間が２秒弱となる。
また、ドットテクスチャの小さいドットがスキャンにより連結しないようにするため、ドット同士の間に適切な間隔をあける必要がある。プリント時の印刷誤差及び帳票が傾いた状態でのスキャンや二値化等による誤差を含め、実験により６００ｄｐｉのスキャンでは０．２ｍｍのドット間の最小間隔があることが望ましい。このサイズは、プリント精度１２００ｄｐｉで印刷可能である。また、この間隔は、帳票の見た目を損なわない。
【００４３】
２．４．２各コードの検討
図２０は、大小２種類コード及び８方向線分コードにより作られたドットテクスチャの記入枠である。図２０（ａ）に、大小２種類コードにより、作られたドットテクスチャの記入枠を示す。
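Since the smallest dot that can be printed at 1200 dpi is 0.1 mm, the code height h in Fig. 3 can be set to 0.1 mm. To distinguish large dots from small ones while allowing for errors from skewed scanning, binarization, and the like, the length L of the large dot in Fig. 3 can be set to 0.25 mm. As described for scan resolution, a gap of 0.2 mm between dots is required, so the rectangular placement area of Fig. 3 can be 0.45 mm wide by 0.3 mm high. Fig. 20(a) is an example of an entry frame whose dot texture was made with codes of this size. In this example, information is superposed on the upper side of the entry frame.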
このコードは２種類だけなので、縦方向のヒストグラムをとってコードの幅を判別することによって簡易に識別することができる。上述の大きさの２種類コードの帳票１枚を認識処理してみた結果、修正インタフェースを呼び出すまでの処理時間は１７８２ｍｓである。また、識別誤りはなかった。
【００４４】
図２０（ｂ）に、８方向線分コードにより作られたドットテクスチャの記入枠を示す。８方向線分コードの線を、１２００ｄｐｉでプリントできる最小サイズである０．１ｍｍの太さとする。６００ｄｐｉのスキャンで０．１ｍｍサイズのピクセル数は３画素である。図４に示す「１００」を示すコードと「１０１」を示すコードを区別するためには、「１０１」のコードの幅（横方向）には４ピクセルが必要である。スキャンする前のサイズに換算すると０．１６８ｍｍである。誤差などを考慮し、「１０１」のコードの幅を０．２ｍｍとすることができる。この場合、描画エリアの正方形のサイズは０．４ｍｍになる。ドット間の間隔を０．２ｍｍに保証すると、配置エリアのサイズは０．６×０．６ｍｍになる。図２０（ｂ）は、この大きさのコードにより作られた記入枠の例である。この例では、図２０（ａ）と同様に、コードを横方向に各行２コードずつずらして記入枠の上側に情報を重畳している。
【００４５】
四方向のヒストグラムをとることにより、簡易に８方向線分コードを識別することができる。比較するため、マッチングによる認識方法と四方向のヒストグラムによる認識方法についてコード認識の処理を試した。マッチングによる方法の標準パタンの辞書は、８種類のコードごとに２０個ずつ合計１６０個の画像をＰｏｓｔＳｃｒｉｐｔ言語で作り、１２００ｄｐｉプリントと６００ｄｐｉスキャンにより得た画像をサンプルとし、特徴をとることにより作成した。
図２１に、四方向ヒストグラムとマッチングによる認識方法を試した結果を示す。認識率は、辞書の作成用に使った全てのサンプル画像に対しての認識率で、両方とも１００％に達している。しかし、処理時間は、ヒストグラムによる認識方法の方が早い。なお、１枚の帳票に対する処理時間は、認識した文字を修正するための修正インタフェースを呼び出すまでの処理時間を示している。
【００４６】
図２２に、２^１２種類コードにより作られたドットテクスチャの記入枠を示す。
このコード形状はかなり複雑になり、コード画像が小さいと組合せの線が接触してしまい、人でも識別できなくなる場合がある。組合せの線が接触しないで人が識別できるまでのサイズのものを作り、マッチングによるデコーディング方法で識別し、９８％以上の認識率を保証できるサイズを見出した。このサイズの描画エリアは０．９×０．９ｍｍである。最小符号間隔の０．２ｍｍを考え、配置エリアは１．１×１．１ｍｍである。なお、このサイズは、プリント精度、スキャナ精度を高精度にすれば、さらに小さくすることも可能である。図２２は、この大きさのコードにより作られた記入枠の例である。この例では、記入枠の上側に情報を重畳している。
【００４７】
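When this code shape is small and its lines are densely packed, the histogram-based recognition method can no longer distinguish its features. Histogram-based identification would require codes larger than the size given above. Because enlarging the codes would affect the appearance of the form, the histogram method was abandoned, and matching-based recognition and partial-scanning recognition were tried instead. The standard-pattern dictionary for matching was built by generating 17 images for each of the 4,096 code types, 69,632 images in total, in the PostScript language, taking the images obtained by printing at 1200 dpi and scanning at 600 dpi as samples, and extracting their features.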
【００４８】
図２３に、コードの認識結果を示す。認識率は、辞書の作成用に使った全てのサンプルに対しての認識率である。また、１枚の帳票に対する処理時間は、修正インタフェースを呼び出すまでの処理時間である。処理時間と認識率ともに、部分的な走査による認識方法の方がよい。
【００４９】
図２４に、２^８種類コードにより作られたドットテクスチャの記入枠を示す。
２^１２種類コードと同じで、このコードも小さいと組合せの線が接触してしまい、人でも識別できなくなる。組合せの線が接触しないで人が識別できるまでのサイズを作り、マッチングによる認識方法で識別し、９８％以上の認識率を保証できるサイズを見出した。このサイズの描画エリアは０．７８×０．７８ｍｍである。最小符号間隔の０．２ｍｍを考え、配置エリアは０．９８×０．９８ｍｍである。なお、このサイズは、プリント精度、スキャナ精度を高精度にすれば、さらに小さくすることも可能である。図２４は、この大きさのコード図形により作られた記入枠の例である。この例では、記入枠の上側に情報を重畳している。
【００５０】
このコードもヒストグラムによる認識方法で特徴を識別するには、上述のサイズよりも大きくする必要がある。そこで、マッチングによる認識方法と部分的な走査による認識方法を試した。マッチングの標準パタンの辞書は、２５６種類の符号ごとに７個ずつ合計１７９２個の画像をＰｏｓｔＳｃｒｉｐｔ言語で作り、１２００ｄｐｉプリントと６００ｄｐｉスキャンにより得た画像をサンプルとし、特徴をとることにより作成した。
図２５に、認識実験結果を示す。１枚の帳票に対する処理時間は修正インタフェースを呼び出すまでの処理時間である。認識率は辞書の作成用に使った全てのサンプルに対しての認識率で、両方とも１００％に達した。しかし、処理時間は部分的な走査による認識方法の方が早い。
【００５１】
２．４．３コード形状についての比較
図２６は、各種のコード形状の認識結果の比較図である。１ビットあたりのデコード時間は、前に述べた各種コード形状に対する最速な認識方法によるデコーダエンジンに、記入枠の情報バーを渡してからデコードの結果が求まるまでの時間を、情報バーの中に埋め込んだ情報量で割った時間である。２^１２種類と２^８種類のコードは、現実的な大きさではコードの切出しと認識において、他のコードに比べて困難である。また、この二方式のコード形状は大きくなると、人が形を識別できてコードらしくない点も他のコードとの違いである。さらに、後の節で述べる記入枠と手書きの分離方法のうち最高速であるラベリングの方法により、この二方式のコードで作った帳票を処理すると、小さい文字（句読点など）と分離できなくなり、文字と重なる場合に文字の認識に悪影響を与える。他の方法で記入枠と手書きを分離することも可能であるが、分離の処理時間が長くなる。一方、８方向線分コードは大小２種類コードより情報量とデコード速度ともに良い。また、大小２種類コードは簡単でシステムを作りやすい。
【００５２】
３．アクティブ帳票処理のメインフローチャート
図２７に、アクティブ帳票処理システムの流れ図を示す。以下に、図２７の各処理について説明する。アクティブ帳票処理システムは、処理を開始すると、まずアクティブ帳票画像（データ）を読み込む（Ｓ１０１）。次に、アクティブ帳票処理システムは、記入枠と手書きを分離し、記入枠と例えば情報バーの位置などの記入枠構造とを検知する（Ｓ１０３）。アクティブ帳票処理システムは、情報バーの位置に従い、記入枠ごとに重畳された情報を読み取る（Ｓ１０５）。なお、ステップＳ１０３、Ｓ１０５の処理の詳細については後述する。また、アクティブ帳票処理システムは、検出した情報バーの位置、記入枠構造などのデータを適宜のタイミングで記憶部に記憶する。
【００５３】
アクティブ帳票処理システムは、重畳情報の属性により記入された手書きが文字の場合、記入枠の手書き文字の画像に対して、手書き文字の属性と文字種に従いオフライン文字認識を行う（Ｓ１０７）。アクティブ帳票処理システムは、オフライン認識の結果を修正インタフェースを通して修正する（Ｓ１０９）。修正インタフェースは、認識した文字を修正するための処理を実行する。アクティブ帳票処理システムは、アクティブ帳票から抽出した記載内容や重畳情報を、例えば、ＣＳＶファイルの形で出力する（Ｓ１１１）。これを命令書と呼ぶ。また、帳票画像を命令書とを対応させて記憶部に保存する。アクティブ帳票処理システムは、アクティブ帳票処理を終了する。その後、他のアプリケーションシステムは、命令書に従いアクティブ帳票から抽出した情報に対して処理を行う。
【００５４】
４．記入枠と手書きの分離及び記入枠と記入枠構造の検知の処理
４．１処理の流れ
図２８に、記入枠と手書きの分離、記入枠と記入枠構造の検知についてのフローチャートを示す。図２８に示すフローチャートは、図２７に示すフローチャートにおけるステップＳ１０３の詳細処理である。記入枠と手書きの分離をしなくても手書きを含んだままで記入枠の行検出は行えるので（行内の位置検出には手書きが邪魔になるが、行の検出には邪魔にならない）、手書きを分離する前に行検出を行い、一行ずつの画像だけに対してラベリングをすれば（アルゴリズム２とする）、行間でのラベリング処理を省くことができる。処理手順は次の通りである。
【００５５】
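First, the active form processing system takes a horizontal histogram of the whole form image (data) and detects the image of each row (S201). Next, it applies labeling to each row image and separates the entry frame from the handwriting (S203). Besides labeling, morphology or a median filter can also be used to separate the entry frame from the handwriting. Labeling and the other techniques are described later.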
さらに、アクティブ帳票処理システムは、行ごとの記入枠だけの画像に対し縦方向へのヒストグラムで記入枠の位置とマス位置を検出する（Ｓ２０５）。アクティブ帳票処理システムは、記入枠ごとの画像に対し横方向へのヒストグラムで記入枠の情報バーの位置を検出する（Ｓ２０７）。なお、アクティブ帳票処理システムは、行及び記入枠の検出の際に、予め定められた閾値以下の値を０と見なして、ノイズのある場所でも空白として無視してもよい。この場合、後述する記入枠検出時の情報損失防止処理を実行するようにしてもよい。
【００５６】
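4.2 Methods for separating entry frames and handwriting
4.2.1 Labeling
Labeling assigns the same label to every black pixel in a connected component of a binary image. Labeling can be implemented either by calling a recursive function or by scanning back and forth.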
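The sketch below shows one way labeling can separate frame dots from handwriting: connected components no larger than a size threshold are treated as dots of the entry frame, and larger components as handwriting. It is an illustration under stated assumptions rather than the method of the specification: it uses an explicit stack instead of a recursive function (to avoid recursion limits), 8-connectivity, and a hypothetical parameter `max_dot_pixels`.

```python
def split_by_component_size(img, max_dot_pixels):
    """Label 8-connected black components and split them into frame dots and handwriting."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    frame = [[0] * w for _ in range(h)]
    writing = [[0] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if not img[sy][sx] or seen[sy][sx]:
                continue
            stack, comp = [(sy, sx)], []
            seen[sy][sx] = True
            while stack:                              # flood fill one component
                y, x = stack.pop()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            target = frame if len(comp) <= max_dot_pixels else writing
            for y, x in comp:
                target[y][x] = 1
    return frame, writing
```

A dot that touches a handwritten stroke becomes part of the larger component and is therefore assigned to the handwriting image, which corresponds to frame dots being lost where they overlap handwriting.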
【００５７】
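4.2.2 Morphology
Erosion and dilation of the form image can each be performed in either of two ways.
For erosion: (a) scan the image and, on encountering a white pixel, also turn its 8-neighbor or 4-neighbor pixels white; or (b) scan the image and, on encountering a black pixel, examine its 8-neighbor or 4-neighbor pixels and turn the black pixel white if even one of them is white.
For dilation: (a) scan the image and, on encountering a black pixel, also turn its 8-neighbor or 4-neighbor pixels black; or (b) scan the image and, on encountering a white pixel, examine its 8-neighbor or 4-neighbor pixels and turn the white pixel black if even one of them is black.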
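Variant (b) of each operation can be sketched as below for a binary image held as a list of rows of 0/1 values (1 = black). The helper names are illustrative, and one choice is an assumption of this sketch rather than something the text specifies: neighbors that fall outside the image are simply ignored.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def _neighbors(img, y, x, offsets):
    """Values of the in-bounds neighbors of pixel (y, x)."""
    h, w = len(img), len(img[0])
    return [img[y + dy][x + dx] for dy, dx in offsets
            if 0 <= y + dy < h and 0 <= x + dx < w]

def erode(img, offsets=N8):
    """A black pixel stays black only if none of its neighbors is white."""
    return [[1 if v and all(_neighbors(img, y, x, offsets)) else 0
             for x, v in enumerate(row)] for y, row in enumerate(img)]

def dilate(img, offsets=N8):
    """A white pixel turns black if at least one of its neighbors is black."""
    return [[1 if v or any(_neighbors(img, y, x, offsets)) else 0
             for x, v in enumerate(row)] for y, row in enumerate(img)]
```

Because both functions read from the input image and write to a new one, the result does not depend on scan order. An isolated dot vanishes after one erosion, whereas a thicker stroke only becomes thinner, which is why erosion removes the dot texture before it removes handwriting.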
【００５８】
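4.2.3 Median filter
The median filter is a technique often used for image smoothing. It replaces the value at the center of a region with the median of the density values in that region; for a 3×3 region, the nine density values are sorted in ascending (or descending) order and the fifth (middle) value becomes the new density value of the center.
Fig. 29 illustrates the median filter. The original density value of the target pixel at the center is 22, and a 3×3 median filter is used. Sorting the nine density values inside the filter in ascending order gives 22, 22, 23, 24, 24, 24, 25, 26, 27. The median is the fifth value, 24, so the new density value of the target pixel is 24. In a binary image, for a connected component of black pixels whose area (number of black pixels) is less than half the size of the median filter, the median is always the white pixel value no matter which black pixel of the component the filter is applied to. The component is therefore erased.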
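A minimal sketch of the filter follows, using `statistics.median` from the Python standard library. The function name `median_filter` is illustrative, and leaving border pixels unchanged is an assumption of this sketch; the text does not say how borders are handled.

```python
from statistics import median

def median_filter(img, size=3):
    """Replace each interior pixel by the median of its size x size window.

    Border pixels, whose window would leave the image, are left unchanged (assumed).
    """
    r = size // 2
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(r, h - r):
        for x in range(r, w - r):
            window = [img[y + dy][x + dx]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            out[y][x] = median(window)
    return out
```

The nine values of the Fig. 29 example, taken in any order, have median 24. In a binary image, a component of four black pixels can never fill more than four of the nine window positions, so every pixel of it is replaced by white.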
【００５９】
４．２．４記入枠と手書きの分離
記入枠と手書きの分離方法では、ラベリングによる分離方法、収縮と膨張による分離方法、これらを組み合わせた分離方法、及び、メディアンフィルタによる分離方法が考えられる。
大小２種類コードによる帳票と８方向線分コードによる帳票に対し、記入枠と手書きを分離してみた。いくつかの記入枠と手書きの分離方法を比較するため、これらのサンプル帳票は普通の太さのペンで記入してもらっている。分離した結果では、収縮と膨張による分離方法で記入枠の画像に多量の手書きの跡が残った。きれいに記入枠と手書きを分離できるためには、もっと太めのペンで記入する必要がある。しかし、こうすると収縮と膨張による分離時間が長くなる。また、筆記具を制限することになる。
分離した結果によると、再帰関数を呼び出すラベリングによる分離方法が、一番高速で、かつ、手書きの太さと記入枠ドットの大きさに制限がかからない。したがって、記入枠と手書きの分離方法には再帰関数を呼び出すラベリングの方法が有効である。
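Fig. 30 shows the separated entry frame and handwritten characters for a case in which the handwriting overlaps the entry frame. It can be confirmed that the overlap does not affect the separation of the handwriting.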
【００６０】
４．３記入枠と記入枠構造の検出
記入枠を検出するためには、線分検出、交点検出による方法（後述する）、以下で述べるような縦および横方向へのヒストグラムを取る方法などがある。アクティブ帳票方式は、記入枠の検出方法に依存せずに有効である。ただし、以下では、ヒストグラムによる方法により、更なる技術開発を行ったので、ヒストグラムによる記入枠検出について述べる。
【００６１】
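4.3.1 Detection by histogram
Fig. 31 illustrates the detection of entry frames and entry frame structure by histogram. Executing the steps shown in Fig. 28 yields histograms such as those shown in Fig. 31.
First, as shown in Fig. 31(a), a horizontal histogram is taken of the entire form image, and the entry frames are detected one row at a time from it (corresponding to S201). Next, as shown in Fig. 31(b), a vertical histogram is taken of each detected row, and the positions of the entry frames and their cells are detected from it (corresponding to S205). Then, as shown in Fig. 31(c), a horizontal histogram is taken of each detected entry frame, and the position of its information bar is detected from it (corresponding to S207).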
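The first of these steps, finding rows from a horizontal histogram, can be sketched as follows. The names `runs_above` and `detect_rows` and the `threshold` parameter are illustrative; the same run-finding helper applies to a vertical histogram for locating frames and cells within a row.

```python
def runs_above(hist, threshold=0):
    """Return (start, end) index pairs of consecutive bins whose value exceeds threshold."""
    runs, start = [], None
    for i, v in enumerate(hist):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(hist) - 1))
    return runs

def detect_rows(img, threshold=0):
    """Horizontal histogram: count black pixels per scan line, then find the runs."""
    return runs_above([sum(row) for row in img], threshold)
```

Raising `threshold` suppresses scan lines that contain only a few black pixels, such as noise, at the cost of also trimming faint lines at the edge of a frame.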
【００６２】
４．３．２記入枠構造の検知方法についての検討
図３２〜図３４は、記入枠構造の検知方法についての説明図である。
図３２に示すように、記入枠の高さと幅が十分に大きいとき、縦方向と横方向へのヒストグラムにより、記入枠の構造がはっきり区別できる。ヒストグラムの最大値と最小値の中間値を求め、この中間値を閾値とし、記入枠のマス位置や情報バーの位置を検出することができる。しかし、図３３に示すように、記入枠の幅や高さが小さい場合、中間値を安定に求めることができない。なお、図３２及び図３３は、見やすさのため記入枠の大きさを拡大縮小して表示しているため、記入枠の大きさの違いが一目ではわかりづらいが、配置されたコードの数から記入枠の大きさの違いがわかる。さらに、図３４に示すような手書きとの重なりにより消された記入枠は、そのままの状態で処理すると誤った結果になることがある。よって、記入枠の構造を識別する時、まず、図３４に示すように記入枠のドットを膨張する。記入枠のドットが完全に接触するまで膨張するのは時間がかかるため、ドットのサイズに従いある程度膨張すればよい。膨張した記入枠に対して、ヒストグラムの中間値を求め、簡易に記入枠の構造を識別することができる。大小２種類コードによる記入枠は、ドットのサイズが小さいため、膨張処理を行わなくても構造を認識できる。８方向線分コードによる記入枠は、ドットのサイズと形状によりドットテクスチャがまばらになる。ヒストグラムにより記入枠構造を認識するには、膨張処理を行うことが好ましい。
【００６３】
４．３．３空白を考慮した処理時間短縮
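Fig. 35 shows the results of applying, to six forms using the two-size code and the eight-direction line segment code, a method that processes the form without regard to blank space (Algorithm 1) and a method that takes blank space into account (Algorithm 2). Because there is blank space between the entry frames of a form, the processing shown in the flowchart above skips the labeling time for that blank space and thereby shortens the form processing time.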
アルゴリズム２の処理は、図２８に示す本実施の形態に係る処理と同様である。アルゴリズム１の処理手順は以下の通りである。まず、全帳票画像に対しラベリングし記入枠と手書きを分離する。次に、記入枠だけの画像に対し横方向へのヒストグラムで行を検出する。さらに、行ごとの記入枠画像に対して縦方向へのヒストグラムで記入枠の位置とマス位置を検出する。行ごとの記入枠画像に対して横方向へのヒストグラムで記入枠の情報バー位置を検出する。アルゴリズム１では、全帳票に対してラベリングしているのに対し、アルゴリズム２では、検出された行に対してのみラベリングする点が異なる。記入枠の密度によらず、アルゴリズム２、すなわち行間の空白を考慮した方法のほうが高速である。なお、２つのアルゴリズムによる処理結果の精度は同じである。
【００６４】
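4.4 Preventing information loss during entry frame detection
Fig. 36 illustrates the processing that prevents information loss during entry frame detection.
When the position of an entry frame is detected from a horizontal or vertical histogram, histogram values at or below a predetermined threshold are treated as 0 so that noise does not interfere; places containing noise are thus ignored as blank. In Fig. 36 there is a great deal of noise above the outside of the entry frame, but thresholding allows the entry frame position f to be detected without being affected by it. However, because values at or below the threshold are treated as 0, information at or below the threshold at the edge of the entry frame is also ignored, which harms the reading of the information superposed in the frame. To prevent this loss, after the entry frame position f has been detected by thresholding, the histogram values are traced from f toward the outside of the frame (upward in the example of Fig. 36). The start or end position of the entry frame is taken to be the position where the histogram value changes from a value greater than 0 to 0 or, if the histogram value has not reached 0 even after the traced length exceeds a certain threshold, the position shifted outward from f by that threshold. This makes it possible both to detect the start or end of the entry frame correctly and to avoid noise outside the frame. Fig. 36 is an example of row detection (scanning up and down), but the same can be done for column detection (scanning left and right).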
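The outward trace can be sketched for the start (upper or left) edge as follows. The function name `refine_start` and the parameter `max_steps` (the length threshold) are hypothetical, and the sketch assumes that "the position where the value changes to 0" means the last non-zero bin before the zero.

```python
def refine_start(hist, f, max_steps):
    """Trace the histogram outward (toward index 0) from the thresholded position f.

    Returns the outermost non-zero bin reached before a zero bin, or f - max_steps
    if no zero bin is met within max_steps steps (e.g. because of noise).
    """
    pos = f
    for _ in range(max_steps):
        if pos == 0 or hist[pos - 1] == 0:
            return pos
        pos -= 1
    return pos
```

With `hist = [0, 0, 1, 2, 9, 9, 9]` and a thresholded start at `f = 4`, the trace recovers the two faint lines and returns 2. When noise keeps the histogram above 0, the position is moved outward by at most `max_steps`.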
【００６５】
５．デコーディング
５．１処理の流れ
次に、図２７のステップＳ１０５の処理（デコーディング）について説明する。
図３７に、大小２種類コードのデコーディングの流れ図を示す。図３７に示すフローチャートは、情報レコードが情報バーの横方向に配置された帳票に対するデコーディングである。
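First, the active form processing system takes horizontal and vertical histograms of the information bar of the entry frame detected in step S103 and cuts out the dots of each row (S301). Next, it measures the length of each cut-out dot from its vertical histogram and determines whether the dot represents 1 or 0; for example, a long dot is taken as 1 and a short dot as 0. It then undoes the per-row shift of a predetermined number of bits, for example two bits (S305). It obtains a large number of information records from the information bar of the entry frame (S307); for example, it can obtain as many information records as possible from the information bar, or it may obtain a predetermined number of information records. It then obtains the superposed information record by taking a majority vote over the values of the information records (S309). The majority vote allows errors to be recovered. Although the two-size code has been described above, the same can be done for forms having other codes arranged horizontally.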
【００６６】
次に、８方向線分コードのデコーディングについて説明する。８方向線分コードの切出しにおいてはエラー防止と検出を行い、効率的なデコーディング方法を行う。なお、８方向線分コードの大きさは、上述のサイズとして説明する。また、実験により、描画領域が０．３５×０．３５ｍｍ、０．４５×０．４５ｍｍ、０．５×０．５ｍｍ、０．６×０．６ｍｍのサイズにおいても、デコーディングが可能であることが確認済みである。コードの大きさは大きければ大きいほど認識率が高いが、小さければ小さいほど帳票の見た目がよい。８方向線分コードのデコーディングでは、ドット同士間の必要な間隔０．２ｍｍをあけ、好きなコードサイズで帳票を作れるようにしたため、自動的にコードの大きさを検出しこの大きさに合わせたデコーディングを行う。また、８方向線分コードは区切り記号による不定長の情報レコードと情報レコードごとにパリティ符号を使うため、情報レコード長のエラーと情報レコードの内容のエラーを検出することができる。以下に、これらの処理の詳細について述べる。
【００６７】
図３８に、８方向線分コードのデコーディングの流れ図を示す。図３８に示すフローチャートは、コードとパリティ符号と区切り記号とが情報バーの縦方向に配置された帳票に対するデコーディングである。
まず、アクティブ帳票処理システムは、情報レコードをセットする（３７−０１）。例えば、ステップＳ１０３で検出された情報バー又はその一部を記憶部から読み出す。次に、アクティブ帳票処理システムは、情報レコードの先頭の区切り記号を求める。具体的には、以下のステップ３７−０２〜３７−０６の処理を実行する。まず、アクティブ帳票処理システムは、画像の終端か判断する（３７−０２）。例えば、アクティブ帳票処理システムは、次に切り出す対象となる画像（データ）が存在するか判断する。アクティブ帳票処理システムは、画像の終端の場合（３７−０２）、ステップ３７−１７の処理へ移る。一方、アクティブ帳票処理システムは、画像の終端でない場合（３７−０２）、アクティブ帳票処理システムは、コードの図形を切り出す（３７−０３）。例えば、アクティブ帳票処理システムは、縦方向及び横方向のヒストグラムの区切れを判断し、その幅に基づきコードの図形を切り出す。アクティブ帳票処理システムは、切り出しエラーか判断する（３７−０４）。切り出しエラーとしては、例えば、手書きの重なりによりコードが消されるなどの理由によりコードが切り出せないことや、切り出したコードが所定の位置になく、他のコードが消されていることが考えられる。アクティブ帳票処理システムは、切り出しエラーの場合（３７−０４）、ステップ３７−１４の処理へ移る。一方、切り出しエラーではない場合（３７−０４）、切り出したコードの図形を認識する（３７−０５）。アクティブ帳票処理システムは、認識したコードが先頭区切り記号か判断する（３７−０６）。アクティブ帳票処理システムは、認識したコードが、先頭区切り記号の場合ステップ３７−０７の処理へ移り、先頭区切り記号ではない場合はエラーとみなし、ステップ３７−１４の処理へ移る。
【００６８】
ステップ３７−１４〜３７−１６の処理では、次の終端区切り記号を見つける。まず、アクティブ帳票処理システムは、次のコードの図形を切り出す（３７−１４）。アクティブ帳票処理システムは、切り出したコードを認識する（３７−１５）。アクティブ帳票処理システムは、認識したコードが終端区切り記号か判断する（３７−１６）。なお、先頭区切り記号と終端区切り記号は、同じ記号を用いることもできる。アクティブ帳票処理システムは、認識したコードが終端区切り記号ではない場合、ステップ３７−１４に戻る。アクティブ帳票処理システムは、区切り記号を見つけるまで、ステップ３７−１４〜３７−１６を繰り返すことになる。一方、終端区切り記号の場合、ステップ３７−０１の処理へ移る。
一方、アクティブ帳票処理システムは、ステップ３７−０６で認識したコードが先頭区切り記号の場合、情報レコードのデータを求める。具体的には、以下のステップ３７−０７〜３７−１３の処理を実行する。
【００６９】
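First, the active form processing system determines whether the end of the image has been reached (37-07). If so (37-07), it proceeds to step 37-17. If not (37-07), it cuts out the next code figure (37-08). It then determines whether a cut-out error has occurred (37-09). If so (37-09), it proceeds to step 37-14. If not (37-09), it recognizes the cut-out code figure (37-10). It determines whether the recognized code is the end delimiter (37-11). If the recognized code is not the end delimiter (37-11), it stores the recognition result in the storage unit in association with the information record that was set (37-12) and returns to step 37-07. By repeating the processing from 37-07 onward, it recognizes codes until the end delimiter is found and thereby obtains the data of the information record. If the recognized code is the end delimiter (37-11), it finishes processing that information record (37-13), returns to step 37-01, and processes the next information record.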
【００７０】
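In step 37-17, the active form processing system finishes obtaining the information superposed in the information records and proceeds to step 37-18. It checks the length of each information record and deletes records whose length is outside the possible range (for example, 45 to 207 bits in the example of Fig. 12) (37-18). It determines the information record length by taking a majority vote over the lengths of the information records (37-19). It checks the length of each information record and deletes records of incorrect length (37-20). It checks the parity code of each information record and deletes incorrect records (37-21). It obtains the superposed information record of the entry frame by taking a majority vote over the values of the remaining information records (37-22). The active form processing system stores the obtained information record in the storage unit as appropriate and ends the processing.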
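Steps 37-18 to 37-22 can be sketched as a filter-then-vote pipeline. In this illustration each decoded record is assumed to be a list of 3-bit symbol values with the delimiters removed and the parity symbol last; the function name `recover_record` and the length limits `min_len` and `max_len` (expressed in symbols) are hypothetical.

```python
from collections import Counter
from functools import reduce
from operator import xor

def recover_record(records, min_len, max_len):
    """Filter decoded records by length and parity, then vote symbol by symbol."""
    ok = [r for r in records if min_len <= len(r) <= max_len]   # 37-18: plausible length
    if not ok:
        return None
    length = Counter(len(r) for r in ok).most_common(1)[0][0]   # 37-19: majority length
    ok = [r for r in ok if len(r) == length]                    # 37-20: drop wrong lengths
    ok = [r for r in ok if reduce(xor, r, 0) == 0]              # 37-21: parity check
    if not ok:
        return None
    return [Counter(col).most_common(1)[0][0] for col in zip(*ok)]  # 37-22: majority vote
```

The function returns `None` when no record survives the checks, which a caller could treat as an unreadable information bar.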
【００７１】
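5.2 Detecting the code size and the number of code rows in the information bar
Fig. 39 illustrates the calculation of the drawing area size. The horizontal bar in Fig. 39 contains, as an example, four rows of codes. To cut out codes, the horizontal bar is used to detect the code size and the number of code rows in the information bar.
First, a vertical histogram is taken of the horizontal bar, and the center position of each column of code figures is detected as shown in Fig. 39. Next, the distances between the centers of adjacent columns are obtained and averaged. The result is the side length of the square placement area of a code figure. Subtracting the dot gap of 0.2 mm (4 pixels in a 600 dpi scan) from the side length of the placement area gives the side length of the square drawing area.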
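The calculation above reduces to averaging the spacing of the column centers and subtracting the gap. A small sketch, with an illustrative function name and the 4-pixel gap taken from the text as the default:

```python
def drawing_area_side(centers, gap_px=4):
    """Estimate the drawing-area side length (pixels) from column center positions.

    centers : x positions of the code column centers found in the horizontal bar
    gap_px  : gap between drawing areas (4 pixels for 0.2 mm at 600 dpi, per the text)
    """
    pitches = [b - a for a, b in zip(centers, centers[1:])]
    placement_side = sum(pitches) / len(pitches)   # mean center-to-center distance
    return placement_side - gap_px
```

For column centers at 10, 24, 38, and 52 pixels the mean pitch is 14 pixels, so the drawing-area side is estimated as 10 pixels.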
【００７２】
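Forms in the present embodiment are set up so that the number of code rows in the horizontal bar equals the number of code rows in the information bar of each entry frame; detecting the number of code rows in the horizontal bar therefore gives the number of code rows in the information bars. If the form is skewed, it is difficult to detect the number of code rows over the whole horizontal bar. Accordingly, using the vertical histogram taken of the horizontal bar when the code size was detected, a horizontal histogram is taken of only a few columns of code figures, and the number of code rows in the information bar is detected from it. Instead of detecting the number of code rows in the horizontal bar, information indicating the number of code rows may be superposed on the codes making up the horizontal bar and obtained by recognizing those codes.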
【００７３】
５．３コード図形の切出し
次に、コード図形の切り出し（例えば、図３８のステップ３７−０３、３７−０８、３７−１４）について説明する。
【００７４】
５．３．１ヒストグラムによるコード図形の切出し方法の必要性
図４０は、コードの大きさによるコード切り出しの例である。
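Once the code size has been detected, one could cut out codes by treating the code size as a fixed length. However, because of errors of some kind (for example, the accumulation of minute errors), code figures sometimes cannot be cut out correctly this way. As Fig. 40 shows, when the codes of each column are cut out with the code size BC as a fixed length and vertical lines are drawn to simulate the result, the first few columns are cut out correctly but the error grows in later columns until they may no longer be cut out correctly. In the active form processing system of the present embodiment, code figures are therefore cut out using histograms rather than a fixed code size, which achieves accurate code cut-out.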
【００７５】
５．３．２コード図形の切出し方向
まず、情報バーの画像に対して横方向のヒストグラムにとり、行ごとを切り出す。次に、行ごとの画像に対して縦方向のヒストグラムをとり、コードごとの図形を切り出す。傾きのない情報バーの場合、この切出し方向で容易にコード図形を切り出すことができる。しかし、図４１に示すように、傾いてスキャンされた情報バーの場合、この切出し方向では容易にコード図形を切り出すことができない。しかし、このような場合でも、列数にくらべ行数が少なく縦方向へのヒストグラムの重なりが小さいことに注目し、この性質を利用して情報バーが多少傾いていても正しくコードごとの図形を切り出すことができる。例えば、アクティブ帳票処理システムは、まず、情報バーの画像に対して縦方向のヒストグラムにとり、情報バーの左から右へ列ごとを切り出す。次に、列ごとの画像に対して横方向のヒストグラムをとり、コードごとの図形を切り出す。
【００７６】
５．３．３手書きの重なった情報レコードの検出と処理
次に、切り出しエラー（図３８の３７−０３、３７−０８、３７−１４に対応）について説明する。手書きと記入枠の重なりにより、分離した記入枠では手書きと重なる部分が消される。これを効率よく検出し、エラーのある情報レコードに対して、処理を行わない方法をとっている。図３８の３７−０３と３７−０８のコード図形の切出しにおいて、切り出されたコードに対してこのようなエラーを検出し、図３８の３７−０４と３７−０９から図３８の３７−１４、３７−１５、３７−１６の繰返し処理に移り、次の区切り記号を見つけるまで、コードを読み飛ばす。つまり、エラーのある情報レコードに対して処理を行わない。情報バーの画像において、左から右へ列ごとのコード図形を切り出していき、前の列と次の列の間隔が想定される幅より大きいときにコード図形と手書きが重なって手書きの分離処理においてコード図形が欠落したと判断する。
【００７７】
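Fig. 42 illustrates the detection and handling of information records overlapped by handwriting. Fig. 42 shows the leftmost part of the information bar of an entry frame, with the drawing areas of the codes represented by notional small squares. The dot texture of the information bar is formed by arranging the drawing areas uniformly with the minimum gap of 0.2 mm (4 pixels) between them. The side length BB of the square drawing area is obtained from the horizontal bar by the method described above. The position given by the column center S plus half the drawing area side length, S + BB/2, is taken as the maximum right position R2 of the column, and the actual right position in the column's vertical histogram is taken as R1.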
【００７８】
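When the code figures of each column are cut out, the distance d between the right position R1 of the column and a reference position a is obtained, and the presence of a handwriting overlap error is judged from d. The reference position a is obtained as follows. Columns of code figures are cut out of the information bar image from left to right. The initial value of a is the leftmost position of the information bar. Each time a column is cut out, a is updated to the center position S of the column just cut out plus half the drawing area side length, S + BB/2. In other words, except for its initial value, a always points to the maximum right position R2 of a code column. Because a follows the column centers, it does not matter if the starting position is inaccurate because of noise or the like. The concrete processing for detecting handwriting overlap errors based on a is as follows.
【００７９】
First, each time a new code column is cut out, the active form processing system obtains the distance d between the right position R1 of that column and the reference position a (corresponding, for example, to step 37-03). Next, if d exceeds the drawing area side length BB by 6 pixels or more (the 4-pixel gap between drawing areas plus a 2-pixel margin) (corresponding, for example, to step 37-04), it regards this as a handwriting overlap error (cut-out error) and ignores the subsequent code figures until a delimiter is found (corresponding to steps 37-14 to 37-16). If d does not exceed BB by 6 pixels or more, it cuts out and recognizes the codes of the column (corresponding, for example, to steps 37-05 and 37-10).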
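The gap test can be sketched as below. Columns are assumed to be given as `(left, right)` pixel extents taken from the vertical histogram of the information bar, in left-to-right order; the function name and the representation are illustrative only. The sketch only flags the columns at which a gap is detected and leaves the skipping to the next delimiter to the caller.

```python
def find_overlap_errors(columns, bar_left, bb, margin=6):
    """Flag columns that follow a gap where code columns appear to be missing.

    columns  : (left, right) extents of the detected code columns
    bar_left : leftmost position of the information bar (initial reference a)
    bb       : drawing-area side length BB in pixels
    margin   : 4-pixel gap plus 2-pixel allowance, per the text
    """
    a = bar_left
    flags = []
    for left, right in columns:
        d = right - a                      # distance from reference a to right edge R1
        flags.append(d >= bb + margin)     # too far: a column was lost under handwriting
        center = (left + right) / 2        # column center S
        a = center + bb / 2                # new reference: maximum right position R2
    return flags
```

With BB = 10 and a 14-pixel pitch, columns at 0, 14, and 28 pass the test, while a column at 56 that follows a missing column at 42 is flagged.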
【００８０】
５．３．４コード図形におけるエラー
図４３は、コード図形におけるエラーの例を示す図である。
図４３に示すように、ノイズによってコード図形が接触し、形や大きさが異常になる場合がある。例えば、横方向の連結によるコード図形横サイズエラー、縦方向の連結によるコード図形縦サイズエラー、ノイズによるコード図形縦横サイズエラー、情報バー位置誤差による横方向のコード図形横サイズエラーなどがある。これらの異常に対して、切り出した図形の大きさが想定されるコード図形より小さい場合はノイズとして無視し、大きすぎる場合には、想定される大きさで分離して認識することにより、各コード図形に微小なノイズが付着して連結してしまったり微小なノイズをコード図形として処理してしまったりする誤りを防止する。
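[0081]
When the code figures of each column are cut out, the column width is checked. If the column width is larger than the drawing-area width BB + 4 (pixels), a horizontal size error is judged, and the column is split and cut out in widths of BB + 4 starting from the reference position a. The 8-direction line segment code is composed of line segments; as described above, the line thickness is 0.1 mm, which is about 3 pixels in a 600 dpi scan. Allowing for error, the threshold for the code line thickness is set to, for example, 5 pixels. For a code figure to be recognizable, the drawing-area width BB is always larger than the line thickness. Therefore, the width and the height of a code cannot both be smaller than the 5-pixel line thickness. When the figure of each code is cut out, its width and length are checked; if both are 5 pixels or less, a combined vertical and horizontal size error is judged and the figure is ignored. If the length of a code figure is larger than the drawing-area width BB + 4, a vertical size error is judged, and the figure is split and cut out in lengths of BB + 4 starting from its top.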
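The size checks can be summarized in a small sketch. It folds the column-width check and the per-figure checks into one function for brevity, which is a simplification of the text; the names `check_figure` and `split_positions` are hypothetical, and all sizes are in pixels.

```python
def check_figure(width, height, bb, line_t=5, slack=4):
    """Classify a cut-out figure by size.

    bb: drawing-area side length BB; line_t: line thickness threshold;
    slack: the 4-pixel allowance added to BB in the text.
    """
    if width <= line_t and height <= line_t:
        return "ignore"            # combined size error: treated as noise
    if width > bb + slack:
        return "split_horizontal"  # horizontal size error
    if height > bb + slack:
        return "split_vertical"    # vertical size error
    return "ok"


def split_positions(start, length, bb, slack=4):
    """Split an oversized span into pieces of BB + slack pixels from start."""
    step = bb + slack
    end = start + length
    return [(p, min(p + step, end)) for p in range(start, end, step)]
```

For a horizontal size error `start` would be the reference position a, and for a vertical size error it would be the top of the figure.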
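[0082]
This prevents errors in which minute noise attaches to code figures and connects them, or in which minute noise is processed as a code figure.
5.3.5 Handling of code figure damage caused by information bar position detection error
FIGS. 44 to 46 are explanatory diagrams (1) to (3) of the handling of code figure damage caused by information bar position detection error.
When the position of the information bar of an entry frame is detected, as shown in FIG. 44, the boundary line b between the information bar and the other part, detected for example in step S207 described above, may deviate from the true boundary of the information bar. If code figures are cut out and decoded according to this boundary line b, the part of a code figure that protrudes beyond the information bar is excluded from recognition, which may cause misrecognition. To avoid this, the active form processing system can perform the code cutting-out processing (corresponding to steps 37-03, 37-08, and 37-14 of FIG. 38) and the cutting-out error handling as follows.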
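[0083]
According to the detected information bar boundary line b, the active form processing system detects the position of each code column from left to right using a vertical histogram of the information bar image, for example of the data from the topmost code to the boundary line b. Next, as shown in FIG. 45, it scans the horizontal histogram of the code image of each column and cuts out each code. As shown in FIG. 45, the upper information bar of an entry frame is scanned from top to bottom, whereas the lower information bar is scanned from bottom to top. From this stage onward, the information on the boundary line b is not used.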
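[0084]
Each time the active form processing system cuts out one code in a column, it examines the number of codes cut out in this column so far and the scanned length L. Let N be the number of code rows in the information bar, obtained from the horizontal bar. If the number of codes in this column has already reached N, and/or the scanned length L has reached the length corresponding to N, code cutting-out ends. Otherwise, the system continues scanning the histogram and cutting out codes. The purpose of detecting the end by the scanned length is to avoid searching for codes beyond the information bar region when a code figure has been lost, for example because it overlapped handwriting.
When code cutting-out has finished for a column, the active form processing system checks the number of codes. If it has not reached N, this is regarded as a code count error (the cutting-out error shown in FIG. 38), and the subsequent code figures are ignored until a delimiter is found.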
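[0085]
By not using the boundary line b at the code figure cutting-out stage, code figures that protrude beyond the detected information bar boundary can be recognized correctly. FIG. 46 shows the result of cutting out code figures by this method. In FIG. 46, frames are drawn around the cut-out codes for convenience. The figure shows that, in the information bar in the upper part of the entry frame, the codes are cut out correctly without being affected by the error in the boundary line b.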
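[0086]
5.4 Recognition of the 8-direction line segment code
FIG. 47 is an explanatory diagram of recognition of the 8-direction line segment code.
Next, recognition of code figures (corresponding to 37-05, 37-10, and 37-15 of FIG. 38) is described.
Once the code size has been detected, decoding can be adapted to the code size. The side length of the code drawing-area square is BB. Histograms in four directions are taken of the code figure. As shown in FIG. 47, the widths of the histograms in the four directions are denoted l, x, s, and h. Allowing for errors, the threshold for the code line thickness is set to 5 pixels.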
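[0087]
FIG. 48 shows the code recognition algorithm. FIG. 49 shows an example of a program for recognizing codes.
First, the active form processing system judges whether h ≤ 5 (S501). If yes, it sets code = 0; if no, it judges whether h ≥ BB - 2 (S502). If yes, it proceeds to step S503; if no, it proceeds to step S508. In step S503, it judges whether x ≥ BB - 2. If yes, it proceeds to step S504; if no, it proceeds to step S506. In step S504, it judges whether s ≤ 5. If yes, it sets code = 2; if no, it judges whether l ≤ 5 (S505). If yes, it sets code = 6; if no, it sets code = 8.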
[0088]
In step S506, the active form processing system judges whether x > 5. If no, it sets code = 4; if yes, it judges whether s < BB - 2 (S507). If yes, it sets code = 3; if no, it sets code = 5.
In step S508, the active form processing system judges whether s < BB - 2. If yes, it sets code = 1; if no, it sets code = 7.
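Steps S501 to S508 form a small decision tree, which can be written out as a sketch. This is not the program of FIG. 49, which is not reproduced here; the function name `recognize_code` and the parameter `line_t` (the 5-pixel threshold) are hypothetical.

```python
def recognize_code(l, x, s, h, bb, line_t=5):
    """Decide the code value from the four histogram widths l, x, s, h.

    bb is the drawing-area side length BB; line_t is the line thickness
    threshold. Branches follow steps S501 to S508 as described in the text.
    """
    if h <= line_t:                 # S501
        return 0
    if h >= bb - 2:                 # S502
        if x >= bb - 2:             # S503
            if s <= line_t:         # S504
                return 2
            if l <= line_t:         # S505
                return 6
            return 8
        if not x > line_t:          # S506
            return 4
        if s < bb - 2:              # S507
            return 3
        return 5
    if s < bb - 2:                  # S508
        return 1
    return 7
```

Which line direction each code value denotes depends on FIG. 47, so the sketch only reproduces the branching.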
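Besides the judgments above, other suitable judgments may be made according to the code size and the line thickness. Also, instead of the four-direction histograms, codes can be recognized by the partial scanning or matching methods described above.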
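[0089]
5.5 Detection and handling of information record length errors
Next, the detection and handling of information record length errors (corresponding to 37-18, 37-19, and 37-20 of FIG. 38) is described.
With the 8-direction line segment code, information record length errors can be detected by means of the delimiter. As described in the coding method above, the length of an information record in the 8-direction line segment code is variable, but including the parity symbol the number of code figures is between 16 and 70 (the information record is 45 to 207 bits). If a delimiter is misrecognized when code figures are recognized, the length of an information record may fall outside the range of 16 to 70. In that case, the system skips to the next delimiter and deletes these information records (step 37-18).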
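[0090]
When all the codes in the information bar have been processed and recognized, multiple records are obtained. If the lengths of these records are not all the same, a majority vote is taken over the lengths and the most frequent record length is taken as the correct length (corresponding to step 37-19). The length of each information record is then checked, and any record whose length differs from the length determined by the majority vote is deleted. The parity check described next is performed on the remaining information records.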
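The two length filters can be sketched as follows, assuming each decoded record is simply a list of code values between delimiters; the function name `filter_by_length` is hypothetical.

```python
from collections import Counter


def filter_by_length(records, min_len=16, max_len=70):
    """Drop records outside 16..70 code figures, then keep only records
    whose length equals the most frequent remaining length."""
    in_range = [r for r in records if min_len <= len(r) <= max_len]
    if not in_range:
        return []
    majority_len, _count = Counter(len(r) for r in in_range).most_common(1)[0]
    return [r for r in in_range if len(r) == majority_len]
```

How a tie between two equally frequent lengths should be resolved is not stated in the text; the sketch keeps whichever length `Counter` reports first.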
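[0091]
5.6 Parity code check
A parity check is performed on the information records (corresponding to 37-21 of FIG. 38). For example, for each information record, the system checks whether the number of 1s among the bits at the same bit position is even. If it is, the information record is judged to be correct; if it is not, the record is judged to be incorrect and is deleted.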
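An even-parity check per bit position can be sketched as below. The record layout is an assumption: each code figure is taken to carry a small integer (for example 3 bits, consistent with 45 bits for 15 data figures), and the parity symbol is taken to be part of the record. The actual layout is defined by FIG. 14, which is not reproduced here, and the name `parity_ok` is hypothetical.

```python
from functools import reduce
from operator import xor


def parity_ok(symbols):
    """Return True if, for every bit position, the number of 1s across all
    symbols (including the parity symbol) is even.

    XOR-ing all symbols leaves a 1 exactly in the bit positions whose
    count of 1s is odd, so an all-zero result means even parity everywhere.
    """
    return reduce(xor, symbols, 0) == 0
```

Records for which `parity_ok` returns False would be deleted before the final vote.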
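[0092]
5.7 Obtaining the superimposed information record of an entry frame
For the information records obtained by the method described above, a majority vote is taken bit by bit (corresponding to 37-22 of FIG. 38). In this way, data with a high degree of confidence can be obtained for each entry frame.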
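The bitwise majority vote can be sketched as follows, assuming the surviving records have equal length and are given as lists of 0/1 bits. The function name `majority_record` is hypothetical, and resolving a tie to 0 is an assumption, since the text does not say how ties are handled.

```python
def majority_record(records):
    """Combine equal-length bit lists by taking the majority at each position."""
    n = len(records)
    # zip(*records) groups the bits found at the same position in every record.
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*records)]
```

Because the same record is printed repeatedly along the information bar, a bit damaged in one copy is outvoted by the intact copies.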
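[0093]
6. Method for increasing the recognition rate of e-mail addresses
FIG. 50 shows an entry frame for increasing the recognition rate of e-mail addresses.
The notation of e-mail addresses and the character types they use differ between mail servers, but the commonly used format is the domain address format, whose character types are "@", alphanumeric characters, "." (period), "-" (hyphen), and "_" (underscore). In active form processing, when the information superimposed on the entry frame indicates that the handwritten content is an e-mail address, the recognition rate can be increased by restricting recognition to these character types.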
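[0094]
However, these character types generally have low recognition rates in offline character recognition, and the e-mail address recognition rate depends on them. Even when the character types are restricted, uppercase letters, lowercase letters, and digits interfere with one another and tend to be misrecognized. In addition, an e-mail address is a character string with no linguistic meaning, so context processing cannot be used to raise the recognition rate. This is another reason why the e-mail address recognition rate is hard to raise. For a form processing system, however, recognizing e-mail addresses correctly is extremely important, because an error risks a serious mistake. The main cause of e-mail address misrecognition is confusion between similar characters. If this confusion can be avoided, the recognition rate can be increased.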
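[0095]
As shown in FIG. 50, to restrict the character types of an e-mail address, another entry frame is provided, for example below the e-mail address entry frame in the active form. In this entry frame, the character type of the corresponding cell of the e-mail address above is entered. Considering ease of recognition and ease of entry, symbols such as the following are entered.
(1) ￣ (line at the top): uppercase letters (may be written overlapping the frame)
(2) ＿ (line at the bottom): lowercase letters (may be written overlapping the frame)
(3) ＝ (lines at the top and bottom): digits (may be written overlapping the frame)
(4) Blank: other characters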
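[0096]
These symbols can be recognized easily and reliably by taking histograms in the vertical and horizontal directions, or by matching the three kinds of symbols within the frame. This avoids mutual interference among the character types of the e-mail address and increases the recognition rate. As shown in FIG. 50, the symbols may be written across consecutive cells. This simplifies the instruction, because it can be written per character string rather than per character. For recognition, only the inside of each cell needs to be examined. The information record of the entry frame used for restricting character types can include suitable data indicating, as the character type, that the above symbols for improving e-mail address recognition are entered in it.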
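How a per-character type mark narrows the recognizer's choice can be shown with a minimal sketch. It assumes the character recognizer returns a ranked candidate list for each cell and that the mark has already been decoded into one of four class names; the names `CLASSES` and `pick_candidate` are hypothetical.

```python
import string

# Character classes selected by the mark in the auxiliary entry frame.
CLASSES = {
    "upper": set(string.ascii_uppercase),  # line at the top
    "lower": set(string.ascii_lowercase),  # line at the bottom
    "digit": set(string.digits),           # lines at the top and bottom
    "other": set("@.-_"),                  # blank cell
}


def pick_candidate(candidates, mark):
    """Return the best-ranked candidate that belongs to the marked class,
    or None if no candidate matches."""
    allowed = CLASSES[mark]
    for ch in candidates:
        if ch in allowed:
            return ch
    return None
```

With this restriction, look-alike candidates such as "l", "1", and "I" no longer compete with each other once the cell is marked as a digit.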
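[0097]
7. Prototype of the active form processing system
An active form processing system was prototyped using the techniques described above. An outline follows.
[0098]
7.1 Startup and form reading
When the active form system is started, menus such as [imageScan], [file], [imageSize], and [設定] (Settings), together with an execute button and the like, are displayed. For example, when the [imageScan] menu is operated, the active form processing system reads a form through the scanner. The system can also open an existing form image from the [file] menu, and it can display the form image that has been read. The display size of the image can be changed through the [imageSize] menu. The applications to be called after the form processing system finishes are configured from the [設定] (Settings) menu. The progress of form processing is shown by a progress bar. After an image has been read and displayed, the active form processing system starts processing when the execute button is pressed.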
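[0099]
7.2 Separation of entry frames and handwriting; detection of entry frame positions and entry frame structure
FIG. 51 shows an example of the processing flow of the active form processing system.
After reading the form, the active form processing system detects the entry frames and the positions of the entry frame cells, and separates the entry frames from the handwriting (52-1 in FIG. 51). It obtains, for each entry frame, a frame-only image and a handwriting-only image, and classifies the entry frames into empty entry frames, horizontal bars, filled-in entry frames, and noise (52-2 in FIG. 51). For example, if the number of black pixels in the handwriting-only image of an entry frame is at or below a certain value, the frame is judged to be an empty entry frame. A vertical histogram is taken of the frame-only image, and if the variation in the vertical histogram is small, at or below a certain value, the frame is judged to be a horizontal bar. If the width and height of an entry frame are small, at or below certain values, it is judged to be an error (noise). All other entry frames are treated as filled-in entry frames. Next, the active form processing system takes a horizontal histogram of the frame image of each entry frame and detects the positions of the upper and lower parts of the frame, in which information such as the attribute of the handwritten characters, the character type, the instruction, and the heading is embedded (52-3 in FIG. 51).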
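[0100]
7.3 Reading superimposed information
For each entry frame, the active form processing system passes the position information of the upper and lower parts of the frame, in which information is embedded, to a decode engine that recognizes superimposed information, and decodes the superimposed information in the entry frame (52-4 in FIG. 51). The handwriting attributes embedded in an entry frame include, for example, address, e-mail address, organization name, personal name, job title, amount of money, number, date/time/day of the week, seal, free writing, check mark, and drawing.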
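[0101]
7.4 Separation into characters and offline character recognition
Based on the handwriting attribute contained in the decoded superimposed information, the active form processing system classifies entry frames into check marks, drawings and seals, e-mail addresses, and other handwritten characters (52-5 in FIG. 51). For other handwritten characters, it separates the handwriting into individual characters and performs character recognition according to the handwriting type of each entry frame contained in the superimposed information (52-9, 52-10, 52-12, and 52-13 in FIG. 51). This increases the character recognition rate. For other handwritten characters, the system decides whether to apply context processing according to the attribute of the handwritten characters. Context processing is performed when the attribute is organization name, job title, free writing, or address (52-11 in FIG. 51). It is not performed for personal name, amount of money, number, or date/time/day of the week. For e-mail addresses, the system separates the handwriting into individual characters, detects the type of each character, and performs character recognition according to that type, which increases the character recognition rate (52-6, 52-7, and 52-8 in FIG. 51).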
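[0102]
7.5 Correction of offline handwritten character recognition results
Next, the active form processing system lets the offline recognition results be corrected through a correction interface.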
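[0103]
7.6 Creation of the instruction document, saving of the form image, and termination
When correction of the offline recognition results is finished, the active form processing system outputs the content extracted from the active form, for example as a CSV file. The output file is called the instruction document (instruction document data). The application that is called after the active form processing system finishes processes the form according to the instruction document. The form image is saved in the same folder as the instruction document.
FIG. 52 shows the format of the output file. The instruction document uses commas as separators and represents the information record of each entry frame on one line. The first line gives the content of each column: [name] is the heading, [property] is the attribute, [contents] is the handwritten content, [action] is the instruction, and [character] is the character type. The second and subsequent lines are the information records of the entry frames. If nothing is handwritten in an entry frame, nothing is output in the [contents] column, as in ",,".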
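Writing such an instruction document can be sketched with the standard `csv` module. The column order below follows the order in which the text lists the columns, which is an assumption about FIG. 52; whether the header cells carry brackets is likewise not known, so the sketch writes plain names. The function name `write_instruction` and the sample record are hypothetical.

```python
import csv
import io

# Assumed column order, taken from the order in which the text lists them.
FIELDS = ["name", "property", "contents", "action", "character"]


def write_instruction(records, stream):
    """Write one header line and one line per entry-frame record."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        # A missing field (for example an empty entry frame) becomes an
        # empty cell, which appears as two adjacent commas.
        writer.writerow({f: rec.get(f, "") for f in FIELDS})


buf = io.StringIO()
write_instruction(
    [{"name": "email", "property": "mail address", "action": "send",
      "character": "alnum"}],
    buf,
)
lines = buf.getvalue().splitlines()
```

Here `lines[1]` contains an empty [contents] cell because the sample entry frame was left blank.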
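[0104]
8. Evaluation experiment
8.1 Experimental data and environment
Two A4-size forms using the large/small two-type code and two forms using the 8-direction line segment code were printed at 1200 dpi, and each of 10 people filled in one copy of each form, giving 20 form samples for the large/small two-type code and 20 form samples for the 8-direction line segment code. These forms were scanned at 600 dpi in 8-bit grayscale, and a preliminary evaluation of the active form processing system was carried out. FIG. 53 shows a form using the 8-direction line segment code that was used in the evaluation experiment. The forms using the large/small two-type code are similar.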
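[0105]
As the experimental environment, a PC with a 2.26 GHz Pentium (registered trademark) 4 CPU was used. For the collection of offline frameless handwritten character data, because the data were for experimental use, the collection method for online frameless handwritten character data was used, and the following conditions were set for writing: (1) character size and character spacing are free; (2) characters must not be joined to one another.
For the large/small two-type code forms, to keep the system simple, only framed character recognition was supported, and the per-character type restriction for improving e-mail address recognition was not provided. For the 8-direction line segment code forms, frameless character recognition was also supported, and the per-character type restriction for e-mail addresses was implemented.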
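[0106]
8.2 Evaluation items for offline frameless handwritten character recognition
Frameless recognition is the process of recognizing a handwritten character string pattern as a "character string". When the recognition rate of a "character string" is evaluated, however, the evaluation measure must be made clear. For example, if recognizing a line of about 20 characters produces one case in which only one character was misrecognized and another case in which most of the characters were misrecognized, it is not fair simply to count both as a "recognition failure" of one line. For this reason, the evaluation items used for online frameless handwritten character recognition are adopted, and the following evaluation items are defined.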
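[0107]
8.2.1 Character-level recognition rate
The "character-level recognition rate" given by the following expression is adopted as an evaluation item.
Character-level recognition rate = number of correctly recognized characters / number of characters in the correct answer
FIG. 54 is an explanatory diagram of how the character-level recognition rate is calculated. For example, suppose that the input pattern 「明日は晴れ」 yields the result 「日月昨晴わ」. In this case, only one character, '晴', is recognized correctly. Since the correct answer has 5 characters, the character-level recognition rate is calculated as 1/5 = 0.20 = 20%.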
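[0108]
8.2.2 Correct segmentation rate
An evaluation item concerning the accuracy of character segmentation (called the "correct segmentation rate") is also added. The correct segmentation rate is defined by the following expression.
Correct segmentation rate = 1 - (number of character segmentation failures / number of characters in the correct answer)
In the example of FIG. 54 above, there are two character segmentation failures: (1) what should be '明' was split into '月' and '日'; (2) what should be the two characters '日' and 'は' were merged into one character. The correct segmentation rate is therefore calculated as 1 - 2/5 = 60%.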
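The two evaluation measures can be checked against the FIG. 54 example with a short sketch; the function names are hypothetical, and the counts (1 correct character, 2 segmentation failures, 5 characters) are the ones stated in the text.

```python
def char_recognition_rate(correct_chars, answer_chars):
    """Character-level recognition rate: correct characters / answer length."""
    return correct_chars / answer_chars


def correct_segmentation_rate(segmentation_failures, answer_chars):
    """Correct segmentation rate: 1 - failures / answer length."""
    return 1 - segmentation_failures / answer_chars


# Example of FIG. 54: 5-character answer, 1 correct character, 2 failures.
rate = char_recognition_rate(1, 5)
seg = correct_segmentation_rate(2, 5)
```

Both measures are normalized by the length of the correct answer, so a line with a single wrong character scores far better than a line that is mostly wrong.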
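[0109]
8.3 Experimental results
8.3.1 Large/small two-type code
FIG. 55 shows the experimental results for the large/small two-type code forms. The average processing time is the average time from recognizing a form to calling the correction interface. Among the form samples, miswritten characters (3 characters in the 20 form samples) and characters not among the 4,443 character types of the offline recognition dictionary (1 character in the 20 form samples) are excluded from the evaluation.
By restricting the character types of the recognition categories for the cut-out handwritten characters according to the handwriting type read from the entry frame, the recognition rate improved by 33.98 percentage points, from 59.74% to 93.72%. Because the handwritten character recognition rate includes the e-mail address recognition rate, it was affected by the e-mail address recognition rate. In addition, 2 of the 20 forms were written so untidily that some characters could not be identified even by a person, which lowered the character recognition rate.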
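[0110]
In this experimental environment, the processing time from reading a form to passing it to the correction interface was just under 4 seconds for a complex form and just under 2 seconds for a simple form. Within the scope of this experiment, the entry frame position detection rate reached 100%. The decoding success rate for superimposed information records did not reach 100%, because the large/small two-type code, which carries little information, and a simple code superimposition method were used, and the decode error checks were not provided.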
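[0111]
8.3.2 8-direction line segment code
FIG. 56 shows the experimental results for the 8-direction line segment code. For the 8-direction line segment code forms, frameless handwritten character recognition and the per-character type restriction for e-mail addresses were provided, so the handwritten character recognition rate is evaluated in three categories: framed handwritten characters other than e-mail addresses, e-mail addresses, and frameless handwritten characters. The average processing time is the average time from recognizing a form to calling the correction interface.
Among the form samples, miswritten characters (2 characters in the 20 form samples) are excluded from the evaluation. In addition, in one form, an artifact introduced into the form image by scanning caused an empty entry frame to be recognized as a frame containing characters. Because this was corrected through the correction interface, it is also excluded from the evaluation of the handwritten character recognition rate.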
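[0112]
By restricting the character types of the recognition categories according to the handwriting type read from the entry frame, the recognition rate for framed handwritten characters other than e-mail addresses improved by 12.07 percentage points, from 86.78% to 98.85%. For e-mail addresses, restricting the character type of each character improved the recognition rate by 16.96 percentage points, from 80.19% for recognition using the attribute to 97.15%. The 10th-rank cumulative recognition rate for e-mail addresses is almost the same with or without character type restriction, but the first-rank decision rate drops sharply when the character types are not restricted. This suggests that the drop in the e-mail address recognition rate is caused by mutual interference among similar characters.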
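[0113]
In this experimental environment, the processing time from reading a form to passing it to the correction interface was about 4 seconds for the complex form A and about 3 seconds for form B. Within the scope of this experiment, the entry frame position detection rate reached 100%. The decoding success rate for superimposed records reached 100%, because the 8-direction line segment code, which carries more information, and an efficient code superimposition method were used, and the decode error checks were provided. For frameless handwritten characters, a character-level recognition rate of 97.71% and a correct segmentation rate of 98.47% were obtained.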
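[0114]
9. Line segment detection and intersection detection
Methods that recognize the form structure by detecting lines and intersections, without using histograms, include for example those described in Non-Patent Documents 4 and 5. An outline of line segment detection and intersection detection in this embodiment is given below. Each of the following processes is executed by the active form processing system (processing unit).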
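[0115]
9.1 Representation model of a form
Since most forms contain three kinds of line segments (horizontal, vertical, and oblique), the form structure is represented by these three kinds of line segments. A line segment is represented by the coordinates of its start point and end point. In the form according to this embodiment, the entry frames, horizontal bars, and the like are composed of dots or code figures; line segments are formed by dilating these dots and the like until they connect, and the horizontal, vertical, and oblique line segments are then obtained.
Horizontal line segments are sorted from top to bottom and from left to right according to the coordinates of their leftmost start points. That is, they are first sorted from top to bottom by the Y coordinate of the start point. Then horizontal line segments with the same Y coordinate are sorted from left to right by the X coordinate of the start point. Similarly, vertical line segments are first sorted from left to right by the X coordinate of their topmost start point, and vertical line segments with the same X coordinate are then sorted from top to bottom by the Y coordinate. Oblique line segments are sorted in the same way as horizontal line segments, except that the topmost start point is used instead of the leftmost start point. The sort directions may be chosen as appropriate.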
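The sort orders can be expressed as sort keys. The sketch below assumes each line segment is a pair of (x, y) end points in image coordinates with Y increasing downward; the function names are hypothetical.

```python
def sort_horizontal(segments):
    """Sort by the leftmost end point: top to bottom (Y), then left to right (X)."""
    def key(seg):
        left = min(seg)  # tuples compare by x first, so this is the leftmost point
        return (left[1], left[0])
    return sorted(segments, key=key)


def sort_vertical(segments):
    """Sort by the topmost end point: left to right (X), then top to bottom (Y)."""
    def key(seg):
        top = min(seg, key=lambda p: (p[1], p[0]))
        return (top[0], top[1])
    return sorted(segments, key=key)
```

Sorting the segments this way is what later lets the detected points be visited from top to bottom and from left to right.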
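[0116]
The representation model of a form consists, for example, of the following five parts.
(1) (NH, NV, NS): NH, NV, and NS are the numbers of horizontal, vertical, and oblique line segments, respectively.
(2) (SLH, SLV, SLS): SLH, SLV, and SLS are the lists of horizontal, vertical, and oblique line segments, respectively.
(3) [(X_min, Y_min); (X_max, Y_max)]: the top-left and bottom-right coordinates of the rectangular region containing all the line segments.
(4) [(XD(i)_min, YD(i)_min); (XD(i)_max, YD(i)_max)]: the top-left and bottom-right coordinates of the rectangular region of the i-th entry frame.
(5) [(XN(i)_min, YN(i)_min); (XN(i)_max, YN(i)_max)]: the top-left and bottom-right coordinates of the rectangular region of the i-th name frame.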
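[0117]
9.2 Extraction and adjustment of line segments
Here, the processing up to the extraction of line segments from the form image (data) is described.
First, the form image (data) is input and its skew is corrected. The image is rotated so that a line segment that is close to horizontal and as long as possible becomes exactly horizontal. A vertical line may be used instead. A single line may be used, or several lines may be used and the image rotated so that they become horizontal, or vertical, on average.
The entry frames and the handwriting in the form image of the active form are separated, and an image of the entry frames only is obtained. For the separation of the entry frames and the handwriting, the same methods as described above, such as labeling, can be used. The entry-frame-only image is dilated until the small dots connect. Then the three kinds of line segments (horizontal, vertical, and oblique) are extracted.
The extracted line segments can be corrected by averaging coordinates: for a horizontal line segment whose two end points differ in Y coordinate within a predetermined range, the Y coordinates are made equal, and for a vertical line segment whose two end points differ in X coordinate within a predetermined range, the X coordinates are made equal. In this way, a segment with only a slight coordinate deviation, for example, is treated as a horizontal or vertical line segment rather than as an oblique one. Next, the segments are sorted as described above.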
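[0118]
The sorted line segments can also be corrected, for example, as follows.
(1) Connecting broken line segments: After the line segments have been extracted and sorted, they are divided into three groups: horizontal, vertical, and oblique. Line segments that were broken by scanning or the like are connected. In the group of horizontal line segments, two line segments with almost the same Y coordinate are connected if they are closer than, for example, 0.5 cm. The same applies to vertical and oblique line segments.
(2) Deleting small line segments: If the length of a horizontal line segment is shorter than, for example, 85% of the minimum distance between vertical line segments, the horizontal line segment is regarded as part of a handwriting stroke and deleted. Likewise, if the length of a vertical line segment is shorter than, for example, 85% of the minimum distance between horizontal line segments, the vertical line segment is regarded as part of a handwriting stroke and deleted.
(3) Correcting line segments: If the distance from the end point of a vertical line segment to a horizontal line segment is less than 0.5 cm, the vertical line segment is corrected so that it touches the horizontal line segment. The same applies to horizontal and oblique line segments.
The numerical values used in the above judgments may be chosen as appropriate.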
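Correction (1) for horizontal segments can be sketched as follows. The sketch assumes the Y coordinates have already been equalized, so that pieces of the same line share exactly the same Y value, and that each segment is given as (y, x_start, x_end) in pixels; the names `cm_to_px` and `merge_horizontal` are hypothetical.

```python
def cm_to_px(cm, dpi):
    """Convert a length in centimeters to pixels at the given resolution."""
    return round(cm / 2.54 * dpi)


def merge_horizontal(segments, max_gap):
    """Connect horizontal segments on the same Y whose gap is below max_gap.

    segments: list of (y, x_start, x_end); returns the merged list sorted
    top to bottom, then left to right.
    """
    merged = []
    for y, x0, x1 in sorted(segments):
        if merged:
            prev_y, prev_x0, prev_x1 = merged[-1]
            if y == prev_y and x0 - prev_x1 < max_gap:
                # Extend the previous segment instead of starting a new one.
                merged[-1] = (prev_y, prev_x0, max(prev_x1, x1))
                continue
        merged.append((y, x0, x1))
    return merged
```

At 600 dpi the example threshold of 0.5 cm corresponds to about 118 pixels, which is the value `cm_to_px(0.5, 600)` would supply as `max_gap`.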
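[0119]
9.3 Detection of name frames and entry frames
Next, the processing for detecting name frames and entry frames based on the extracted line segments is described.
An ordinary form contains entry frames, name frames, and mixed frames. An entry frame is a frame in which data is to be entered. A name frame is a frame containing text such as explanations and guidance. A mixed frame serves as both an entry frame and a name frame. In the frame detection algorithm described below, the input data are the sorted horizontal and vertical line segment lists SLH and SLV, and the output data are the rectangular region data of the entry frames and of the name frames and mixed frames (entry frame data).
(Step 1) For the horizontal and vertical line segments, all contact points, intersection points, and terminal points are detected. For each horizontal line, the vertical lines that touch or cross it are detected, and the coordinates of these contact points and intersection points are calculated. If no vertical line touches or crosses a horizontal line at one of its end points, that end point is taken as a terminal point. When a frame is a closed figure such as a rectangle, terminal points basically do not appear, but they may appear when, for example, a line forming the frame is broken because of image errors or noise. There are also forms in which characters are to be written above or below a line rather than inside a frame.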
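[0120]
Distinguishing contact points from intersection points serves no purpose here. Instead, each contact point, intersection point, and terminal point is classified, based on the positional relationship between the horizontal and vertical lines and the like, into four types, Types 1 to 4, as shown for example in FIG. 57. The three contact and intersection points shown as Type 1 indicate, for example, that the point is on the upper side of a frame. A Type 2 point indicates, for example, that the point is on the upper or lower side of a frame. A Type 3 point indicates, for example, that the point is on the lower side of a frame. Type 4 indicates a terminal point. Entry frames are detected according to these types. For example, a frame can be detected by finding two Type 1 points (the two upper vertices of the frame) and two Type 3 points (the two lower vertices of the frame).
All the points are sorted from top to bottom and from left to right, because the line segments have been sorted as described above. First, the pointer used in the following processing is set to the first of the sorted points.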
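[0121]
(Step 2) The point A indicated by the pointer and the point B that follows A are examined, and the following processing is performed to detect and register (store) frames.
(1) If the Y coordinates of A and B differ, jump to step 4.
(2) If both A and B are Type 4, scan upward from the line segment AB until a horizontal line segment of black pixels or a character string is found. If no black pixel is found, the scan reaches the top of the image. In either case, the rectangular region whose base is the line AB is obtained. If the height of this region is less than, for example, 0.4 cm, ignore it and jump to step 4. Otherwise, register (store) it as a frame.
(3) If both A and B are Type 1 or Type 2, search the sorted list from B onward for the first points C and D that are Type 2 or Type 3 and form a square (or rectangle; the same applies below) ABCD. If points C and D are found, register the square ABCD as a frame. Otherwise, jump to step 4.
(4) If A is Type 4 and B is Type 1 or Type 2, search for points C and D in a similar way, where C is Type 4, D is Type 2 or Type 3, and ABCD is a square. If points C and D are found, register the square ABCD as a frame. Otherwise, jump to step 4.
(5) If A is Type 1 or Type 2 and B is Type 4, search for points C and D in a similar way, where C is Type 2 or Type 3, D is Type 4, and ABCD is a square. If points C and D are found, register the square ABCD as a frame. Otherwise, jump to step 4.
(6) If none of the above conditions is satisfied, jump to step 4.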
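[0122]
(Step 3) Scan the rectangular region of the frame obtained in step 2. If the number of black pixels in this region is equal to or greater than a predetermined threshold, label the region as a name frame or mixed frame. Otherwise, label it as an entry frame.
(Step 4) Increment the pointer by 1 so that it indicates the next point. If the pointer indicates the last point, end the processing. Otherwise, jump to step 2.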
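[0123]
[Effects of the Invention]
According to the present invention, it is possible to provide a form, a form processing method, a form processing program, a recording medium on which the program is recorded, and a form processing apparatus that can be used for reading forms at service counters, offices, and the like, and that allow the entry frames of a form and the entered characters to be separated easily. According to the present invention, the recognition rate of entered characters can also be increased. Furthermore, according to the present invention, the processing method of the form itself is described on the form side rather than on the apparatus side, so that a general-purpose machine can be used as the reading apparatus and many kinds of forms can be processed by a single machine.
According to the present invention, it is also possible to provide a form and a form processing method having dot shapes that increase the amount of information that can be superimposed. It is possible to provide a coding scheme and a decoding scheme for reading superimposed information stably despite overlap with handwriting and soiling. It is possible to provide an auxiliary entry scheme for alphanumeric characters and symbols, which are particularly prone to misreading. It is possible to provide a form that can be created easily with an editor and printed with a printer. It is also possible to shorten processing time by omitting the scanning of the image between the rows of entry frames.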
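[Brief Description of the Drawings]
[FIG. 1] Configuration diagram of an active form.
[FIG. 2] Hardware configuration diagram of the active form processing system.
[FIG. 3] Configuration diagram of a code using two kinds of figures, large and small (large/small two-type code).
[FIG. 4] Configuration diagram of a code using 8-direction line segments (8-direction line segment code).
[FIG. 5] Relationship between the drawing area and the placement area.
[FIG. 6] Configuration diagram of a code using 2^12 kinds of figures (2^12-type code).
[FIG. 7] Configuration diagram of a code using 2^8 kinds of figures (2^8-type code).
[FIG. 8] Diagram showing an information record to be superimposed.
[FIG. 9] Table showing the correspondence between attributes and code values.
[FIG. 10] Table showing the correspondence between character types and instructions and code values.
[FIG. 11] Diagram showing the method of coding an information record into a dot sequence.
[FIG. 12] Diagram showing an information record to be superimposed.
[FIG. 13] Example of superimposition of the 8-direction line segment code.
[FIG. 14] Explanatory diagram of the parity code setting method.
[FIG. 15] Figure representing a delimiter.
[FIG. 16] Explanatory diagram of the recognition method using four-direction histograms.
[FIG. 17] Explanatory diagram of the extraction of feature values from standard pattern samples and input codes.
[FIG. 18] Explanatory diagram of code recognition by matching with standard patterns.
[FIG. 19] Explanatory diagram of the recognition method using partial scanning.
[FIG. 20] Entry frames with dot textures made with the large/small two-type code and the 8-direction line segment code.
[FIG. 21] Results of trying the recognition methods using four-direction histograms and matching.
[FIG. 22] Entry frame with a dot texture made with the 2^12-type code.
[FIG. 23] Code recognition results.
[FIG. 24] Entry frame with a dot texture made with the 2^8-type code.
[FIG. 25] Recognition experiment results.
[FIG. 26] Comparison of recognition results for various code shapes.
[FIG. 27] Flow diagram of the active form processing system.
[FIG. 28] Flowchart of the separation of entry frames and handwriting and the detection of entry frames and entry frame structure.
[FIG. 29] Explanatory diagram of the median filter.
[FIG. 30] Separated entry frame and handwritten characters when the entry frame and the handwriting overlap.
[FIG. 31] Explanatory diagram of the detection of entry frames and entry frame structure by histograms.
[FIG. 32] Explanatory diagram (1) of the entry frame structure detection method.
[FIG. 33] Explanatory diagram (2) of the entry frame structure detection method.
[FIG. 34] Explanatory diagram (3) of the entry frame structure detection method.
[FIG. 35] Processing results of the method that does not take the blank areas of the form into account and the method that does.
[FIG. 36] Explanatory diagram of the information loss prevention processing at entry frame detection.
[FIG. 37] Flow diagram of decoding of the large/small two-type code.
[FIG. 38] Flow diagram of decoding of the 8-direction line segment code.
[FIG. 39] Explanatory diagram of the calculation of the drawing area size.
[FIG. 40] Example of code cutting-out based on the code size.
[FIG. 41] Explanatory diagram of the cutting-out direction used to cut out the figure of each code from the information bar of an entry frame.
[FIG. 42] Explanatory diagram of the detection and handling of information records overlapped by handwriting.
[FIG. 43] Diagram showing examples of errors in code figures.
[FIG. 44] Explanatory diagram (1) of the handling of code figure damage caused by information bar position detection error.
[FIG. 45] Explanatory diagram (2) of the handling of code figure damage caused by information bar position detection error.
[FIG. 46] Explanatory diagram (3) of the handling of code figure damage caused by information bar position detection error.
[FIG. 47] Explanatory diagram of recognition of the 8-direction line segment code.
[FIG. 48] Code recognition algorithm.
[FIG. 49] Example of a program for recognizing codes.
[FIG. 50] Entry frame for increasing the recognition rate of e-mail addresses.
[FIG. 51] Example of the processing flow of the active form processing system.
[FIG. 52] Format of the output file.
[FIG. 53] Form used in the evaluation experiment.
[FIG. 54] Explanatory diagram of how the character-level recognition rate is calculated.
[FIG. 55] Experimental results for the large/small two-type code forms.
[FIG. 56] Experimental results for the 8-direction line segment code.
[FIG. 57] Diagram showing the classification of contact points, intersection points, and terminal points.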
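[Explanation of Symbols]
10 Horizontal bar
20 Entry frame
30 Information bar
40 Cell
1 Processing unit
2 Form input unit
3 Output unit
4 Display unit
5 Storage unit
[0001]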
BACKGROUND OF THE INVENTION
The present invention relates to a form, a form processing method, a form processing program, a recording medium on which the form processing program is recorded, and a form processing apparatus. More particularly, it relates to a form on which its own processing method is described and which can be used for form reading at service counters, offices, and the like, and to a form processing method, a form processing program, a recording medium on which the form processing program is recorded, and a form processing apparatus for processing that form.
[0002]
[Prior art]
Conventionally, optical form readers and devices that dynamically capture pen-movement information from writing on paper, such as tablets, the e-pen, and the Anoto pen, have been known as bridges between paper and computer processing. The latter, however, require a special device at the time of writing.
Further, to extract handwriting from the entry frames of a form, a dropout color is sometimes used for the entry frames. However, using color increases cost, and the monochrome copiers and monochrome fax machines that are now in widespread use cannot be used. For reasons such as printing cost, reader cost, and the use of monochrome fax, forms using dropout colors are decreasing and monochrome forms are becoming mainstream. In addition, with the higher precision of laser beam printers and the spread of page description languages such as PostScript, forms made of fine dot textures can be printed on inexpensive printers. Furthermore, once a form creation editor is available, anyone can design forms easily. Because handwritten forms are easy to fill in and remain as evidence, they have resisted digitization, and the need to digitize them easily is growing.
[0003]
A processing method for a form having an entry frame composed of dots has been proposed (see, for example, Patent Document 1 and Non-Patent Document 3). In addition, a method for recognizing and correcting characters entered in a form has been proposed (see, for example, Patent Documents 2 and 3). An active form system in which processing is described on the form side has been proposed (see, for example, Patent Document 1, Non-Patent Documents 1 and 2).
In addition, a method for recognizing a form structure from detection of a line or an intersection has been proposed (for example, see Non-Patent Documents 4 and 5).
[0004]
[Patent Document 1]
International Publication No. 01/93188 Pamphlet
[Patent Document 2]
JP 2001-052110 A
[Patent Document 3]
JP 2001-052111 A
[Non-Patent Document 1]
Taro Shimamura, Ran Toki, Atsushi Masuda, Motoki Onuma, Takeshi Hamada, Atsushi Kuronuma, Masaki Nakagawa, “Design and Prototype of Active Form System”, IEICE Technical Report, December 2002, PRMU 2002-132, Vol. 102, no. 531, pp. 19-24.
[Non-Patent Document 2]
Zhu Xinlan, Shimamura Taro, Nakagawa Masaki, "Prototype of Active Form Processing Technology", IEICE, IEICE Technical Report, December 2002, PRMU2002, Vol. 102, no. 531, pp. 25-30.
[Non-Patent Document 3]
Yoji Maeda, Masaki Nakagawa, “Document Editing Method Using Paper Interface”, Transactions of Information Processing Society of Japan, 2000, Vol. 41, no. 5, pp. 1308-1316.
[Non-Patent Document 4]
T. Watanabe, Q. Luo, and N. Sugie, “Layout recognition of multi-kinds of table-form documents”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 1995, Vol. 17, no. 4, pp. 432-445.
[Non-Patent Document 5]
L. Y. Tseng and R. C. Chen, “Recognition and data extraction of form documents based on three types of line segments”, Pattern Recognition, 1998, Vol. 31, no. 10, pp. 1525-1540.
[0005]
[Problems to be solved by the invention]
However, with monochrome forms it can be difficult to separate the entry frames from the handwriting. For example, when handwritten characters overlap an entry frame, or when the form is soiled or the scan is noisy, the two may not be separated accurately. In addition, even when the entered characters can be recognized, misreading can still occur. In particular, context processing cannot be used for recognizing e-mail addresses, and a misread address is expected to affect subsequent processing greatly.
In view of the above, an object of the present invention is to provide a form, a form processing method, a form processing program, a recording medium on which the program is recorded, and a form processing apparatus that can be used for reading forms at service counters, offices, and the like, and that allow the entry frames of a form and the entered characters to be separated easily. Another object of the present invention is to increase the recognition rate of entered characters. A further object is to describe the processing method of the form itself on the form side rather than on the apparatus side, so that a general-purpose machine can be used as the reading apparatus and many kinds of forms can be processed by a single machine.
[0006]
Another object of the present invention is to provide a form and a form processing method having dot shapes that increase the amount of information that can be superimposed. Another object is to provide a coding scheme and a decoding scheme for reading superimposed information stably despite overlap with handwriting and soiling. Another object is to provide an auxiliary entry scheme for alphanumeric characters and symbols, which are particularly prone to misreading. Another object is to provide a form that can be created easily with an editor and printed with a printer. Another object is to shorten processing time by omitting the scanning of the image between the rows of entry frames.
[0007]
[Means for Solving the Problems]
The present invention facilitates separation from handwriting even in a monochrome environment by using a dot texture composed of fine dots in an entry frame, and superimposes handwritten entry attributes and instruction information after reading. In the active form, in which the form side leads the processing completely, the following is provided.
As a code shape for superimposing information in an entry frame, a dot shape is proposed so that a large amount of information can be superimposed stably by a dot texture.
[0008]
In addition, as methods for recognizing a code figure, a recognition method using histograms in four directions and a recognition method using partial scanning are proposed. In the four-direction histogram method, histograms are taken of the code image in four directions, and the code is recognized by judging its shape features from them. In the partial scanning method, the code image is scanned only at the positions where a line segment can appear in the code figure, one position after another; the corresponding information bit is set to 1 if a line segment is present and to 0 if it is not.
[0009]
In addition, as a stable method for detecting entry frames formed with a dot texture, the entry frames are detected stably by dilating each dot and then taking histograms in the vertical and horizontal directions.
Since a form has blank space between its entry frames, processing time is shortened by omitting the processing of these blanks. Even before the entry frames are separated from the handwriting, the rows of entry frames can be detected with the handwriting still present (handwriting interferes with detecting positions within a row, but not with detecting the rows themselves). If row detection is performed before separation and only the image of each row is labeled, the labeling of the space between rows can be omitted.
[0010]
We also propose a method for preventing the influence of noise when detecting the boundaries of an entry frame. When the position of an entry frame is detected from a histogram in the vertical or horizontal direction, histogram values below a certain threshold are regarded as 0, so that noisy areas are also treated as blank. In this way the position of the entry frame can be detected without being affected by noise. However, because values below the threshold are regarded as 0, information below the threshold at the boundary of the entry frame is also ignored, which adversely affects the reading of the information superimposed on the entry frame. To prevent this loss of information, the position at which the histogram falls below the threshold is first detected as the start position f, and the histogram is then traced from this position toward the outside of the entry frame. The position at which the histogram value changes from a value greater than 0 to 0 is taken as the start or end position of the entry frame. If the histogram value has still not become 0 when the length traced from the start position f exceeds a certain threshold, the position shifted outward from f by that threshold is taken as the start or end position of the entry frame.
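The outward tracing can be sketched for the left (start) boundary of an entry frame. The sketch assumes a one-dimensional histogram indexed from the outside toward the inside of the frame, takes f to be the first index whose value reaches the threshold, and uses the hypothetical names `frame_start`, `threshold`, and `max_extend`.

```python
def frame_start(hist, threshold, max_extend):
    """Return the start index of the entry frame, or None if no value
    reaches the threshold.

    f is found with the noise threshold applied; the boundary is then
    moved outward while the raw histogram stays above 0, by at most
    max_extend positions.
    """
    f = next((i for i, v in enumerate(hist) if v >= threshold), None)
    if f is None:
        return None
    pos = f
    while pos > 0 and hist[pos - 1] > 0:
        if f - (pos - 1) > max_extend:
            # Still nonzero after max_extend steps: treat the rest as noise.
            return f - max_extend
        pos -= 1
    return pos
```

The end boundary can be handled the same way by applying the function to the reversed histogram and mapping the index back.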
[0011]
Also, as a method for cutting out code figures even when a dot-texture form has been scanned at a slight skew, we note that the number of dot rows in the dot texture is smaller than the number of dot columns, so the overlap of the histogram in the vertical direction is small. A vertical histogram is therefore taken of the dot texture image, and each column is cut out from the left to the right of the information bar. Next, a horizontal histogram is taken of each column image, and the figure of each code is cut out.
[0012]
As a method for detecting and skipping information records that have been partly erased by overlap with handwriting, the code figures of each column are cut out from left to right in the information bar image. When the interval between the previous column and the next column is larger than the expected width, it is judged that a code figure overlapped the handwriting and was lost in the handwriting separation process, and the subsequent code figures are ignored until a delimiter is found. Otherwise, the codes of the column are cut out and recognized.
Errors caused by noise are handled as well. If a cut-out figure is smaller than the expected code figure, it is ignored as noise; if it is too large, it is split at the expected size and then recognized. This prevents errors in which minute noise attaches to code figures and connects them, or in which minute noise is processed as a code figure.
[0013]
We also propose a method for dealing with errors in information bar position detection. The boundary line between the information bar of an entry frame and the other part may be detected with an error relative to the true boundary of the information bar. If code figures are cut out and decoded according to the detected boundary line, the part of a code figure that protrudes beyond the information bar is excluded from recognition, which may cause a recognition error. To avoid this, the following processes (1) to (4) are executed. (1) According to the detected information bar boundary line b, the position of each code column is detected from left to right using a vertical histogram of the information bar image. (2) The horizontal histogram of the code image of each column is scanned, and each code is cut out. The upper information bar of an entry frame is cut out from top to bottom, and the lower information bar from bottom to top. From this stage onward, the information on the boundary line b is not used. (3) After each code is cut out, the number of codes cut out in this column and the scanned length L are examined. Let N be the number of code rows in the information bar. If the number of codes in this column has already reached N, or if the scanned length L has reached the length corresponding to N, code cutting-out ends. Otherwise, scanning of the histogram continues and the next code is cut out. The purpose of detecting the end by the scanned length is to avoid searching for codes beyond the region of the information bar when a code figure has been lost, for example because it overlapped handwriting. (4) When code cutting-out has finished for a column, the number of codes is checked. If it has not reached N, this is regarded as a code count error, and the subsequent code figures are ignored until a delimiter is found.
[0014]
We also propose a method for increasing the recognition rate of e-mail addresses. The characters used in e-mail addresses are alphanumeric characters and symbols, which carry little shape information and have low recognition rates, and context processing cannot be used because the character strings have no linguistic meaning. The main cause of e-mail address misrecognition is confusion between similar characters. If this confusion can be avoided, the recognition rate can be increased. Therefore, to restrict the character types of the e-mail address, another entry frame is provided near the e-mail address entry frame in the active form, and the character type of each corresponding character of the e-mail address is entered in it with a simple symbol. Considering ease of recognition and ease of entry, the symbol is a horizontal line written at the top, at the bottom, or at both the top and the bottom of the cell, and it may overlap the frame.
The form is a form for character entry in which the entry frames are formed with a dot texture, that is, a two-dimensional arrangement of minute dots (black regions) or line segments. It consists of heading characters drawn in dot texture with lines wide enough to include a plurality of dots, and entry frames defined by dot-texture lines, formed of two-dimensionally arranged dots or line segments, that are wide enough to include a plurality of dots.
[0015]
The processing unit operates on a form in which characters have been entered. Because each dot is set smaller than the black regions of handwriting, the entry frame image formed by the dot texture can be removed, and the character image of the entered characters detected, in either of two ways: by an erosion process, under which the dots of the dot texture disappear before the character image does, or by removing connected components smaller than a predetermined threshold.
In this method, because the entry frames are composed of a dot texture of minute dots, handwriting can be separated easily even in a monochrome environment and even when it overlaps an entry frame. In addition, by embedding the handwriting attribute and character type in the entry frame, the recognition rate of the cut-out handwritten characters can be increased. Because the content of the handwritten characters is associated with attribute and heading information, per-form configuration becomes unnecessary. Furthermore, by embedding the processing to be applied to the handwriting after reading, a system can be realized in which simply reading a form causes the processing appropriate to that form to be directed from the form side. A form becomes active when the program that used to be required on the processing system side is embedded in the form itself. This is why it is called an "active form".
[0016]
According to the first solution of the present invention, there are provided a form processing method including:
a form input step of inputting form image data that includes entry frame data, composed of codes in which information is expressed by the inclination of a line segment or by a combination of a plurality of line segments, and handwritten data;
an entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each item of entry frame data and the information bar in the entry frame data on which predetermined information is superimposed by codes;
a step of cutting out each code of the detected information bar;
a step of obtaining histograms of the cut-out code in four directions, namely the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction; and
a code recognition step of recognizing the code by judging the shape features of the code based on the obtained histograms in the four directions;
a form processing program for causing a computer to execute these steps; and a computer-readable recording medium on which the program is recorded.
[0017]
According to the second solution of the present invention,
A form input step for inputting form image data including code data in which information is expressed by a dot width, a slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data from the handwritten data based on the form image data input in the form input step, and detecting each entry frame and the information bar in the entry frame data on which predetermined information is superimposed by a code;
A superimposition information recognition step of recognizing each code of the information bar detected in the entry frame detection step and extracting superimposition information;
A character recognition step of performing character recognition on the handwritten data separated in the entry frame detection step, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
An instruction document creation step of creating instruction document data for causing another application system to perform predetermined processing, based on the character information recognized in the character recognition step and the superimposition information extracted in the superimposition information recognition step;
A step of outputting the instruction document data created in the instruction document creation step and/or storing the form image data input in the form input step in association with the instruction document data;
a form processing method including the above steps, a form processing program for causing a computer to execute these steps, and a computer-readable recording medium on which the program is recorded are provided.
[0018]
According to the third solution of the present invention,
A form input unit for inputting form image data including entry frame data composed of a code in which information is expressed by the slope of a line segment or a combination of a plurality of line segments, and written handwritten data; and
A processing unit that detects an entry frame in the input form image data and recognizes the codes in the entry frame;
are provided, and the processing unit includes:
Means for inputting form image data from the form input unit;
Entry frame detection means for separating the entry frame data from the handwritten data based on the form image data input from the form input unit, and detecting each entry frame and the information bar in the entry frame data on which predetermined information is superimposed by a code;
Means for cutting out each code of the detected information bar;
Means for obtaining histograms of the cut-out code in four directions: vertical, horizontal, right diagonal, and left diagonal; and
Code recognition means for recognizing the code by determining its shape features from the obtained four-direction histograms.
A form processing apparatus having the above is provided.
[0019]
According to the fourth solution of the present invention,
A form input unit for inputting form image data including code data in which information is expressed by a dot width, the slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
A storage unit for storing form image data and/or instruction document data created based on the form image data; and
A processing unit that recognizes the handwritten information entered in the input form image data and the information superimposed on the entry frame, and executes predetermined processing;
are provided, and the processing unit includes:
Means for inputting form image data from the form input unit;
Entry frame detection means for separating the entry frame data from the handwritten data based on the input form image data, and detecting each entry frame and the information bar in the entry frame data on which predetermined information is superimposed by a code;
Superimposition information recognition means for recognizing each code of the information bar detected by the entry frame detection means and extracting superimposition information;
Character recognition means for performing character recognition on the handwritten data separated by the entry frame detection means, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
Instruction document creation means for creating instruction document data for causing another application system to perform predetermined processing, based on the character information recognized by the character recognition means and the superimposition information extracted by the superimposition information recognition means; and
Means for outputting the instruction document data created by the instruction document creation means and/or storing, in the storage unit, the form image data input by the inputting means in association with the instruction document data.
A form processing apparatus having the above is provided.
[0020]
In any of the above solutions, the entry frame detection step and the entry frame detection means may detect each entry frame and the information bar in the entry frame data, on which predetermined information is superimposed by a code, based on horizontal and vertical histograms of the form image data, or on line segment detection and/or intersection detection.
[0021]
Furthermore, according to the fifth solution of the present invention,
An entry frame having an information bar in which information records, on which predetermined information is superimposed by a code expressing information by a dot width, the slope of a line segment, or a combination of a plurality of line segments, are arranged over a plurality of lines; and
A horizontal bar composed of the code and containing information on the form or on the information bar.
A form having the above is provided.
[0022]
DETAILED DESCRIPTION OF THE INVENTION
1. Overview of active forms and their processing systems
1.1 Active forms
FIG. 1 is a configuration diagram of an active form.
By using a dot texture composed of fine dots for the entry frame, the active form allows the entry frame and the handwritten characters to be separated even in a monochrome environment. Furthermore, by making the dots non-uniform, it allows the attributes of the characters to be entered and the instructions to be carried out after reading to be embedded in the entry frame.
The form includes a horizontal bar 10 and an entry frame 20 having an information bar 30. The horizontal line printed under the form title is called the horizontal bar 10. The entry frame 20 and the horizontal bar 10 are printed with a dot texture. Information (superimposition information) specifying the attribute of the handwriting (for example, name, address, email address, free writing), the character type (for example, numerals, kanji, alphabetic characters), the heading, and the processing method (for example, recognizing the handwritten characters, sending the entered contents by e-mail, saving them in a database) is superimposed on the dot texture along the upper and lower sides of the entry frame. The portion on which this information is superimposed is referred to as the information bar 30. Information on the form as a whole is superimposed on the horizontal bar 10. A division of the entry frame into which characters are entered one at a time is called a cell 40.
[0023]
In an entry frame with a dot texture, dots are arranged in the horizontal (row) direction and the vertical (column) direction to form the texture. If every entry frame and the horizontal bar 10 have the same number of dot rows, then once the number of dot rows in the horizontal bar 10 is obtained, the number of dot rows in every entry frame is known. In the present embodiment, the superimposition information can also be carried by code figures other than dots, in which information is represented by the slope of a line segment or a combination of line segments. Such a line segment may itself be composed of dots. In the following description, a code figure may be referred to simply as a dot.
[0024]
1.2 Active form processing system
FIG. 2 is a hardware configuration diagram of the active form processing system.
The active form processing system includes, for example, a processing unit 1, a form input unit 2, an output unit 3, and a storage unit 5. The active form processing system may also include an appropriate input unit and a display unit 4. The active form processing system, for example, extracts the entered contents and the superimposition information from the scanned image data of the active form and outputs them as an instruction document. Details of the processing of the processing unit will be described later. The storage unit 5 stores form image data and the instruction document data created based on the form image data. In addition, for example, a table associating the values indicated by the codes with the corresponding information is stored in the storage unit 5 in advance. The storage unit 5 can also hold, for example, a table associating the feature values of standard patterns with the information indicated by those standard patterns.
[0025]
2. Active form details
2.1 Code shape for information superimposition
In the active form, information indicating the handwritten attribute, character type, heading, and processing method to be entered in each entry frame is superimposed on the dot texture. One method for embedding information in the dot texture is to change the shape of the dots. Therefore, some code shapes representing information will be described.
[0026]
FIG. 3 is a configuration diagram of a code using two types of figures, large and small (two-size code). Dots of two different sizes represent the binary bits 1 and 0, and information is superimposed using them. The two codes have the same height h. Let L be the width of the large dot. To allow for the gap between codes, the area used to place codes evenly on the form is defined as the code arrangement area. All arrangement areas are the same size, and each code is placed at the center of its area. When the arrangement areas are tiled vertically and horizontally without gaps, like a grid, the codes form a dot texture with gaps between them. The other types of codes described later are likewise placed at the center of their arrangement areas to generate the dot texture of the entry frame.
[0027]
FIG. 4 is a configuration diagram of a code (8-direction line segment code) using 8-direction line segments. Eight types of codes shown in FIG. 4 from 0 to 7 can be represented by eight line segments. In FIG. 4, numbers such as “000” and “001” shown below each code are binary numbers corresponding to these codes. Each code can be associated with an appropriate number other than those shown in the figure. In addition to the eight directions, multi-directions using line segments inclined at a predetermined angle may be used. A square area of the same size for drawing a code is called a drawing area.
[0028]
FIG. 5 shows the relationship between the drawing area and the arrangement area. The drawing area is smaller than the arrangement area and is located at its center. As above, the arrangement areas are tiled vertically and horizontally without gaps, like a grid, so that the codes are arranged with gaps between them.
FIG. 6 is a configuration diagram of a code using 2¹² types of figures (2¹²-type code). As shown in FIG. 6, the 2¹²-type code is composed of a combination of 12 line segments. Within a drawing area of size a × a, 2¹² types, that is, 4096 types of codes are created according to the presence or absence of each of these 12 line segments.
[0029]
FIG. 7 is a configuration diagram of a code using 2⁸ types of figures (2⁸-type code). As shown in FIG. 7, the 2⁸-type code is composed of a combination of eight line segments. Within a drawing area of size a × a, 2⁸ types, that is, 256 types of codes are created according to the presence or absence of each of these 8 line segments. In addition to the 2⁸-type code and the 2¹²-type code, 2ⁿ-type codes can be created from an appropriate combination of line segments.
[0030]
2.2 Coding information
In order to guarantee the minute dot shape of the dot texture, a form is created in the PostScript language, the form image data is sent to the PostScript printer, and the form is printed. The coding of each of the two types of large and small codes and the 8-direction line segment code will be described. Other codes can basically use the latter coding as well.
[0031]
2.2.1 Coding in two types of large and small codes
FIG. 8 is a diagram showing information records to be superimposed. The information shown in FIG. 8 is superimposed for each dot texture entry frame. This is called an information record. For example, an information record is represented by a fixed length of 8 + 8 + 8 + 16 × 4 + 8 = 96 bits. The information record can be superimposed on the upper side of the entry frame, for example.
FIG. 9 is a table showing correspondence between attributes and values indicated by codes. The attribute is a handwritten attribute entered in the entry frame, and the heading characters in the entry frame are classified into categories. There are various types of attributes depending on the type and usage of the form. FIG. 9 is an example of attributes obtained by extracting heading characters from 36 types of forms that are actually used. As shown in FIG. 9, for example, it can be classified into 14 types of attributes and coded. However, appropriate attributes can be used, and are not limited to these.
[0032]
FIG. 10 is a table showing the correspondence between character types and commands and the values indicated by codes. The character type is the type of the handwritten characters entered in the entry frame. The command is the processing to be applied, after reading, to the handwriting entered in the frame. Character types and commands can each be represented by 8-bit flags. For example, as shown in FIG. 10, the bit whose number corresponds to each character type or command is set (for example, to 1). Other appropriate character types and commands can also be used. For example, the command "recognize" may be subdivided into "recognize with frame", "recognize without frame", and so on. For the heading characters, for example, the Shift-JIS codes of the first four characters of the entry frame heading (64 bits in total) are embedded. The terminal symbol is embedded as eight zero bits. In the storage unit 5 of the active form processing system, for example, tables such as those shown in FIGS. 9 and 10 are stored in advance.
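The 96-bit fixed-length record described above can be sketched as follows. This is only an illustration: the field order is assumed to follow the 8 + 8 + 8 + 16 × 4 + 8 breakdown, and the function name `pack_record`, the attribute value 3, and the flag bit assignments are hypothetical, since the actual tables of FIGS. 9 and 10 are not reproduced here.

```python
def pack_record(attribute, char_type_flags, command_flags, heading):
    """Pack one information record into a 96-character bit string."""
    sjis = heading.encode("shift_jis")
    if len(sjis) != 8:
        raise ValueError("heading must be four double-byte characters")
    fields = [
        format(attribute, "08b"),                   # attribute value (8 bits)
        format(char_type_flags, "08b"),             # character-type flags (8 bits)
        format(command_flags, "08b"),               # command flags (8 bits)
        "".join(format(b, "08b") for b in sjis),    # heading: 16 bits x 4 characters
        "0" * 8,                                    # terminal symbol: eight zero bits
    ]
    return "".join(fields)

# Hypothetical values; the heading is four double-byte characters.
record = pack_record(attribute=3,
                     char_type_flags=0b00000010,
                     command_flags=0b00000001,
                     heading="氏名住所")
```

Each character of the resulting string corresponds to one dot of the two-size code, so the record occupies 96 consecutive dots.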
[0033]
FIG. 11 is a diagram illustrating a method of coding information records into dot rows.
The two-size codes are arranged horizontally (in rows), and the above information is superimposed on, for example, the upper side of the entry frame. Because superimposed information may be lost where the entry frame and the handwritten characters overlap, the same superimposition information is repeated as many times as fits within one row of dots, and the remainder is filled with, for example, dots indicating 0. In the second and subsequent rows, the same dot sequence is repeated after being shifted by, for example, 2 dots, with dots indicating 0 placed at the head. In other words, the 96-bit information record is concatenated as many times as possible, and dots indicating 0 are inserted in the remaining part. Lost information can be restored by taking a majority vote over these copies of the superimposition information. The shift per row may be an appropriate number of dots other than two. The coding method shown in FIG. 11 can also be applied to other codes, such as the 8-direction line segment code, in addition to the two-size code.
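The majority-vote restoration can be sketched as follows. This is a minimal illustration under stated assumptions: the repeated copies are taken to be already aligned (that is, the per-row shift has been undone), an unreadable bit is represented by `None`, and ties are resolved to 0; none of these details are fixed by the description.

```python
def majority_vote(copies):
    """Bitwise majority over aligned copies of a record.
    Each copy is a list of 1, 0, or None (unreadable bit)."""
    result = []
    for bits in zip(*copies):
        ones = sum(1 for b in bits if b == 1)
        zeros = sum(1 for b in bits if b == 0)
        result.append(1 if ones > zeros else 0)  # ties fall back to 0 (assumption)
    return result

original = [1, 0, 1, 1, 0, 0, 1, 0]
copies = [
    [1, 0, None, 1, 0, 0, 1, 0],   # one bit hidden by handwriting
    [1, 0, 1, 1, 0, 1, 1, 0],      # one bit misread
    [1, 0, 1, None, 0, 0, 1, 0],   # another bit hidden
]
restored = majority_vote(copies)
```

With three copies, any single damaged or misread bit per position is outvoted by the two intact copies.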
[0034]
2.2.2 Coding in 8 direction line segments
FIG. 12 is a diagram showing the information record to be superimposed. The 8-direction line segment code extends the information superimposition of the two-size code: more information can be superimposed, and more effective superimposition techniques such as error detection can be used. The information record shown in FIG. 12 is superimposed for each entry frame.
The attribute can be the same as in the superimposition using the two-size code. Since one 8-direction line segment code carries 3 bits of information, for convenience of representation a 0 bit is added, for example, to the end (or the beginning) of the attribute used for the two-size code shown in FIG. 9. The character type and the command can each be represented by a 9-bit flag, in the same manner as in FIG. 10. As the character type with bit number 9, information indicating the characters used in e-mail addresses, such as alphanumeric characters, "@", and ".", can be used in order to raise the recognition rate for e-mail addresses. The heading is represented by the Shift-JIS codes of 1 to 10 double-byte characters. The improvement of the e-mail address recognition rate will be described later. The information record of FIG. 12 (variable length, 45 to 207 bits) is divided into 3-bit groups, and each group is converted into the code representing it. The information record made up of these codes is superimposed on the upper and lower information bars of the entry frame.
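The division of a record into 3-bit groups can be sketched as follows; `bits_to_codes` is a hypothetical helper name, and the bit string is assumed to be written most significant bit first. Both ends of the stated length range, 45 and 207 bits, are multiples of 3, giving 15 and 69 codes respectively.

```python
def bits_to_codes(bits):
    """Split a bit string into 3-bit groups and return the code numbers 0-7."""
    if len(bits) % 3 != 0:
        raise ValueError("record length must be a multiple of 3 bits")
    return [int(bits[i:i + 3], 2) for i in range(0, len(bits), 3)]

codes = bits_to_codes("000001010111")
```

Each resulting number selects one of the eight line-segment directions of FIG. 4.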
[0035]
FIG. 13 shows an example of superimposition using the 8-direction line segment code. As shown in FIG. 13, taking a four-row dot texture as an example, four codes are arranged from top to bottom in each column, and the columns are coded from left to right. In the example shown in FIG. 13, line segments in directions 1, 2, 3, 4, 5, and 6 are arranged. A delimiter is added at the beginning and end of the information record, a parity code is placed just before the closing delimiter, and then the second and third information records follow. This is repeated as many times as space allows. The second and subsequent information records are identical to the first. On reading, each record is first checked for parity errors, and then a majority vote over the multiple information records is taken to restore lost information.
[0036]
FIG. 14 is an explanatory diagram of the parity code setting method. A parity code is added so that the exclusive OR of the corresponding bits of all the codes between the delimiters becomes 0. For example, as shown in FIG. 14, for data indicating "000", "001", and "010", a code indicating "011" is added as the parity code so that the exclusive OR of the corresponding bits of all the data becomes zero.
FIG. 15 shows the figure representing the delimiter. The delimiter indicates the boundary of each information record. For example, the delimiter is assigned the number 8, continuing the numbering of the 8-direction line segment codes shown in FIG. 4. Besides the shape shown in FIG. 15, the delimiter may have any appropriate shape that can be distinguished from the other codes.
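The parity rule can be checked with a short sketch. Treating each code as a 3-bit integer, the bitwise exclusive OR of the data codes gives the parity code, and the exclusive OR of the data together with the parity code is zero. The function name `parity_code` is hypothetical.

```python
from functools import reduce

def parity_code(codes):
    """Bitwise XOR of 3-bit code values (integers 0-7)."""
    return reduce(lambda a, b: a ^ b, codes, 0)

data = [0b000, 0b001, 0b010]
parity = parity_code(data)             # parity code to append
check = parity_code(data + [parity])   # XOR over data plus parity
```

A reader can recompute `check` for each received record; a non-zero value indicates that at least one code in that record was misread.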
[0037]
2.3 Code figure recognition method
To recognize the code, any of the following methods can be used.
2.3.1 Recognition method using histograms in four directions
FIG. 16 is an explanatory diagram of a recognition method using a histogram in four directions. By taking histograms in four directions (for example, vertical, horizontal, right diagonal, left diagonal) with respect to the code image (data), the code shape feature can be determined to recognize the code. As shown in FIG. 16, the histograms in the four directions are clearly different depending on the code.
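A minimal sketch of the four projection histograms follows. It assumes the code image is a binary image held as lists of 0/1 values; the two diagonal histograms are indexed by x + y and by x - y, and which of them corresponds to the "right" or "left" diagonal of FIG. 16 is left open here.

```python
def four_direction_histograms(img):
    """Projection histograms of a binary image (lists of 0/1)."""
    h, w = len(img), len(img[0])
    horizontal = [sum(row) for row in img]                            # one bin per row
    vertical = [sum(img[y][x] for y in range(h)) for x in range(w)]   # one bin per column
    diag_sum = [0] * (h + w - 1)    # bins indexed by x + y
    diag_diff = [0] * (h + w - 1)   # bins indexed by x - y + (h - 1)
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                diag_sum[x + y] += 1
                diag_diff[x - y + h - 1] += 1
    return horizontal, vertical, diag_sum, diag_diff

# A line segment running from the top-left to the bottom-right corner.
segment = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
horizontal, vertical, diag_sum, diag_diff = four_direction_histograms(segment)
```

For this segment the row and column histograms are flat, while one diagonal histogram collapses into a single peak; the direction with the sharpest peak indicates the orientation of the line segment.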
[0038]
2.3.2 Recognition method by matching
A large number of image samples are created for all types of codes, and feature values of each sample are extracted. An average feature value can be obtained for each code type, and a standard pattern for each code can be created. The feature value of the input code is extracted by the same method as the feature value extraction of the standard pattern sample, is matched with the standard pattern of each code, the standard pattern having the closest feature is obtained, and this standard pattern is used as the recognition result.
FIG. 17 is an explanatory diagram of feature value extraction for standard pattern samples and input codes. As the feature value extraction method, a method generally used in character and symbol recognition can be used. First, the line density is calculated, nonlinear normalization is performed so that the line density becomes uniform, and smoothing that preserves line connectivity is applied to remove the jagged edges of the image ((1) in FIG. 17). Next, components in four directions (horizontal, vertical, and the two diagonals) are extracted from the normalized pattern ((2) in FIG. 17). Each directional component image is divided into 8 × 8 sections, and the amount of the directional component contained in each section is used as a feature. Arranging these 64 values for each of the four directions gives a 64 × 4 = 256-dimensional feature vector ((3) in FIG. 17).
[0039]
FIG. 18 is an explanatory diagram of code recognition by matching with standard patterns. For example, the feature values of the standard patterns, as shown on the right side of FIG. 18, and the information indicated by each standard pattern are stored in the storage unit in advance, and the feature value of the read code, shown on the left side of FIG. 18, is compared with these stored values to recognize the code.
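The matching procedure can be sketched as a nearest-standard-pattern search. The sketch below averages sample feature vectors to form each standard pattern and picks the closest one for an input vector. The squared Euclidean distance is an assumption (the description only says "closest"), and the 4-dimensional toy vectors stand in for the 256-dimensional features.

```python
def average_pattern(samples):
    """Element-wise mean of the sample feature vectors of one code type."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]

def nearest_pattern(feature, standard_patterns):
    """Return the label of the standard pattern closest to the feature vector."""
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(standard_patterns,
               key=lambda label: dist2(feature, standard_patterns[label]))

# Toy dictionary: two code types, two samples each.
standard_patterns = {
    "000": average_pattern([[4, 0, 0, 0], [3, 1, 0, 0]]),
    "010": average_pattern([[0, 0, 4, 0], [0, 1, 3, 0]]),
}
recognized = nearest_pattern([3, 0, 1, 0], standard_patterns)
```

The cost of this search grows with the number of code types, which is consistent with the longer processing times reported below for matching on the codes with many types.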
2.3.3 Recognition method by partial scanning
FIG. 19 is an explanatory diagram of the recognition method based on partial scanning. A code figure is recognized by scanning only parts of the code image. Each position in the code figure where a line segment may be present is scanned in order. If a line segment is present, the corresponding information bit is set to 1; if not, it is set to 0. The figure shown in FIG. 19 is an example of a 2⁸-type code. The white parts are positions without a line segment, and the black parts are positions with a line segment. In FIG. 19 the background is gray so that the positions without a line segment are easy to see, but on an actual form the background and the positions without a line segment can be the same color. Each line segment constituting the code corresponds to one digit of a binary number. If the drawing area or the arrangement area of the code can be cut out, the scan can target predetermined positions within that area. By scanning the positions in sequence, the binary number "10110110" is obtained in the example of FIG. 19. Converted to decimal, this is "182", which is the value of this code.
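The final step of partial scanning, assembling the per-position results into the code value, can be sketched as follows. The presence test for each position is not shown; the sketch assumes it has already produced one boolean per line-segment position, most significant position first, and the function name is hypothetical.

```python
def decode_by_partial_scan(has_segment):
    """has_segment: one boolean per scanned line-segment position,
    most significant position first. Returns (bit string, value)."""
    bits = "".join("1" if present else "0" for present in has_segment)
    return bits, int(bits, 2)

# Presence pattern corresponding to the example of FIG. 19.
bits, value = decode_by_partial_scan(
    [True, False, True, True, False, True, True, False])
```

Here `bits` is "10110110" and `value` is 182, matching the example in the text.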
[0040]
2.4 Examination of each code shape
The size of each code described above and the result of recognition experiment for each code will be described.
[0041]
2.4.1 Experimental environmental conditions
The processing device used for creating and recognizing each code was a machine with a 2.26 GHz Pentium (registered trademark) 4 CPU and 1.00 GB of memory, running Windows (registered trademark) XP as the OS.
As the printing accuracy, 1200 dpi, which can be printed even with an inexpensive laser beam printer, was adopted. As described above, the active form can be printed by creating a form code in the PostScript language and sending the form code to the PostScript printer. The higher the printing accuracy, the better. However, if a very high accuracy is assumed, the apparatus cost increases. As a result of the experiment, the minimum diameter of a round dot that can be printed with an accuracy of 1200 dpi is 0.1 mm.
[0042]
The scanning accuracy was 600 dpi. The higher the scanning accuracy for reading a form, the better. However, a higher accuracy increases the size of the scanned form image and lengthens the form processing time. For example, when scanning at 600 dpi, the size of an A4 form image is 4924 × 6883 pixels, and at this size the computer's processing time for an ordinary form using the two-size code is less than 2 seconds.
In addition, to prevent the small dots of the dot texture from becoming connected in the scanned image, an appropriate interval must be provided between the dots. Allowing for printing errors and for errors caused by scanning a tilted form or by binarization, a minimum interval of 0.2 mm between dots is desirable for a 600 dpi scan. This size can be printed at a printing accuracy of 1200 dpi, and this interval does not impair the appearance of the form.
[0043]
2.4.2 Examination of each code
FIG. 20 is a dot texture entry frame created by two types of large and small codes and an 8-direction line segment code. FIG. 20A shows a dot texture entry frame created by two types of large and small codes.
Since the smallest dot that can be printed at 1200 dpi is 0.1 mm, the height h of the code shown in FIG. 3 can be set to 0.1 mm. To distinguish large dots from small dots, the width L of the large dot shown in FIG. 3 can be set to 0.25 mm, allowing for errors from scanning a tilted form, binarization, and so on. As explained for the scanning accuracy, to obtain an interval of 0.2 mm between dots, the rectangular arrangement area shown in FIG. 3 can be made 0.45 mm wide × 0.3 mm high. FIG. 20A shows an example of a dot texture entry frame created with codes of this size. In this example, information is superimposed on the upper side of the entry frame.
Since there are only two types of codes, they can be easily identified by taking a vertical histogram and determining the width of the code. As a result of recognizing one form of the two types of codes having the above-mentioned sizes, the processing time until the correction interface is called is 1782 ms. There were no identification errors.
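The width test on the vertical histogram can be sketched as follows. The sketch assumes that each arrangement area has been cut out as a binary image, that a large dot stands for 1 and a small dot for 0 (the description does not say which is which), and that the threshold of 4 pixels lies between the roughly 2 to 3 pixels of a 0.1 mm dot and the roughly 6 pixels of a 0.25 mm dot at 600 dpi.

```python
def classify_dot(cell, width_threshold):
    """cell: binary image (lists of 0/1) of one code arrangement area.
    Returns 1 for a wide (large) dot and 0 for a narrow (small) dot."""
    column_sums = [sum(column) for column in zip(*cell)]   # vertical histogram
    width = sum(1 for s in column_sums if s > 0)           # number of inked columns
    return 1 if width >= width_threshold else 0

small = [[0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
         [0, 0, 0, 1, 1, 0, 0, 0, 0, 0]]
large = [[0, 0, 1, 1, 1, 1, 1, 1, 0, 0],
         [0, 0, 1, 1, 1, 1, 1, 1, 0, 0]]
bits = [classify_dot(small, 4), classify_dot(large, 4)]
```

Because only a single width comparison per code is needed, this test is very cheap, which fits the short processing time reported above.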
[0044]
FIG. 20B shows a dot texture entry frame created by the 8-direction line segment code. The line of the 8-direction line segment code has a thickness of 0.1 mm, which is the minimum size that can be printed at 1200 dpi. The number of pixels with a size of 0.1 mm in a 600 dpi scan is three. In order to distinguish the code indicating “100” and the code indicating “101” shown in FIG. 4, 4 pixels are required for the width (horizontal direction) of the code “101”. When converted to the size before scanning, it is 0.168 mm. In consideration of errors and the like, the width of the “101” code can be 0.2 mm. In this case, the size of the drawing area square is 0.4 mm. If the interval between dots is guaranteed to be 0.2 mm, the size of the arrangement area is 0.6 × 0.6 mm. FIG. 20B is an example of an entry frame created with a code of this size. In this example, as in FIG. 20A, the code is shifted by 2 codes in each row in the horizontal direction, and information is superimposed on the upper side of the entry frame.
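The size arithmetic in this paragraph can be reproduced with a short sketch. The helper names are hypothetical, and rounding the line thickness up to whole pixels is an assumption about how the figure of three pixels was obtained.

```python
import math

MM_PER_INCH = 25.4

def mm_to_pixels(mm, dpi):
    """Length in millimetres converted to (fractional) pixels at the given dpi."""
    return mm * dpi / MM_PER_INCH

def pixels_to_mm(pixels, dpi):
    """Length in pixels converted to millimetres at the given dpi."""
    return pixels * MM_PER_INCH / dpi

line_pixels = math.ceil(mm_to_pixels(0.1, 600))   # pixels covered by a 0.1 mm line
width_mm = pixels_to_mm(4, 600)                   # printed width of 4 scanned pixels

# Sizes in micrometres keep the additions exact.
drawing_um, gap_um = 400, 200
arrangement_um = drawing_um + gap_um              # side of the arrangement area
```

A 0.1 mm line spans about 2.4 pixels at 600 dpi, which rounds up to 3; four pixels correspond to about 0.169 mm, which agrees with the 0.168 mm quoted above to within rounding; and the 0.4 mm drawing area plus the 0.2 mm interval gives the 0.6 mm arrangement area.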
[0045]
By taking histograms in four directions, the 8-direction line segment code can be identified easily. For comparison, code recognition was tried with both the recognition method by matching and the recognition method using four-direction histograms. The standard pattern dictionary for the matching method was created by generating a total of 160 images of the 8 types of code in the PostScript language, printing them at 1200 dpi, scanning them at 600 dpi, and extracting features from the resulting sample images.
FIG. 21 shows a result of a trial of a recognition method using a four-way histogram and matching. The recognition rate is the recognition rate for all sample images used for creating a dictionary, and both reach 100%. However, the processing time is faster in the recognition method using the histogram. The processing time for one form indicates the processing time until a correction interface for correcting a recognized character is called.
[0046]
FIG. 22 shows a dot texture entry frame created with the 2¹²-type code.
This code shape is considerably complicated, and if the code image is small, the constituent lines touch one another and even a person may be unable to identify the code. Codes were therefore produced at sizes at which the constituent lines do not touch and a person can identify them, they were identified by the matching-based decoding method, and a size that guarantees a recognition rate of 98% or more was determined. The drawing area at this size is 0.9 × 0.9 mm. With the minimum code interval of 0.2 mm, the arrangement area is 1.1 × 1.1 mm. This size can be reduced further if the printing accuracy and the scanner accuracy are increased. FIG. 22 is an example of an entry frame created with codes of this size. In this example, information is superimposed on the upper side of the entry frame.
[0047]
When the code shape is small and many lines are packed together, its features cannot be identified by the recognition method using histograms. To identify the code by the histogram method, a code larger than the size described above would be needed, and a larger code would affect the appearance of the form. The histogram method was therefore not used here, and the recognition method by matching and the recognition method by partial scanning were tried. The standard pattern dictionary for matching was created by generating a total of 69,632 images in the PostScript language, 17 for each of the 4,096 types of codes, printing them at 1200 dpi, scanning them at 600 dpi, and extracting features from the resulting sample images.
[0048]
FIG. 23 shows the result of code recognition. The recognition rate is a recognition rate for all samples used for creating a dictionary. The processing time for one form is the processing time until the correction interface is called. A recognition method based on partial scanning is better for both processing time and recognition rate.
[0049]
FIG. 24 shows a dot texture entry frame created with the 2⁸-type code.
As with the 2¹²-type code, if this code is too small, the constituent lines touch one another and even a person cannot identify it. Codes were therefore produced at sizes at which the constituent lines do not touch and a person can identify them, they were identified by the recognition method using matching, and a size that guarantees a recognition rate of 98% or more was determined. The drawing area at this size is 0.78 × 0.78 mm. With the minimum code interval of 0.2 mm, the arrangement area is 0.98 × 0.98 mm. This size can be reduced further if the printing accuracy and the scanner accuracy are increased. FIG. 24 shows an example of an entry frame created with code figures of this size. In this example, information is superimposed on the upper side of the entry frame.
[0050]
For this code as well, identifying features by the recognition method using histograms would require a size larger than the one described above. Therefore, the recognition method by matching and the recognition method by partial scanning were tried. The standard pattern dictionary for matching was created by generating a total of 1,792 images of the 256 types of codes in the PostScript language, printing them at 1200 dpi, scanning them at 600 dpi, and extracting features from the resulting sample images.
FIG. 25 shows the recognition experiment results. The processing time for one form is the processing time until the correction interface is called. The recognition rate was the recognition rate for all samples used to create the dictionary, and both reached 100%. However, the processing time is faster with the recognition method based on partial scanning.
[0051]
2.4.3 Comparison of code shapes
FIG. 26 is a comparison diagram of the recognition results for the various code shapes. The decoding time per bit is the time from when the information bar of an entry frame is passed to the decoder engine, which uses the fastest recognition method for the code shape in question, until the decoding result is obtained, divided by the amount of information embedded in the information bar. At a realistic size, the 2¹²-type and 2⁸-type codes are harder to cut out and recognize than the other codes. In addition, because these two code shapes are large, they differ from the other codes in that a person can identify the shape, and it does not look like a code. Furthermore, if a form created with these two codes is processed by the labeling method, which is the fastest of the methods for separating the entry frame from the handwriting described in the next section, the codes cannot be separated from small written marks (such as punctuation marks), which adversely affects character recognition. The entry frame and the handwriting can be separated by other methods, but the processing time for the separation then becomes longer. The 8-direction line segment code, on the other hand, is superior to the two-size code in both information capacity and decoding speed. The two-size code, for its part, is simple and makes it easy to build a system.
[0052]
3. Main flowchart of active form processing
FIG. 27 shows a flowchart of the active form processing system. Each process in FIG. 27 is described below. When the active form processing system starts processing, it first reads an active form image (data) (S101). Next, the active form processing system separates the entry frame and the handwriting, and detects the entry frame and the entry frame structure, such as the position of the information bar (S103). The active form processing system then reads the information superimposed on each entry frame according to the position of the information bar (S105). Details of steps S103 and S105 will be described later. The active form processing system stores data such as the detected position of the information bar and the entry frame structure in the storage unit at appropriate times.
[0053]
When the attribute in the superimposed information indicates that the handwritten entry consists of characters, the active form processing system performs offline character recognition on the handwritten character image in the entry frame according to that attribute and the character type (S107). The active form processing system corrects the offline recognition result through the correction interface (S109). The correction interface executes processing for correcting the recognized characters. The active form processing system outputs the entered content and the superimposed information extracted from the active form, for example, as a CSV file (S111). This output is called an instruction document. The form image is also stored in the storage unit in association with the instruction document. The active form processing system then ends the active form processing. Thereafter, other application systems process the information extracted from the active form according to the instruction document.
[0054]
4. Separation of entry frame and handwriting, and detection of entry frame and entry frame structure
4.1 Process flow
FIG. 28 shows a flowchart for separating the entry frame and the handwriting and for detecting the entry frame and the entry frame structure. The flowchart in FIG. 28 details step S103 of the flowchart shown in FIG. 27. Lines of entry frames can be detected even before the entry frame is separated from the handwriting, with the handwriting still included (the handwriting hinders detection of positions within a line, but not detection of the line itself). If line detection is performed before separation and only the image of each line is labeled (Algorithm 2), labeling of the space between lines can be omitted. The processing procedure is as follows.
[0055]
First, the active form processing system takes a histogram in the horizontal direction over the entire form image (data) and detects the image of each line (S201). Next, the active form processing system applies labeling to the image of each line to separate the entry frame and the handwriting (S203). Besides labeling, morphological methods and the median filter method can also be used to separate the entry frame from the handwriting. Labeling and the other methods will be described later.
Further, the active form processing system detects the position of each entry frame and its square positions from a histogram in the vertical direction over the entry-frame-only image of each line (S205). The active form processing system detects the position of the information bar of the entry frame from a histogram in the horizontal direction over the image of each entry frame (S207). When detecting lines and entry frames, the active form processing system may regard histogram values equal to or less than a predetermined threshold as 0, so that even noisy areas are treated as blank. In this case, the information loss prevention process for entry frame detection, described later, may be executed.
[0056]
4.2 Separation method of entry frame and handwriting
4.2.1 Labeling method
Labeling refers to attaching the same label to every pixel of each connected component of black pixels in a binary image. There are two labeling methods: one that calls a recursive function and one that scans the image back and forth.
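As an illustration only, labeling can be sketched as follows. This is a minimal sketch, not the implementation of the embodiment: it replaces the recursive function call with an explicit stack (to avoid deep recursion), assumes 8-connectivity, and represents the binary image as a list of lists with 1 for black. The function name `label_components` is hypothetical.

```python
def label_components(img):
    """Label 8-connected components of black (1) pixels.

    Returns (labels, count): labels has the same shape as img, with 0 for
    white pixels and 1..count for the components.
    Assumption: explicit-stack flood fill instead of a recursive function.
    """
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or labels[y][x]:
                continue  # white pixel, or black pixel already labeled
            count += 1
            labels[y][x] = count
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not labels[ny][nx]):
                            labels[ny][nx] = count
                            stack.append((ny, nx))
    return labels, count
```

Once every component has a label, properties such as its pixel count or bounding box can be collected per label and used to tell small entry frame dots from larger handwriting strokes.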
[0057]
4.2.2 Morphological methods
Contraction and expansion of a form image can each be performed by either of the following two methods.
For contraction, either of the following can be used: (a) the image is scanned and, when a white pixel is found, its 8-neighborhood or 4-neighborhood pixels are also made white; or (b) the image is scanned and, when a black pixel is found, its 8-neighborhood or 4-neighborhood is examined, and if at least one of those neighboring pixels is white, the black pixel is made white.
For expansion, either of the following can be used: (a) the image is scanned and, when a black pixel is found, its 8-neighborhood or 4-neighborhood pixels are also made black; or (b) the image is scanned and, when a white pixel is found, its 8-neighborhood or 4-neighborhood is examined, and if at least one of those neighboring pixels is black, the white pixel is made black.
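Method (b) of each operation can be sketched as below. This is a hedged illustration under stated assumptions: the image is a list of lists with 1 for black and 0 for white, pixels outside the image are simply not examined, and the names `erode`, `dilate`, `N4`, and `N8` are hypothetical.

```python
# Neighbor offsets (dy, dx) for the 8-neighborhood and the 4-neighborhood.
N8 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]
N4 = [(-1, 0), (0, -1), (0, 1), (1, 0)]


def _flip_if_neighbor_differs(img, target, neighbors):
    """Flip every pixel equal to `target` that has at least one in-image
    neighbor of the other value. Reads from img, writes to a copy."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            if img[y][x] != target:
                continue
            for dy, dx in neighbors:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and img[ny][nx] != target:
                    out[y][x] = 1 - target
                    break
    return out


def erode(img, neighbors=N8):
    """Contraction, method (b): a black pixel with a white neighbor turns white."""
    return _flip_if_neighbor_differs(img, 1, neighbors)


def dilate(img, neighbors=N8):
    """Expansion, method (b): a white pixel with a black neighbor turns black."""
    return _flip_if_neighbor_differs(img, 0, neighbors)
```

Writing into a copy rather than into the scanned image keeps one pass from feeding on its own output, so each call contracts or expands by exactly one pixel.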
[0058]
4.2.3 Median filter method
The median filter is a technique often used for image smoothing. It replaces the density value at the center of a region with the median of the density values in that region; for a 3 × 3 region, the nine density values are sorted in ascending (or descending) order and the fifth (middle) value becomes the new density value of the center pixel.
FIG. 29 is an explanatory diagram of the median filter. The original density value of the target pixel at the center is 22, and a 3 × 3 median filter is applied. Sorted in ascending order, the nine density values in the filter window are 22, 22, 23, 24, 24, 24, 25, 26, 27. Since the median is the fifth value, 24, the new density value of the target pixel is 24. In the case of a binary image, for a connected component of black pixels whose area (number of black pixels) is smaller than half the size of the median filter, the median is always the white pixel value no matter which black pixel of the component the filter is centered on. Therefore, this connected component is deleted.
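The example can be reproduced with a short sketch. The spatial arrangement of the nine values in `patch` is an assumption (the layout of FIG. 29 is not given in the text); only the set of values and the center value 22 come from the description. The function name `median_filter_3x3` is hypothetical, and border pixels are left unchanged for simplicity.

```python
def median_filter_3x3(img):
    """Apply a 3x3 median filter to the interior pixels of img.

    Border pixels are copied unchanged (a simplifying assumption).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = sorted(window)[4]  # fifth of nine values
    return out


# Hypothetical arrangement of the nine density values; center pixel is 22.
patch = [[24, 27, 23],
         [22, 22, 24],
         [26, 24, 25]]
new_center = median_filter_3x3(patch)[1][1]  # 24

# Binary case: a 2x2 black component (4 pixels, less than half of 9)
# is removed by the 3x3 median filter.
binary = [[0] * 6 for _ in range(6)]
for y, x in [(2, 2), (2, 3), (3, 2), (3, 3)]:
    binary[y][x] = 1
cleaned = median_filter_3x3(binary)
```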
[0059]
4.2.4 Separation of entry frame and handwriting
As the separation method of the entry frame and the handwriting, a separation method by labeling, a separation method by contraction and expansion, a separation method combining these, and a separation method by a median filter can be considered.
Separation of the entry frame and the handwriting was tried on forms using the two-size code and the 8-direction line segment code. The sample forms were filled in with a pen of normal thickness so that the separation methods could be compared on several kinds of entry frames. With the separation method based on contraction and expansion, many traces of handwriting remained in the entry frame image. To separate the entry frame and the handwriting cleanly with this method, the form must be filled in with a thicker pen. However, this increases the time required for contraction and expansion, and it also restricts the choice of writing instrument.
According to the result of the separation, the separation method by labeling that calls the recursive function is the fastest and there is no restriction on the handwritten thickness and the size of the entry frame dot. Therefore, a labeling method for calling a recursive function is effective as a method for separating an entry frame and a handwriting.
FIG. 30 shows separated entry frames and handwritten characters when the entry frame and handwriting overlap. It can be confirmed that the overlapping of the entry frame and the handwriting does not affect the handwriting separation.
[0060]
4.3 Detection of entry frame and entry frame structure
In order to detect the entry frame, there are a method using line segment detection and intersection detection (described later), a method of taking histograms in the vertical and horizontal directions as described below, and the like. The active form method is effective without depending on the entry frame detection method. However, in the following, since further technical development was performed by the method using the histogram, the detection of the entry frame using the histogram will be described.
[0061]
4.3.1 Detection by histogram
FIG. 31 is an explanatory diagram of detection of an entry frame and an entry frame structure using a histogram. By executing each process shown in FIG. 28, a histogram as shown in FIG. 31 is obtained.
First, as shown in FIG. 31A, a histogram in the horizontal direction is taken over the entire form image, and an entry frame is detected line by line according to the obtained histogram (corresponding to S201). Next, as shown in FIG. 31B, a histogram in the vertical direction is taken with respect to the detected row, and the position of the entry frame and the square of the entry frame are detected according to the obtained histogram (corresponding to S205). Next, as shown in FIG. 31C, a horizontal histogram is taken with respect to the detected entry frame, and the information bar position of the entry frame is detected according to the obtained histogram (corresponding to S207).
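The histogram-based detection can be sketched as follows. This is a simplified illustration, assuming a binary image held as a list of lists (1 for black) and a plain "greater than threshold" rule for finding runs; the names `projection` and `runs` are hypothetical, and "horizontal histogram" is taken to mean the count of black pixels in each row.

```python
def projection(img, per_row=True):
    """Black-pixel counts per row (horizontal histogram) or per column
    (vertical histogram)."""
    if per_row:
        return [sum(row) for row in img]
    return [sum(col) for col in zip(*img)]


def runs(hist, threshold=0):
    """Return [start, end) intervals where hist exceeds threshold."""
    out, start = [], None
    for i, v in enumerate(hist):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(hist)))
    return out


form = [[0, 0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1, 0],
        [1, 1, 0, 1, 1, 0],
        [0, 0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0, 0]]

# Horizontal histogram over the whole form: detect lines of entry frames.
lines = runs(projection(form, per_row=True))
# Vertical histogram within the first line: detect the entry frames in it.
top, bottom = lines[0]
frames = runs(projection(form[top:bottom], per_row=False))
```

The same two functions, applied once more inside a single entry frame, would give the horizontal histogram used to find the information bar.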
[0062]
4.3.2 Examination of detection method of entry frame structure
FIGS. 32 to 34 are explanatory diagrams of a method for detecting the entry frame structure.
As shown in FIG. 32, when the height and width of the entry frame are sufficiently large, the structure of the entry frame can be clearly distinguished from the vertical and horizontal histograms. The intermediate value between the maximum and minimum of the histogram is obtained and used as a threshold to detect the square positions of the entry frame and the position of the information bar. However, as shown in FIG. 33, when the width or height of the entry frame is small, the intermediate value cannot be obtained stably. In FIGS. 32 and 33 the entry frames have been enlarged or reduced for ease of viewing, so the difference in their sizes is hard to see at a glance, but it can be seen from the number of codes arranged in them. Furthermore, an entry frame that has been partly erased where it overlapped handwriting, as shown in FIG. 34, may give an incorrect result if it is processed as it is. Therefore, when the structure of the entry frame is identified, the dots of the entry frame are first expanded as shown in FIG. Since expanding the dots until they completely touch each other takes time, they may be expanded only to some extent, according to the dot size. The intermediate value of the histogram is then obtained for the expanded entry frame, and the structure of the entry frame can be identified easily. For an entry frame using the two-size code, the structure can be recognized without the expansion process because the dot size is small. In an entry frame using the 8-direction line segment code, the dot texture is sparse depending on the size and shape of the dots, so it is preferable to perform the expansion process in order to recognize the entry frame structure from the histogram.
[0063]
4.3.3 Reduction of processing time considering blanks
FIG. 35 compares, for six kinds of forms using the two-size code and the 8-direction line segment code, a method that does not consider blanks in the form (Algorithm 1) and a method that does (Algorithm 2). Since there are blank spaces between the entry frames in a form, the processing shown in the flowchart described above shortens the processing time per form by skipping the labeling of blank areas.
The process of Algorithm 2 is the same as the process according to the present embodiment shown in FIG. The processing procedure of Algorithm 1 is as follows. First, the entire form image is labeled to separate the entry frame and the handwriting. Next, lines are detected from a horizontal histogram of the image containing only the entry frames. Then the position of each entry frame and its square positions are detected from a vertical histogram of the entry frame image of each line. The position of the information bar of the entry frame is detected from a horizontal histogram of the entry frame image of each line. Algorithm 1 labels the entire form, whereas Algorithm 2 labels only the detected lines. Regardless of the density of the entry frames, Algorithm 2, that is, the method that takes the space between lines into account, is faster. The accuracy of the results is the same for both algorithms.
[0064]
4.4 Information loss prevention processing when detecting entry boxes
FIG. 36 is an explanatory diagram of information loss prevention processing when an entry frame is detected.
When the position of the entry frame is detected from the horizontal or vertical histogram, histogram values equal to or lower than a predetermined threshold are regarded as 0 so that noise does not interfere, and even noisy areas are treated as blank. In FIG. 36 there is a lot of noise above and outside the entry frame, but thanks to the threshold processing the position f of the entry frame can be detected without being affected by it. However, because values at or below the threshold are regarded as 0, information at or below the threshold is also ignored at the boundary of the entry frame, which adversely affects reading of the information superimposed on the entry frame. To prevent this loss of information, the position f of the entry frame is first detected by the threshold processing, and the histogram is then traced from f toward the outside of the entry frame (upward in the example of FIG. 36). The start or end position of the entry frame is set to the position where the histogram value changes from a value greater than 0 to 0 or, if the histogram value has not become 0 even after the length traced from the initial position f exceeds the threshold, to the position shifted from f by the threshold. This makes it possible to detect the start or end position of the entry frame correctly while still avoiding the noise outside the entry frame. FIG. 36 shows an example of row detection (scanning up and down), but the same applies to column detection (scanning left and right).
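The boundary correction can be sketched for the start (upper or left) boundary as follows. This is a minimal sketch under assumptions: the text uses "threshold" both for the histogram value and for the traced length, and the sketch treats these as two separate parameters, `value_threshold` and `max_trace`, which are hypothetical names; the function name `refine_start` is likewise hypothetical.

```python
def refine_start(hist, value_threshold, max_trace):
    """Return the corrected start index of an entry frame, or None.

    f is the first index whose histogram value exceeds value_threshold.
    The histogram is then traced outward (toward index 0) from f until the
    value becomes 0; if it is still nonzero after max_trace steps, the
    boundary is set max_trace positions before f.
    """
    f = next((i for i, v in enumerate(hist) if v > value_threshold), None)
    if f is None:
        return None  # no entry frame found
    for step in range(1, max_trace + 1):
        i = f - step
        if i < 0 or hist[i] == 0:
            return i + 1  # last nonzero position before the blank
    return f - max_trace


# Isolated noise at index 0 is ignored; the weak rows 3 and 4 just outside
# the thresholded position f = 5 are recovered.
start = refine_start([1, 0, 0, 2, 3, 9, 9, 9], value_threshold=5, max_trace=5)

# Continuous noise: the trace stops after max_trace positions.
capped = refine_start([1, 1, 1, 1, 1, 9, 9], value_threshold=5, max_trace=2)
```

The end boundary can be handled the same way by tracing in the opposite direction, for example by applying the function to the reversed histogram and mapping the index back.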
[0065]
5. Decoding
5.1 Process flow
Next, the process (decoding) in step S105 in FIG. 27 will be described.
FIG. 37 shows a flowchart for decoding large and small two types of codes. The flowchart shown in FIG. 37 is decoding for a form in which information records are arranged in the horizontal direction of the information bar.
First, the active form processing system takes histograms in the horizontal and vertical directions over the information bar of the entry frame detected in step S103, and cuts out the dots of each line (S301). Next, the active form processing system measures the length of each cut-out dot from its vertical histogram and determines whether the information indicated by the dot is 1 or 0. For example, the active form processing system assigns 1 to a long dot and 0 to a short dot. Then, the active form processing system undoes the shift of each line by a predetermined number of bits, for example, 2 bits (S305). The active form processing system obtains multiple information records from the information bar of the entry frame (S307). For example, the active form processing system can obtain as many information records as possible from the information bar of the entry frame, or it may obtain a predetermined number of information records. The active form processing system obtains the superimposed information record by taking a majority vote over the values of the information records (S309). Errors can be recovered by taking the majority vote. Although the two-size code has been described above, the same processing can be applied to forms with other codes arranged in the horizontal direction.
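The last steps of this flow can be sketched as below. The sketch assumes that dot lengths have already been measured in pixels, that a single cut-off length separates long from short dots, and that the majority vote is taken bit position by bit position over records of equal length; none of these details is specified in the text, and the names `dots_to_bits` and `majority_record` are hypothetical.

```python
from collections import Counter


def dots_to_bits(lengths, cut):
    """Map measured dot lengths to bits: long dot -> 1, short dot -> 0."""
    return [1 if n > cut else 0 for n in lengths]


def majority_record(records):
    """Per-position majority vote over equal-length records.

    Assumption: voting is done bit by bit; with an odd number of records
    there are no ties.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*records)]


bits = dots_to_bits([6, 3, 6, 3], cut=4)
# The second copy has one bit damaged; the vote restores it.
recovered = majority_record([[1, 0, 1, 0],
                             [1, 0, 0, 0],
                             [1, 0, 1, 0]])
```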
[0066]
Next, decoding of the 8-direction line segment code will be described. When the 8-direction line segment codes are cut out, errors are prevented and detected so that decoding is performed efficiently. The description assumes the code size given above. In addition, it has been confirmed by experiment that decoding is possible even when the drawing area measures 0.35 × 0.35 mm, 0.45 × 0.45 mm, 0.5 × 0.5 mm, or 0.6 × 0.6 mm. The larger the code, the higher the recognition rate, but the smaller the code, the better the form looks. For decoding the 8-direction line segment code, the required space between dots is 0.2 mm, and forms can be created with any desired code size. The code size is detected automatically, and decoding is performed in accordance with the detected size. Further, since the 8-direction line segment code uses variable-length information records separated by delimiters, with a parity code for each information record, errors in the length and in the content of an information record can be detected. Details of these processes are described below.
[0067]
FIG. 38 shows a flowchart of decoding of the 8-way line segment code. The flowchart shown in FIG. 38 is for decoding a form in which a code, a parity code, and a delimiter are arranged in the vertical direction of the information bar.
First, the active form processing system sets an information record (37-01). For example, the information bar detected in step S103, or a part of it, is read from the storage unit. Next, the active form processing system obtains the delimiter at the head of the information record. Specifically, the following steps 37-02 to 37-06 are executed. First, the active form processing system determines whether the end of the image has been reached (37-02), for example by determining whether there is image data still to be cut out. If the end of the image has been reached (37-02), the active form processing system proceeds to step 37-17. Otherwise (37-02), it cuts out a code figure (37-03). For example, the active form processing system finds the breaks in the vertical and horizontal histograms and cuts out the code figure based on the resulting width. The active form processing system then determines whether a cut-out error has occurred (37-04). Conceivable cut-out errors include, for example, a code that cannot be cut out because it was erased where handwriting overlapped it, and a cut-out code that is not at the expected position because other codes were erased. In the case of a cut-out error (37-04), the active form processing system proceeds to step 37-14. If there is no cut-out error (37-04), it recognizes the cut-out code figure (37-05). The active form processing system determines whether the recognized code is the head delimiter (37-06). If it is, the active form processing system proceeds to step 37-07; if it is not, this is regarded as an error and the active form processing system proceeds to step 37-14.
[0068]
In the process of steps 37-14 to 37-16, the next terminal delimiter is found. First, the active form processing system cuts out the figure of the next code (37-14). The active form processing system recognizes the cut out code (37-15). The active form processing system determines whether the recognized code is a terminal delimiter (37-16). Note that the same symbol can be used for the head delimiter and the end delimiter. If the recognized code is not the terminal delimiter, the active form processing system returns to Step 37-14. The active form processing system repeats steps 37-14 to 37-16 until it finds a delimiter. On the other hand, in the case of a terminal delimiter, the process proceeds to step 37-01.
On the other hand, the active form processing system obtains the data of the information record when the code recognized in step 37-06 is the head delimiter. Specifically, the following steps 37-07 to 37-13 are executed.
[0069]
First, the active form processing system determines whether the end of the image has been reached (37-07). If so (37-07), the active form processing system proceeds to step 37-17. Otherwise (37-07), the active form processing system cuts out the figure of the next code (37-08). The active form processing system determines whether a cut-out error has occurred (37-09). In the case of a cut-out error (37-09), the active form processing system proceeds to step 37-14. If there is no cut-out error (37-09), it recognizes the cut-out code figure (37-10). The active form processing system determines whether the recognized code is a terminal delimiter (37-11). If the recognized code is not a terminal delimiter (37-11), the active form processing system stores the recognition result in the storage unit in association with the information record that was set (37-12), and returns to step 37-07. The active form processing system repeats the processing from step 37-07 onward, recognizing codes until the terminal delimiter is found, and thereby obtains the data of the information record. If the recognized code is a terminal delimiter (37-11), the active form processing system ends the processing of that information record (37-13), returns to step 37-01, and processes the next information record.
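The control flow of steps 37-01 to 37-16 can be sketched as a loop over already-recognized codes. This is an illustrative simplification, not the flowchart itself: cut-out and recognition are collapsed into a list `codes` in which `None` stands for a cut-out error, the delimiter symbols `"D"` are placeholders, and a record left unfinished at the end of the image is simply dropped. The function name `read_records` is hypothetical.

```python
def read_records(codes, head="D", tail="D"):
    """Collect the data of each information record delimited by head/tail.

    codes: recognized code symbols, left to right; None marks a cut-out error.
    Records containing an error are skipped up to the next delimiter.
    """
    records, i, n = [], 0, len(codes)
    while i < n:                                  # 37-02: not at end of image
        code = codes[i]                           # 37-03 to 37-05
        i += 1
        if code == head:                          # 37-06: head delimiter
            data, closed, error = [], False, False
            while i < n:                          # 37-07
                code = codes[i]                   # 37-08, 37-10
                i += 1
                if code is None:                  # 37-09: cut-out error
                    error = True
                    break
                if code == tail:                  # 37-11: terminal delimiter
                    closed = True
                    break
                data.append(code)                 # 37-12
            if closed:
                records.append(data)              # 37-13
                continue
            if not error:
                break                             # image ended inside a record
        # 37-14 to 37-16: skip codes up to and including the next delimiter.
        while i < n and codes[i] != tail:
            i += 1
        i += 1
    return records


codes = ["D", 1, 2, "D", "D", 3, None, 4, "D", "D", 5, 6, "D"]
records = read_records(codes)  # the record containing None is skipped
```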
[0070]
In step 37-17, the active form processing system ends the process of obtaining the multiple superimposed information records, and proceeds to step 37-18. The active form processing system checks the length of each information record and deletes those whose length is not within the possible range (for example, 45 to 207 bits in the example shown in FIG. 12) (37-18). The active form processing system determines the information record length by taking a majority vote over the lengths of the information records (37-19). The active form processing system checks the length of each information record against this length and deletes the records whose length is incorrect (37-20). The active form processing system checks the parity code of each information record and deletes the incorrect records (37-21). The active form processing system obtains the superimposed information record of the entry frame by taking a majority vote over the values of the remaining information records (37-22). The active form processing system then stores the obtained information record in the storage unit as appropriate, and ends the process.
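Steps 37-18 to 37-22 can be sketched as a filtering pipeline. The sketch makes several assumptions that the text does not confirm: each record is a list of bits, the parity check is even parity over the whole record (the actual parity code layout is not described here), and the final vote is taken bit position by bit position. The name `select_record` and its parameters are hypothetical.

```python
from collections import Counter


def select_record(records, min_len=45, max_len=207, parity_ok=None):
    """Pick the superimposed record from redundant copies, or None."""
    if parity_ok is None:
        # Assumption: even parity over all bits of the record.
        parity_ok = lambda r: sum(r) % 2 == 0
    # 37-18: drop records whose length is outside the possible range.
    cands = [r for r in records if min_len <= len(r) <= max_len]
    if not cands:
        return None
    # 37-19: majority vote on the record length.
    length = Counter(len(r) for r in cands).most_common(1)[0][0]
    # 37-20: drop records with any other length.
    cands = [r for r in cands if len(r) == length]
    # 37-21: drop records that fail the parity check.
    cands = [r for r in cands if parity_ok(r)]
    if not cands:
        return None
    # 37-22: majority vote over the remaining records, bit by bit.
    return [Counter(col).most_common(1)[0][0] for col in zip(*cands)]


# Toy example with a small length range so that the records stay short.
copies = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1], [1] * 10]
chosen = select_record(copies, min_len=3, max_len=8)
```

In the toy example the two-bit and ten-bit copies fail the range check, `[1, 0, 0, 0]` fails the assumed parity check, and the two remaining copies agree.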
[0071]
5.2 Detection of code size and number of code lines in information bar
FIG. 39 is an explanatory diagram for calculating the size of the drawing area. As an example, four lines of code are arranged in the horizontal bar shown in FIG. In order to cut out the code, the horizontal bar is used to detect the code size and the number of code lines in the information bar.
First, a vertical histogram is taken with respect to the horizontal bar, and the center position of each code graphic string is detected as shown in FIG. Next, the distance between the centers of each code figure sequence is obtained, and an average is obtained for these distances. This result is the side length of the square of the arrangement area of the code figure. If the size of the dot interval of 0.2 mm (4 pixels by 600 dpi scan) is subtracted from the side length of the arrangement area, the side length of the drawing area square can be obtained.
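The calculation can be sketched as follows, assuming that the center of each code figure column is taken as the midpoint of a nonzero run in the vertical histogram and that the 4-pixel dot interval of a 600 dpi scan applies. The function name `drawing_area_side` and the parameter `gap_px` are hypothetical.

```python
def drawing_area_side(col_hist, gap_px=4):
    """Estimate the side length (in pixels) of the drawing area square.

    col_hist: vertical histogram of the horizontal bar (black pixels per
    image column). The mean distance between column centers is the side of
    the arrangement area; subtracting the dot interval gives the drawing area.
    """
    centers, start = [], None
    for i, v in enumerate(list(col_hist) + [0]):  # sentinel closes last run
        if v > 0 and start is None:
            start = i
        elif v == 0 and start is not None:
            centers.append((start + i - 1) / 2)   # midpoint of the run
            start = None
    if len(centers) < 2:
        raise ValueError("need at least two code figure columns")
    pitches = [b - a for a, b in zip(centers, centers[1:])]
    arrangement_side = sum(pitches) / len(pitches)
    return arrangement_side - gap_px


# Three columns, each 8 pixels wide, separated by 4 blank pixels.
side = drawing_area_side(([3] * 8 + [0] * 4) * 3)
```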
[0072]
In the form of the present embodiment, the number of code lines in the horizontal bar is set equal to the number of code lines in the information bar of the entry frame. Detecting the number of code lines in the horizontal bar therefore gives the number of code lines in the information bar of the entry frame. If the form is inclined, it is difficult to detect the number of code lines over the entire horizontal bar. Therefore, based on the vertical histogram taken over the horizontal bar when the code size was detected, a horizontal histogram is taken over only a few columns of code figures, and the number of code lines in the information bar of the entry frame is detected from it. Instead of detecting the number of code lines in the horizontal bar, information indicating the number of code lines may be superimposed on the codes constituting the horizontal bar and obtained by recognizing those codes.
[0073]
5.3 Extraction of code figure
Next, extraction of code figures (for example, steps 37-03, 37-08, and 37-14 in FIG. 38) will be described.
[0074]
5.3.1 Necessity of a method for cutting out code figures by histogram
FIG. 40 shows an example of code cutout based on the code size.
If the code size can be detected, one conceivable method is to cut out the codes at a fixed length equal to the code size. However, because of various errors (for example, the accumulation of small errors), the code figures sometimes cannot be cut out correctly at a fixed length. As shown in FIG. 40, when the codes of each column are cut out with the code size BC as a fixed length, and the result is simulated by drawing vertical lines, the first several columns are cut out correctly, but the error grows toward the later columns, which may not be cut out correctly. The active form processing system according to the present embodiment therefore cuts out the code figures using histograms rather than a fixed code size, thereby realizing accurate code cut-out.
[0075]
5.3.2 Extraction direction of code figure
In the basic cut-out direction, a horizontal histogram is first taken over the image of the information bar and each row is cut out; a vertical histogram is then taken over the image of each row and the figure of each code is cut out. For an information bar with no inclination, the code figures can easily be cut out in this direction. However, as shown in FIG. 41, for an information bar scanned at an angle, they cannot. Even in that case, the number of rows is small compared with the number of columns, so the columns overlap little in the vertical histogram; using this property, the figure of each code can be cut out correctly even if the information bar is slightly inclined. For example, the active form processing system first takes a vertical histogram over the image of the information bar and cuts out each column from the left to the right of the information bar. Next, it takes a horizontal histogram over each column image and cuts out the figure of each code.
[0076]
5.3.3 Detection and processing of information records with overlapping handwriting
Next, cut-out errors (corresponding to 37-03, 37-08, and 37-14 in FIG. 38) will be described. Where handwriting overlaps the entry frame, the overlapped part is erased from the separated entry frame. The method adopted here detects this efficiently and does not process information records that contain errors. In FIG. 38, when an error is detected while a code is being cut out, the process moves from 37-04 or 37-09 to 37-14, 37-15, and 37-16, and codes are skipped until the next delimiter is found. In other words, an information record containing an error is not processed. In the information bar image, the code figures are cut out column by column from left to right, and when the distance between the previous column and the next column is larger than the expected width, it is judged that handwriting overlapped the code figures and that code figures are missing.
[0077]
FIG. 42 is an explanatory diagram of detection and processing of information records in which handwriting overlaps. FIG. 42 shows the leftmost part of the information bar of an entry frame, with each code drawing area represented by a small virtual square. The dot texture of the information bar is formed by arranging the drawing areas uniformly at the minimum interval of 0.2 mm (4 pixels). The side length BB of the drawing area square is obtained from the horizontal bar by the method described above. The center position S of a column plus half the drawing area side length, S + BB / 2, is the maximum right position R2 of the column, and the actual right position of the column in the vertical histogram is R1.
[0078]
When the code figures of a column are cut out, the distance d between the right position R1 of that column and the reference position a is obtained, and whether a handwriting overlap error has occurred is determined from d. First, how the reference position a is obtained will be described. In the information bar image, the code figures are cut out column by column from left to right. The initial value of the reference position a is the leftmost position of the information bar. Each time a column of code figures is cut out, a is updated to the center position S of that column plus half the drawing area side length, S + BB / 2. That is, except for its initial value, the reference position a always points to the maximum right position R2 of the most recently cut-out column. Since the reference position a is updated from the center position of each column, the initial position need not be exact even if it is affected by noise or the like. The specific process for detecting a handwriting overlap error using the reference position a is described below.
[0079]
First, every time a new column of code figures is cut out, the active form processing system obtains the distance d between the right position R1 of that column and the reference position a (for example, corresponding to step 37-03). Next, when the distance d exceeds the drawing area side length BB by 6 pixels or more (the 4-pixel interval between drawing areas plus a 2-pixel margin) (for example, corresponding to step 37-04), the active form processing system regards this as an overlap error (cut-out error) and ignores the subsequent code figures until a delimiter is found (corresponding to steps 37-14 to 37-16). Otherwise, the active form processing system cuts out and recognizes the codes of the column (for example, corresponding to steps 37-05 and 37-10).
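The check on the distance d can be sketched as follows. The sketch only flags the columns at which an overlap error would be raised; skipping to the next delimiter is left out. It assumes that each cut-out column is given as a pair (center S, right edge R1) in pixels, and the names `find_overlap_errors`, `gap_px`, and `margin_px` are hypothetical.

```python
def find_overlap_errors(columns, bar_left, bb, gap_px=4, margin_px=2):
    """Flag columns that follow a gap wider than expected.

    columns: (S, R1) pairs, left to right, where S is the column center and
    R1 the actual right edge in the vertical histogram.
    bar_left: leftmost position of the information bar (initial reference a).
    bb: side length BB of the drawing area square.
    """
    a = bar_left
    flags = []
    for s, r1 in columns:
        d = r1 - a                                 # distance to reference a
        flags.append(d >= bb + gap_px + margin_px)  # BB + 6 pixels or more
        a = s + bb / 2                              # new reference: R2
    return flags


# BB = 8, pitch = 12. The column that should be centered at 28 was erased by
# handwriting, so the column centered at 40 is flagged.
flags = find_overlap_errors([(4, 8), (16, 20), (40, 44)], bar_left=0, bb=8)
```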
[0080]
5.3.4 Errors in code figures
FIG. 43 is a diagram illustrating an example of an error in a code figure.
As shown in FIG. 43, a code figure may come into contact with noise, so that its shape and size become abnormal. Examples are a horizontal size error caused by horizontal connection, a vertical size error caused by vertical connection, a vertical size error caused by noise, and a horizontal size error caused by an information bar position error. To deal with these abnormalities, a cut-out figure that is smaller than the expected code figure is ignored as noise. This prevents errors such as minute noise being attached to a figure and connected with it, or minute noise being processed as a code figure.
[0081]
When the code figures of a column are cut out, the width of the column is checked. If the width exceeds the drawing area width BB + 4 pixels, a horizontal size error of the code figure is determined, and the column is divided and cut out at a width of BB + 4 from the reference position a. The 8-direction line segment code is composed of line segments. As described above, the line thickness is 0.1 mm, which corresponds to about 3 pixels in a 600 dpi scan. Allowing for error, the code line thickness threshold is set to, for example, 5 pixels. For a code figure to be recognizable, the width BB of the drawing area must always be larger than the line thickness, so the width and height of a code cannot both be smaller than the 5-pixel line thickness threshold. Accordingly, when a figure is cut out for each code, its width and length are checked. If both are 5 pixels or less, a size error of the code figure is determined and the figure is ignored. If the length of the code figure exceeds BB + 4, a vertical size error is determined, and the code figure is divided and cut out from its top at a length of BB + 4.
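The size checks above can be summarized in a short sketch. It is a simplification under stated assumptions: the text checks the column width and the individual code height at different stages, whereas this hypothetical function takes both at once and only returns a label.

```python
# Minimal sketch of the code figure size checks (function name is hypothetical).
LINE_THICKNESS_THRESHOLD = 5  # pixels, the example threshold given in the text


def classify_figure_size(width, height, bb):
    """Label a cut-out figure as noise, a size error, or a plausible code."""
    limit = bb + 4  # drawing area side length plus the 4-pixel interval
    if width <= LINE_THICKNESS_THRESHOLD and height <= LINE_THICKNESS_THRESHOLD:
        return "noise"  # too small to be a code figure, so it is ignored
    if width > limit:
        return "horizontal_size_error"  # split at BB + 4 from reference a
    if height > limit:
        return "vertical_size_error"  # split at BB + 4 from the top
    return "ok"
```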
[0082]
In this way, errors such as minute noise being attached to or connected with code figures, or being processed as a code figure, can be prevented.
5.3.5 Handling code figure damage caused by information bar position detection errors
FIGS. 44 to 46 are explanatory views (1) to (3) of the processing for handling code figure damage caused by position detection errors of the information bar.
When the position of the information bar in the entry frame is detected, the boundary b between the detected information bar (for example, the one detected in step S207 described above) and the remaining part may differ from the boundary of the correct information bar, as shown in FIG. 44. If the code figures are cut out and decoded according to the boundary line b, the part of a code figure that protrudes beyond the detected information bar is excluded from recognition, which may lead to misrecognition. To avoid this, the active form processing system can execute the code cutout process (corresponding to steps 37-03, 37-08, and 37-14 in FIG. 38) and the cutout error process as follows.
[0083]
Using the detected information bar boundary line b, the active form processing system takes a vertical histogram of the information bar image over, for example, the data from the topmost code to the boundary line b, and detects the position of each code column from left to right. Next, as shown in FIG. 45, the system scans a horizontal histogram of the code image of each column and cuts out each code. As shown in FIG. 45, the upper information bar of the entry frame is scanned from top to bottom, whereas the lower information bar is scanned from bottom to top. From this stage onward, the boundary line b is no longer used.
[0084]
Each time the active form processing system cuts out one code in a column, it checks the number of codes cut out so far and the scanned length L in that column. Let N be the number of code rows of the information bar obtained from the horizontal bar. If the number of codes in the column has reached N and/or the scanned length L has reached the length corresponding to N rows, cutting of the column ends. Otherwise, scanning of the histogram continues and the next code is cut out. The end is also detected by scanned length so that cutting does not run past the area of the information bar when a code figure overlaps handwriting and is lost.
When cutting of a column is completed, the active form processing system checks the number of codes in the column. If it has not reached N, this is regarded as a code count error (the cutout error shown in FIG. 38), and the subsequent code figures are ignored until a delimiter is found.
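The termination rule and the code count check can be sketched as follows. This is an illustration under assumptions: the code spans are taken as already found from the horizontal histogram, and the parameter names (`code_spans`, `n_rows`, `max_length`) are hypothetical.

```python
# Minimal sketch of per-column code cutting (all names are hypothetical).
def cut_codes_in_column(code_spans, n_rows, max_length):
    """Cut codes along the scan direction until N codes or the length limit.

    code_spans: (start, end) offsets of candidate codes, in scan order.
    n_rows:     N, the number of code rows in the information bar.
    max_length: scanned length that corresponds to N rows.
    Returns the cut spans and whether the code count check passed.
    """
    cut = []
    for start, end in code_spans:
        cut.append((start, end))
        scanned = end  # scanned length L after this cut
        if len(cut) >= n_rows or scanned >= max_length:
            break
    return cut, len(cut) == n_rows
```

A column that returns `False` as the second value would be handled as a code count error, so the codes that follow are ignored until a delimiter is found.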
[0085]
Because the boundary line b is not used in the cutout stage, a code figure that protrudes beyond the detected information bar boundary can still be recognized correctly. The result of cutting out code figures by this method is shown in FIG. 46, in which frames are drawn around the cut-out codes for convenience. In the information bar at the top of the entry frame, the codes are cut out correctly without being affected by the error in the boundary line b.
[0086]
5.4 Recognition of 8-way line segment code
FIG. 47 is an explanatory diagram of the recognition of the 8-direction line segment code.
Next, code figure recognition (corresponding to 37-05, 37-10, and 37-15 in FIG. 38) is described.
Once the code size has been detected, a decoding method suited to that size can be applied. Let BB be the side length of the square code drawing area. Histograms in four directions are taken for the code figure. As shown in FIG. 47, the widths of the histograms in the four directions are denoted l, x, s, and h, respectively. Allowing for error, the code line thickness threshold is set to 5 pixels.
[0087]
FIG. 48 shows an algorithm for code recognition. FIG. 49 shows an example of a program for recognizing codes.
First, the active form processing system determines whether h ≤ 5 (S501). If yes, it sets code = 0; if no, it determines whether h ≥ BB-2 (S502), moving to step S503 if yes and to step S508 if no. In step S503, the system determines whether x ≥ BB-2, moving to step S504 if yes and to step S506 if no. In step S504, the system determines whether s ≤ 5. If yes, it sets code = 2; if no, it determines whether l ≤ 5 (S505) and sets code = 6 if yes and code = 8 if no.
[0088]
In step S506, the active form processing system determines whether x > 5. If no, it sets code = 4; if yes, it determines whether s < BB-2 (S507) and sets code = 3 if yes and code = 5 if no.
In step S508, the active form processing system determines whether s < BB-2 and sets code = 1 if yes and code = 7 if no.
In addition to the above determination, an appropriate determination can be made according to the size of the code and the thickness of the line. In addition to using the four-way histogram, the code can also be recognized by the partial scanning and matching described above.
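The decision steps S501 to S508 described above can be written as a small function. This is a sketch that follows the text only; it is not the program of FIG. 49, and the function name and the `thick` parameter (the 5-pixel line thickness threshold) are assumptions.

```python
# Minimal sketch of the S501-S508 decision tree (function name is hypothetical).
def recognize_code(l, x, s, h, bb, thick=5):
    """Map the four histogram widths l, x, s, h to a code value 0 to 8."""
    if h <= thick:                       # S501
        return 0
    if h >= bb - 2:                      # S502: yes branch
        if x >= bb - 2:                  # S503: yes branch
            if s <= thick:               # S504
                return 2
            return 6 if l <= thick else 8    # S505
        if x <= thick:                   # S506: "x > 5" answered no
            return 4
        return 3 if s < bb - 2 else 5    # S507
    return 1 if s < bb - 2 else 7        # S508
```

For example, with BB = 30, widths h = 30, x = 30, and s = 3 follow S501, S502, S503, and S504 and give code 2.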
[0089]
5.5 Detection and processing of information record length errors
Next, detection and processing of information record length errors (corresponding to 37-18, 37-19, and 37-20 in FIG. 38) will be described.
The 8-direction line segment code allows a length error of an information record to be detected by means of the delimiter. As described in the coding method above, the length of an information record in the 8-direction line segment code is variable, but the number of code figures including the parity symbol is 16 to 70 (an information record of 45 to 207 bits). If a delimiter is misrecognized during code figure recognition, the length of the information record may fall outside the range of 16 to 70. In that case, the information up to the next delimiter is skipped and the affected information records are deleted (step 37-18).
[0090]
When all codes in the information bar have been processed and recognized, a plurality of records is obtained. If these records do not all have the same length, a majority vote is taken over the lengths, and the length shared by the largest number of records is taken as the correct length (corresponding to step 37-19). The length of each information record is then checked, and any record whose length differs from the length obtained by the majority vote is deleted. The parity check described next is performed on the remaining information records.
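The range check and the majority vote over record lengths can be sketched as follows. The function name is hypothetical, and how ties between lengths are resolved is not stated in the text; this sketch simply keeps the length that was seen first.

```python
# Minimal sketch of record length filtering (function name is hypothetical).
from collections import Counter


def filter_by_majority_length(records, min_len=16, max_len=70):
    """Drop records outside 16 to 70 codes, then keep the majority length."""
    in_range = [r for r in records if min_len <= len(r) <= max_len]
    if not in_range:
        return []
    # Assumption: on a tie, the length encountered first wins.
    majority_len, _ = Counter(len(r) for r in in_range).most_common(1)[0]
    return [r for r in in_range if len(r) == majority_len]
```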
[0091]
5.6 Checking parity code
The parity code is checked for each information record (corresponding to 37-21 in FIG. 38). For example, for each information record, it is checked whether the number of 1s at the same bit position is even. If so, the information record is judged to be correct; if not, it is judged to be incorrect and is deleted.
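One possible reading of this check is an even parity test per bit position across all symbols of a record. The sketch below illustrates that reading only. It assumes that each code figure has already been converted to a group of bits (3 bits per figure would be consistent with 16 to 70 figures carrying 45 to 207 bits) and that the parity symbol is included in the list; the function name is hypothetical.

```python
# Minimal sketch of an even parity check per bit position (an assumed reading).
def parity_ok(symbols):
    """symbols: equal-length bit tuples of one record, parity symbol included."""
    for position_bits in zip(*symbols):
        if sum(position_bits) % 2 != 0:
            return False  # odd number of 1s at this bit position
    return True
```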
[0092]
5.7 Acquisition of entry frame overlay information record
A bitwise majority vote is taken over the information records obtained by the method described above (corresponding to 37-22 in FIG. 38). In this way, highly reliable data can be obtained for each entry frame.
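The bitwise majority vote can be illustrated as follows, assuming the surviving records are equal-length lists of bits. The function name is hypothetical, and the tie rule (a tie yields 0) is an assumption, since the text does not state one.

```python
# Minimal sketch of a bitwise majority vote (function name is hypothetical).
def majority_vote_bits(records):
    """Return one record whose every bit is the majority over all records."""
    n = len(records)
    # Assumption: a bit is 1 only when strictly more than half the records say 1.
    return [1 if 2 * sum(column) > n else 0 for column in zip(*records)]
```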
[0093]
6. How to increase email address recognition rate
FIG. 50 shows an entry frame for increasing the recognition rate of mail addresses.
Mail address notation differs between mail servers, as do the permitted character types, but the commonly used format is the domain address format, whose character types are "@", alphanumeric characters, "." (period), "-" (hyphen), and "_" (underscore). In active form processing, when the information superimposed on the entry frame indicates that the handwritten content is a mail address, the recognition rate can be increased by limiting recognition to these character types.
[0094]
However, these character types generally have a low recognition rate in offline character recognition, and the recognition rate of mail addresses depends on it. Even when the character types are limited, uppercase letters, lowercase letters, and numerals are easily confused with one another. Moreover, because a mail address is a character string without linguistic meaning, context processing, which would otherwise raise the recognition rate, cannot be applied. This is a further reason why the mail address recognition rate is hard to raise. Nevertheless, accurate mail address recognition is very important for a form processing system, because a misrecognized mail address can lead to a serious mistake. The main cause of mail address misrecognition is confusion between similar characters; if this can be avoided, the recognition rate can be increased.
[0095]
As shown in FIG. 50, to limit the character type of the mail address, another entry frame is provided, for example, below the mail address entry frame in the active form. In this frame, the writer enters the character type corresponding to each character of the mail address above. Considering both ease of recognition and ease of entry, the following symbols are used, for example.
(1) ￣: uppercase English letters (may be written overlapping the frame)
(2) _: lowercase English letters (may be written overlapping the frame)
(3) =: numerals (may be written overlapping the frame)
(4) Blank: Other
[0096]
These symbols can be recognized easily and reliably by taking histograms in the vertical and horizontal directions, or by matching against the three kinds of symbols within each square. This avoids the mutual confusion between the character types of the mail address and raises the recognition rate. As shown in FIG. 50, a symbol can be drawn continuously across successive squares, so the character type can be indicated per character string rather than per character, which simplifies entry. For recognition, only the inside of each square needs to be examined. The information record of the entry frame for limiting the character type can include appropriate data indicating that the above symbols for improving mail address recognition are entered as character types.
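How a recognized symbol could narrow the candidate characters is sketched below. The mark names, the function name, and the fallback to the unrestricted candidates are assumptions made for illustration; the "blank" class is taken to be the remaining mail address characters "@", ".", "-", and "_".

```python
# Minimal sketch of per-character type restriction (names are hypothetical).
import string

CLASS_BY_MARK = {
    "overline": string.ascii_uppercase,   # symbol (1): uppercase letters
    "underline": string.ascii_lowercase,  # symbol (2): lowercase letters
    "double": string.digits,              # symbol (3): numerals
    "blank": "@.-_",                      # symbol (4): other characters
}


def restrict_candidates(candidates, mark):
    """Keep only the recognition candidates allowed by the character type mark."""
    allowed = CLASS_BY_MARK[mark]
    filtered = [c for c in candidates if c in allowed]
    # Assumption: if nothing survives, fall back to the original candidates.
    return filtered or candidates
```

For example, the similar candidates "O", "0", and "o" reduce to "0" alone when the square below the character carries the numeral mark.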
[0097]
7. Prototype of active form processing system
An active form processing system was prototyped using the technology described above. The outline will be described below.
[0098]
7.1 Start and form reading
When the active form processing system is started, menus such as [imageScan], [file], [imageSize], and [Setting], an execution button, and the like are displayed. For example, the system reads a form through a scanner when the [imageScan] menu is operated, and it can also open an existing form image from the [file] menu. The system displays the read form image, and the display size of the image can be changed through the [imageSize] menu. The application to be called after form processing finishes is set from the [Setting] menu. The progress of form processing is shown by a progress bar. When an image has been read and displayed and the execution button is pressed, the system starts processing.
[0099]
7.2 Separation of entry frame and handwriting; detection of entry frame position and entry frame structure
FIG. 51 is an example of the processing flow of the active form processing system.
After reading the form, the active form processing system detects the entry frames and the positions of the squares within them, and separates the entry frames from the handwriting (52-1 in FIG. 51). The system obtains a frame-only image and a handwriting-only image for each entry frame, and classifies each entry frame as an empty entry frame, a horizontal bar, a handwritten entry frame, or noise (52-2 in FIG. 51). For example, if the number of black pixels in the handwriting-only image of an entry frame is below a certain value, the entry frame is determined to be empty. A vertical histogram is taken of the frame-only image, and if the fluctuation of the histogram is at or below a certain value, the frame is determined to be a horizontal bar. If the width and height of the entry frame are below a certain value, it is determined to be noise. All other entry frames are treated as handwritten entry frames. Next, the system takes a horizontal histogram of the frame image of each entry frame and detects the positions of the upper and lower frames in which information such as handwritten character attributes, character types, instructions, and headings is embedded (52-3 in FIG. 51).
[0100]
7.3 Superimposition information reading
For each entry frame, the active form processing system passes the position information of the upper and lower frames in which information is embedded to a decoding engine that recognizes the superimposed information, and decodes the superimposed information in the entry frame (52-4 in FIG. 51). The handwriting attributes embedded in an entry frame include, for example, types such as an address, a mail address, an organization name, a person name, a job title, an amount, a number, a date and time, a mark, free writing, a check, and a figure.
[0101]
7.4 Character separation and off-line character recognition
According to the handwriting attribute contained in the decoded superimposed information, the active form processing system classifies each entry frame as a check, a figure/mark, a mail address, or other handwritten characters (52-5 in FIG. 51). For other handwritten characters, the system segments the handwriting into individual characters and performs character recognition according to the character type contained in the superimposed information of each entry frame (52-9, 52-10, 52-12, and 52-13 in FIG. 51), which raises the character recognition rate. For these characters, the system also decides whether to apply context processing according to the attribute of the handwriting: context processing is performed when the attribute is an organization name, job title, free writing, or address (52-11 in FIG. 51), and is not performed for a person name, amount, number, or date and time. For a mail address, the system detects the character type of each character individually and performs character recognition according to that type, which raises the character recognition rate (52-6, 52-7, and 52-8 in FIG. 51).
[0102]
7.5 Correction of handwritten character offline recognition results
Next, the active form processing system corrects the offline recognition result through the correction interface.
[0103]
7.6 Completing the instruction sheet and saving the form image
When correction of the offline recognition results is finished, the active form processing system outputs the content extracted from the active form as, for example, a CSV file. This output file is called an instruction sheet (instruction data). The application called after the active form processing system finishes processes the form according to the instruction sheet. The form image is saved in the same folder as the instruction sheet.
FIG. 52 shows the format of the output file. The instruction sheet represents the information record of each entry frame on one line, with fields separated by commas. The first line gives the name of each column: [Name] is the heading, [property] is the attribute, [contents] is the handwritten content, [action] is the instruction, and [character] is the character type. The second and subsequent lines are the information records of the individual entry frames. When nothing is handwritten in an entry frame, the field corresponding to [contents] is left empty, as in ",,".
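Writing such a comma-separated output file can be sketched with the Python standard `csv` module. The column order below is assumed from the order in which the columns are listed in the text, not taken from FIG. 52, and the function name and sample values are hypothetical.

```python
# Minimal sketch of writing the comma-separated output (names are hypothetical).
import csv
import io

# Assumption: columns appear in the order in which the text lists them.
COLUMNS = ["Name", "property", "contents", "action", "character"]


def write_instruction_sheet(records, out):
    """Write a header line, then one line per entry frame record."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(COLUMNS)
    for record in records:
        # A missing value (e.g. no handwriting) becomes an empty field.
        writer.writerow([record.get(column, "") for column in COLUMNS])


buffer = io.StringIO()
write_instruction_sheet(
    [{"Name": "mail", "property": "mail address", "contents": "",
      "action": "send", "character": "alphanumeric"}],
    buffer,
)
print(buffer.getvalue())
```

An entry frame without handwriting produces two adjacent commas in the position of [contents], as described above.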
[0104]
8. Evaluation experiment
8.1 Experimental data and experimental environment
Two A4 forms using the large/small two-type code and two A4 forms using the 8-direction line segment code were printed at 1200 dpi, and 10 people were asked to fill in one copy of each, so that 20 samples of the two-type code forms and 20 samples of the 8-direction line segment code forms were collected. These forms were scanned at 600 dpi with 8-bit gray levels, and a preliminary evaluation of the active form processing system was performed. FIG. 53 shows a form based on the 8-direction line segment code used in the evaluation experiment; the forms using the large/small two-type code are similar.
[0105]
As the experimental environment, a PC with a 2.26 GHz Pentium (registered trademark) 4 CPU was used. Because this was experimental data, the frameless offline handwritten character data were collected in the same way as data for online handwritten character experiments, under the following writing conditions: (1) character size and character spacing are free; (2) characters must not be written joined to one another.
For the large/small two-type code forms, only framed character recognition was supported in order to keep the system simple, and the per-character type restriction for raising the mail address recognition rate was not provided. For the 8-direction line segment code forms, frameless character recognition was also supported, and the character type was restricted for each character of the mail address.
[0106]
8.2 Evaluation items for off-line handwritten character recognition
Frameless recognition is the process of recognizing a handwritten character string pattern as a "character string". When the recognition rate of character strings is evaluated, however, the scale of evaluation must be made clear. For example, when a character string of about 20 characters per line is recognized, it is unfair to treat a case in which only one character was misrecognized and a case in which most characters were misrecognized alike as a single-line "recognition failure". For this reason, the evaluation items used for frameless online handwritten character recognition are adopted, as follows.
[0107]
8.2.1 Character recognition rate
The “character recognition rate” expressed by the following formula is adopted as an evaluation item.
Character recognition rate = number of correctly recognized characters / number of correct characters
FIG. 54 is an explanatory diagram of the method for calculating the recognition rate in character units. For example, suppose that the result "Sunday is clear" is returned for the input pattern "Tomorrow is sunny" (five characters in the original Japanese). In this case only one character, "sunny", has been recognized correctly. Since the number of correct characters is 5, the recognition rate in character units is 1/5 = 0.20 = 20%.
[0108]
8.2.2 Normal division rate
Furthermore, an evaluation item for the accuracy of character segmentation, called the "normal division rate", is added. It is defined by the following equation.
Normal division rate = 1 - (number of character division failures / number of correct characters)
In the example of FIG. 54, character division fails at the following two locations: (1) the single character "Ming" is split into "Month" and "Day"; (2) the two characters "day" and "ha" are merged into one character. The normal division rate is therefore 1 - 2/5 = 0.60 = 60%.
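The two evaluation items can be computed directly from the counts. The sketch below reproduces the worked example of FIG. 54 (1 correct character, 2 division failures, 5 correct characters); the function names are hypothetical.

```python
# Minimal sketch of the two evaluation items (function names are hypothetical).
def character_recognition_rate(num_correct, num_reference):
    """Correctly recognized characters divided by the number of correct characters."""
    return num_correct / num_reference


def normal_division_rate(num_division_failures, num_reference):
    """1 - (character division failures / number of correct characters)."""
    return 1 - num_division_failures / num_reference


print(character_recognition_rate(1, 5))
print(normal_division_rate(2, 5))
```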
[0109]
8.3 Experimental results
8.3.1 Two types of codes
FIG. 55 shows the experimental results for the large/small two-type code forms. The average processing time is the average time from reading a form until the correction interface is called. Miswritten characters in the form samples (3 characters in the 20 form samples) and characters not included in the 4443 character types of the offline recognition dictionary (1 character in the 20 form samples) are excluded from the evaluation.
For the segmented handwritten characters, limiting the character types of the recognition category according to the handwritten character type read from the entry frame improved the recognition rate by 33.98 percentage points, from 59.74% to 93.72%. Because the handwritten character recognition rate includes mail addresses, it depends on the mail address recognition rate. In addition, two of the 20 forms were written so untidily that some characters could not be identified even by humans, which lowered the character recognition rate.
[0110]
In this experimental environment, the processing time from reading a form to passing it to the correction interface was less than 4 seconds for a complex form and less than 2 seconds for a simple form. Within the range of this experiment, the detection rate of entry frame positions reached 100%. The success rate of decoding the superimposed information records did not reach 100%, because the large/small two-type code carries little information, a simple code superimposition method was adopted, and no error check was provided for the individual data codes.
[0111]
8.3.2 8-way line segment code
FIG. 56 shows the experimental results for the 8-direction line segment code. Because the 8-direction line segment code forms support frameless handwritten character recognition and the per-character type restriction for mail addresses, the handwritten character recognition rate is evaluated in three classes: framed handwritten characters other than mail addresses, mail addresses, and frameless handwritten characters. The average processing time is the average time from reading a form until the correction interface is called.
Miswritten characters in the form samples (2 characters in the 20 form samples) are excluded from the evaluation. In addition, in one form, noise introduced into the form image by scanning caused a blank entry frame to be recognized as a frame containing characters. Since this was corrected through the correction interface, it is not included in the evaluation of the handwritten character recognition rate.
[0112]
Limiting the character types of the recognition category according to the handwritten character type read from the entry frame improved the recognition rate of framed handwritten characters other than mail addresses by 12.07 percentage points, from 86.78% to 98.85%. For mail addresses, limiting the character type for each character using the attribute improved the recognition rate by 16.96 percentage points, from 80.19% to 97.15%. The cumulative recognition rate of mail addresses up to the 10th candidate is almost the same with or without the character type restriction, but the first-candidate rate drops sharply when the character type is not restricted. This suggests that mutual confusion between similar characters is the cause of the lower mail address recognition rate.
[0113]
In this experimental environment, the processing time from reading a form to passing it to the correction interface was about 4 seconds for the complicated form A and about 3 seconds for form B. Within the range of this experiment, the detection rate of entry frame positions reached 100%. The success rate of decoding the superimposed information records was also 100%, because the 8-direction line segment code carries a large amount of information, an efficient code superimposition method was adopted, and an error check was provided for the data. For frameless handwritten characters, a character recognition rate of 97.71% and a normal division rate of 98.47% were obtained.
[0114]
9. Line segment detection, intersection detection
Further, methods for recognizing a form structure from detection of a line or an intersection without using a histogram include methods described in Non-Patent Documents 4 and 5, for example. The outline of line segment detection and intersection point detection in the present embodiment will be described below. The following processes are executed by the active form processing system (processing unit).
[0115]
9.1 Form representation model
Since most forms consist of three types of line segments (horizontal, vertical, and diagonal), the form structure is expressed by these three types. Each line segment is represented by the coordinates of its start point and end point. In the form according to the present embodiment, the entry frames, horizontal bars, and the like are composed of dots or code figures; these dots are dilated and connected to form line segments, from which the horizontal, vertical, and diagonal line segments are found.
The horizontal line segments are sorted from top to bottom and from left to right according to the coordinates of their leftmost start points. That is, they are first sorted from top to bottom by the Y coordinate of the start point, and horizontal line segments with the same Y coordinate are then sorted from left to right by the X coordinate of the start point. Similarly, vertical line segments are first sorted from left to right by the X coordinate of their topmost start point, and vertical line segments with the same X coordinate are then sorted from top to bottom by the Y coordinate. Diagonal line segments are sorted in the same way as horizontal line segments, except that the topmost start point is used instead of the leftmost one. The sorting directions may be changed as appropriate.
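The three sort orders can be expressed with sort keys. In this sketch a line segment is assumed to be a pair of (x, y) points in image coordinates with Y increasing downward, so that "top" means the smaller Y value; the function names are hypothetical.

```python
# Minimal sketch of the line segment sort orders (names are hypothetical).
def leftmost(segment):
    """End point with the smallest X coordinate (the leftmost start point)."""
    return min(segment, key=lambda p: (p[0], p[1]))


def topmost(segment):
    """End point with the smallest Y coordinate (the topmost start point)."""
    return min(segment, key=lambda p: (p[1], p[0]))


def sort_horizontal(segments):
    # Top to bottom by Y of the leftmost point, then left to right by X.
    return sorted(segments, key=lambda s: (leftmost(s)[1], leftmost(s)[0]))


def sort_vertical(segments):
    # Left to right by X of the topmost point, then top to bottom by Y.
    return sorted(segments, key=lambda s: (topmost(s)[0], topmost(s)[1]))


def sort_diagonal(segments):
    # Same order as horizontal segments, but keyed on the topmost point.
    return sorted(segments, key=lambda s: (topmost(s)[1], topmost(s)[0]))
```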
[0116]
The form expression model is composed of, for example, the following five parts.
(1) (NH, NV, NS): NH, NV, and NS are the numbers of horizontal, vertical, and diagonal line segments, respectively.
(2) (SLH, SLV, SLS): SLH, SLV, and SLS are the lists of horizontal, vertical, and diagonal line segments, respectively.
(3) [(X_min, Y_min); (X_max, Y_max)]: the upper left and lower right coordinates of the square area containing all line segments.
(4) [(XD(i)_min, YD(i)_min); (XD(i)_max, YD(i)_max)]: the upper left and lower right coordinates of the square area of the i-th entry frame.
(5) [(XN(i)_min, YN(i)_min); (XN(i)_max, YN(i)_max)]: the upper left and lower right coordinates of the square area of the i-th name frame.
[0117]
9.2 Line segment extraction and adjustment
Here, a process until the line segment is extracted from the form image (data) will be described.
First, a form image (data) is input and skew correction is performed: the image is rotated so that the longest nearly horizontal line segment becomes exactly horizontal. A vertical line may be used instead. The correction may use a single line, or several lines may be used and the image rotated so that they are horizontal or vertical on average.
The entry frames and the handwriting are separated from the form image of the active form to obtain a frame-only image. For this separation, the same methods as described above, such as labeling, can be used. Dilation is applied to the frame-only image until the small dots are connected. Then the three types of line segments (horizontal, vertical, and diagonal) are extracted.
For an extracted horizontal line segment, if the Y coordinates of its two end points differ only within a predetermined range, they can be corrected to the same value by taking their average; likewise, for a vertical line segment, if the X coordinates of its two end points differ only within a predetermined range, they can be corrected to their average. Thus a segment with a slight coordinate shift is handled as a horizontal or vertical line segment rather than as a diagonal one. Next, sorting is performed as described above.
[0118]
For example, the following correction can be performed on the sorted line segments.
(1) Concatenation of broken line segments: After the line segments are extracted and sorted, they are divided into three groups: horizontal, vertical, and diagonal. Line segments broken by scanning or the like are then concatenated. In the group of horizontal line segments, if two line segments with substantially the same Y coordinate are closer than, for example, 0.5 cm, they are connected. The same applies to vertical and diagonal line segments.
(2) Deletion of small line segments: If a horizontal line segment is shorter than, for example, 85% of the minimum distance between vertical line segments, it is regarded as a handwriting stroke and deleted. Likewise, if a vertical line segment is shorter than, for example, 85% of the minimum distance between horizontal line segments, it is regarded as a handwriting stroke and deleted.
(3) Correction of line segments: If the distance from the end point of a vertical line segment to a horizontal line segment is shorter than, for example, 0.5 cm, the vertical line segment is corrected so that it touches the horizontal line segment. The same applies to horizontal and diagonal line segments.
The numerical values used in the above judgments can be set to any appropriate values.
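Correction (1) for horizontal line segments can be sketched as follows. The segment representation `(x_start, x_end, y)`, the Y tolerance `y_tol`, and the function names are assumptions; the 0.5 cm gap is converted to pixels assuming the 600 dpi scan resolution used in the evaluation experiment.

```python
# Minimal sketch of joining broken horizontal segments (names are hypothetical).
def cm_to_px(cm, dpi=600):
    """Convert centimeters to pixels at the given scan resolution."""
    return cm / 2.54 * dpi


def join_horizontal(segments, max_gap_px, y_tol=2):
    """segments: (x_start, x_end, y) tuples, already sorted by (y, x_start)."""
    merged = []
    for x0, x1, y in segments:
        if merged:
            px0, px1, py = merged[-1]
            # Same row (within the assumed tolerance) and gap below the limit.
            if abs(y - py) <= y_tol and x0 - px1 < max_gap_px:
                merged[-1] = (px0, max(px1, x1), py)
                continue
        merged.append((x0, x1, y))
    return merged
```

At 600 dpi, 0.5 cm is roughly 118 pixels, so two collinear pieces separated by a 50-pixel gap would be joined, while a 200-pixel gap would be kept.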
[0119]
9.3 Detection of name and entry frames
Next, processing for detecting a name frame and an entry frame based on the extracted line segment will be described.
An ordinary form contains entry frames, name frames, and mixed frames. An entry frame is a frame for receiving data. A name frame is a frame containing text such as explanations and guidance. A mixed frame serves as both an entry frame and a name frame. In the frame detection algorithm described below, the input data are the sorted horizontal and vertical line segment lists SLH and SLV, and the output data are the square area data (entry frame data) of the entry frames and of the name frames and mixed frames.
(Step 1) All contact points, intersections, and end points of the horizontal and vertical line segments are detected. For each horizontal line, the vertical lines that touch or intersect it are detected, and the coordinates of these contact points and intersections are calculated. If no vertical line touches or intersects the horizontal line at one of its end points, that end point is recorded as an end point (terminal point). If every frame were a closed figure such as a rectangle, such end points would basically not appear; however, the line segments forming a frame may be broken by, for example, image errors or noise, so end points may appear. In addition, some forms have entries written above or below a line segment rather than inside a frame.
[0120]
Distinguishing contact points from intersections is not meaningful here. Instead, the contact points, intersections, and end points are classified into four types, Types 1 to 4, as shown for example in FIG. 57, based on the positional relationship between the horizontal and vertical lines. The three kinds of contact points and intersections shown as Type 1 indicate, for example, that the point is on the upper side of a frame. A point shown as Type 2 indicates, for example, that it is on the upper side or the lower side of a frame. A point shown as Type 3 indicates, for example, that it is on the lower side of a frame. Type 4 indicates an end point. Frames are detected according to these types. For example, a frame can be detected by appropriately finding two points of Type 1 (the upper two vertices of the frame) and two points of Type 3 (the lower two vertices of the frame).
All points are sorted in order from top to bottom and from left to right. This is because the line segments are sorted as described above. First, the pointer used in the following process is set to the first sorted point.
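The classification and ordering can be sketched as below. FIG. 57 is not reproduced in this text, so the mapping used here is only an assumed reading of the description: a point whose vertical line continues only downward is treated as Type 1, one whose vertical line continues both ways as Type 2, one whose vertical line continues only upward as Type 3, and a point with no vertical line as Type 4. The `Point` record, the function names, and the tolerance are likewise hypothetical.

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float
    kind: int  # 1..4, as assumed in the lead-in above

def classify(y, vertical=None, tol=1.0):
    """Assign an assumed type to a point at row y on a horizontal line.

    vertical is the (top, bottom) span of the vertical line through the
    point, or None for an end point. Image y grows downward.
    """
    if vertical is None:
        return 4
    top, bottom = vertical
    goes_up = top < y - tol
    goes_down = bottom > y + tol
    if goes_up and goes_down:
        return 2  # may lie on the upper or the lower side of a frame
    if goes_down:
        return 1  # upper side of a frame
    if goes_up:
        return 3  # lower side of a frame
    return 4      # the vertical line is negligible: treat as an end point

def sort_points(points):
    """Order points from top to bottom, then from left to right."""
    return sorted(points, key=lambda p: (p.y, p.x))
```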
[0121]
(Step 2) The point A indicated by the pointer and the point B that follows A are examined, and the following processing is executed to detect and register (store) a frame.
(1) If the Y coordinates of A and B differ, jump to Step 4.
(2) If both A and B are Type 4, scan upward from line segment AB until a horizontal line segment or a character string of black pixels is found. If no black pixel is found, the scan ends at the top of the image. In either case, a square area with line segment AB as its bottom side is obtained. If the height of this square area is smaller than, for example, 0.4 cm, it is ignored and processing jumps to Step 4. Otherwise, it is registered (stored) as one frame.
(3) If both A and B are Type 1 or Type 2, search the sorted list after B for the first points C and D of Type 2 or Type 3 that form a square (or rectangle; the same applies hereinafter) ABCD. If points C and D are found, the square ABCD is registered as a frame. If not, jump to Step 4.
(4) If A is Type 4 and B is Type 1 or Type 2, points C and D are searched for in the same way, where C is Type 4, D is Type 2 or Type 3, and ABCD is a square. If points C and D are found, the square ABCD is registered as a frame. If not, jump to Step 4.
(5) If A is Type 1 or Type 2 and B is Type 4, points C and D are searched for in the same way, where C is Type 2 or Type 3, D is Type 4, and ABCD is a square. If points C and D are found, the square ABCD is registered as a frame. If not, jump to Step 4.
(6) If none of the above conditions is satisfied, jump to Step 4.
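Case (3) of Step 2 can be sketched as follows. This is only an illustration under stated assumptions: the `Point` record, the function name, the tolerance, and the returned `(left, top, right, bottom)` tuple are hypothetical, and cases (2), (4), and (5) are not covered.

```python
from typing import NamedTuple

class Point(NamedTuple):
    x: float
    y: float
    kind: int  # 1..4

def find_frame_case3(points, i, tol=1.0):
    """Try to build a frame from A = points[i] and B = points[i + 1].

    points must be sorted top to bottom, then left to right.
    Returns (left, top, right, bottom), or None when processing
    should move on to Step 4.
    """
    if i + 1 >= len(points):
        return None
    a, b = points[i], points[i + 1]
    if abs(a.y - b.y) > tol:                      # case (1): different rows
        return None
    if a.kind not in (1, 2) or b.kind not in (1, 2):
        return None                               # not case (3)
    later = [p for p in points[i + 2:] if p.kind in (2, 3)]
    for c in later:                               # C lies below A in the same column
        if abs(c.x - a.x) > tol or c.y <= a.y + tol:
            continue
        for d in later:                           # D lies below B, on C's row
            if abs(d.x - b.x) <= tol and abs(d.y - c.y) <= tol:
                return (a.x, a.y, b.x, c.y)
    return None
```

Because the list is sorted from top to bottom, the first matching pair C, D is the nearest lower side, so stacked frames are returned one at a time as the pointer advances.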
[0122]
(Step 3) The square area of the frame obtained in Step 2 is scanned. If the number of black pixels in the square area is equal to or greater than a predetermined threshold, the square area is labeled as a name frame / mixed frame. Otherwise, it is labeled as an entry frame.
(Step 4) Add 1 to the pointer and point to the next point. If the pointer points to the last point, the process ends. Otherwise jump to step 2.
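The labeling in Step 3 amounts to counting black pixels inside the frame and comparing the count with a threshold. The sketch below assumes a binary image stored as a list of rows with 1 for black, integer frame coordinates, and a count taken strictly inside the border lines; the function name and label strings are hypothetical, and the threshold value is left to the caller because the specification only calls it predetermined.

```python
def label_frame(image, frame, threshold):
    """Label a frame as 'name_or_mixed' or 'entry' by its black-pixel count.

    image     : list of rows, each a list of 0 (white) / 1 (black)
    frame     : (left, top, right, bottom) integer pixel coordinates
    threshold : minimum black-pixel count for a name / mixed frame
    """
    left, top, right, bottom = frame
    # Count only the interior so the frame's own border lines are excluded.
    black = sum(
        image[y][x]
        for y in range(top + 1, bottom)
        for x in range(left + 1, right)
    )
    return "name_or_mixed" if black >= threshold else "entry"
```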
[0123]
[Effects of the invention]
According to the present invention, it is possible to provide a form, a form processing method, a form processing program, a recording medium on which the form processing program is recorded, and a form processing apparatus that can be used for reading forms at service windows and offices and that make it easy to separate the entry frames of a form from the entered characters. In addition, according to the present invention, the recognition rate of written characters can be increased. Furthermore, according to the present invention, the processing method of the form can be described on the form side rather than on the apparatus side, a general-purpose machine can be used as the reading apparatus, and various forms can be processed on a single machine.
Further, according to the present invention, it is possible to provide a form and a form processing method that use dot shapes which increase the amount of information that can be superimposed. According to the present invention, it is possible to provide a coding method and a decoding method that allow superimposed information to be read stably despite overlapping handwriting and contamination. Furthermore, according to the present invention, it is possible to provide an auxiliary entry method for characters that are particularly easily misread, such as alphanumeric characters and symbols. Further, according to the present invention, it is possible to provide a form that can be easily created with an editor and printed with a printer. Further, according to the present invention, the trouble of scanning the image between the lines of the entry frame can be omitted, and the processing time can be shortened.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of an active form.
FIG. 2 is a hardware configuration diagram of an active form processing system.
FIG. 3 is a configuration diagram of a code using two types of large and small graphics (two types of large and small codes).
FIG. 4 is a configuration diagram of a code (8-direction line segment code) using 8-direction line segments.
FIG. 5 shows a relationship between a drawing area and an arrangement area.
FIG. 6 is a configuration diagram of a code using 2¹² kinds of figures (2¹²-type code).
FIG. 7 is a configuration diagram of a code using 2⁸ kinds of figures (2⁸-type code).
FIG. 8 is a diagram showing information records to be superimposed.
FIG. 9 is a table showing correspondence between attributes and values indicated by codes.
FIG. 10 is a table showing correspondence between character types and commands and values indicated by codes.
FIG. 11 is a diagram illustrating a method for coding an information record in a dot row.
FIG. 12 is a diagram showing information records to be superimposed.
FIG. 13 shows an example of superposition of an 8-direction line segment code.
FIG. 14 is an explanatory diagram of a parity code setting method.
FIG. 15 is a diagram showing a delimiter.
FIG. 16 is an explanatory diagram of a recognition method using a histogram in four directions.
FIG. 17 is an explanatory diagram of extraction of standard pattern samples and feature values of input codes.
FIG. 18 is an explanatory diagram of code recognition by matching with a standard pattern.
FIG. 19 is an explanatory diagram of a recognition method by partial scanning.
FIG. 20 is a dot texture entry frame made up of two types of large and small codes and an 8-direction line segment code.
FIG. 21 shows a result of a recognition method based on a four-way histogram and matching.
FIG. 22 is a dot texture entry frame created with the 2¹²-type code.
FIG. 23 shows code recognition results.
FIG. 24 is a dot texture entry frame created with the 2⁸-type code.
FIG. 25 shows recognition experiment results.
FIG. 26 is a comparison diagram of recognition results of various code shapes.
FIG. 27 is a flowchart of an active form processing system.
FIG. 28 is a flowchart for separation of an entry frame and handwriting, and detection of an entry frame and an entry frame structure.
FIG. 29 is an explanatory diagram of a median filter.
FIG. 30 shows a separated entry frame and handwritten characters when the entry frame and handwriting overlap.
FIG. 31 is an explanatory diagram of detection of an entry frame and an entry frame structure using a histogram.
FIG. 32 is an explanatory diagram (1) of a method for detecting an entry frame structure.
FIG. 33 is an explanatory diagram (2) of a method for detecting an entry frame structure.
FIG. 34 is an explanatory diagram (3) of a method for detecting an entry frame structure.
FIG. 35 shows processing results obtained with a method that does not consider blanks in a form and with a method that does.
FIG. 36 is an explanatory diagram of information loss prevention processing when an entry frame is detected.
FIG. 37 is a flowchart of decoding of two types of large and small codes.
FIG. 38 is a flowchart of decoding of an 8-direction line segment code.
FIG. 39 is an explanatory diagram of calculating the size of a drawing area.
FIG. 40 shows an example of code extraction based on code size.
FIG. 41 is an explanatory diagram of a cutting direction for cutting out a figure for each code from the information bar of the entry frame.
FIG. 42 is an explanatory diagram of detection and processing of an information record in which handwriting is overlapped.
FIG. 43 is a diagram showing an example of an error in a code figure.
FIG. 44 is an explanatory diagram (1) of processing for dealing with code figure damage caused by an information bar position detection error.
FIG. 45 is an explanatory diagram (2) of processing for dealing with code figure damage caused by an information bar position detection error.
FIG. 46 is an explanatory diagram (3) of processing for dealing with code figure damage caused by an information bar position detection error.
FIG. 47 is an explanatory diagram of recognition of an 8-direction line segment code.
FIG. 48 shows an algorithm for recognizing a code.
FIG. 49 shows a program example for recognizing a code.
FIG. 50 is an entry frame for increasing the e-mail address recognition rate.
FIG. 51 shows an example of the processing flow of the active form processing system.
FIG. 52 shows an output file format.
FIG. 53 is a form used for an evaluation experiment.
FIG. 54 is an explanatory diagram of a method for calculating a recognition rate in character units.
FIG. 55 shows experimental results of two types of large and small code forms.
FIG. 56 shows an experimental result of an 8-direction line segment code.
FIG. 57 is a diagram showing classification of contact points, intersections, and end points.
[Explanation of symbols]
10 Horizontal bar
20 Entry frame
30 Information bar
40 Cell
1 Processing unit
2 Form input unit
3 Output unit
4 Display unit
5 Storage unit

Claims

A form input step for inputting form image data including entry frame data composed of a code in which information is expressed by an inclination of a line segment or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A step of cutting out each code of the detected information bar;
A step of obtaining histograms of the cut-out code in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction;
A form processing method including: a code recognition step of determining a shape feature of the code based on the obtained histograms in the four directions and recognizing the code.

A form input step for inputting form image data including entry frame data composed of a code in which information is expressed by an inclination of a line segment or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting, by horizontal and vertical histograms or by line segment detection and/or intersection detection on the form image data, each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A step of cutting out each code of the detected information bar;
A step of obtaining histograms of the cut-out code in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction;
A form processing method including: a code recognition step of determining a shape feature of the code based on the obtained histograms in the four directions and recognizing the code.

A form input step for inputting form image data including code data in which information is expressed by a dot width, a slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A superimposition information recognition step of recognizing each code of the information bar detected in the entry frame detection step and extracting superimposition information;
A character recognition step of performing character recognition on the handwritten data separated in the entry frame detection step, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
An instruction form creation step of creating instruction form data for causing another application system to perform predetermined processing, based on the character information recognized in the character recognition step and the superimposition information extracted in the superimposition information recognition step;
A form processing method including: a step of outputting the instruction form data created in the instruction form creation step, and/or storing the form image data input in the form input step in association with the instruction form data.

A form input step for inputting form image data including code data in which information is expressed by a dot width, a slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting, by horizontal and vertical histograms or by line segment detection and/or intersection detection on the form image data, each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A superimposition information recognition step of recognizing each code of the information bar detected in the entry frame detection step and extracting superimposition information;
A character recognition step of performing character recognition on the handwritten data separated in the entry frame detection step, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
An instruction form creation step of creating instruction form data for causing another application system to perform predetermined processing, based on the character information recognized in the character recognition step and the superimposition information extracted in the superimposition information recognition step;
A form processing method including: a step of outputting the instruction form data created in the instruction form creation step, and/or storing the form image data input in the form input step in association with the instruction form data.

The entry frame detection step includes:
A line detection step for taking a horizontal histogram for the input form image data and detecting the form image data for each line based on the histogram;
A separation step for separating the entry frame data and the handwritten data for the detected form image data for each line;
Taking a histogram in the vertical direction for the entry frame data for each separated line, and detecting one or more entry frame data based on the histogram;
The form processing method according to any one of claims 1 to 4, further comprising: taking a horizontal histogram of the detected entry frame data and detecting an information bar of the entry frame data based on the histogram.

The form processing method according to any one of claims 1 to 5, wherein the entry frame detection step includes separating the entry frame data and the handwritten data by any of: a labeling process that attaches the same label to connected components of black pixels; morphology that contracts and expands the image; and a median filter that smooths the image.

The form processing method according to claim 5, wherein, in the line detection step, a histogram value that is equal to or less than a predetermined threshold is ignored so as to remove the influence of noise.

The entry frame detection step includes:
Taking a horizontal histogram and / or a vertical histogram for the input form image data;
If the value of the histogram is less than or equal to a predetermined threshold, ignoring the value to remove the effect of noise;
A step of scanning the histogram and detecting a position where the value of the histogram rises from a predetermined value or less to the predetermined value or more, or a position where the value of the histogram falls from the predetermined value or more to the predetermined value or less;
A step of tracing the values of the histogram from the detected position in the direction opposite to the scanning direction, or toward the outside of the row or column, and setting as the start or end position of the entry frame either the position where the value of the histogram becomes zero or falls below the predetermined value, or, if the value of the histogram does not become zero or fall below the predetermined value even after the traced length exceeds a predetermined threshold, the position shifted from the detected position by the threshold in the direction opposite to the scanning direction or toward the outside of the row or column; and
The form processing method according to any one of claims 1 to 6, further comprising a step of detecting entry frame data based on the start position and end position of the entry frame in the vertical direction and/or its start position and end position in the horizontal direction.

The form processing method according to any one of claims 1 to 5, wherein the entry frame detection step includes expanding the dots of the entry frame data, obtaining a histogram of the expanded entry frame data, and detecting each entry frame data and information bar based on the histogram.

The superimposition information recognition step includes:
A step of taking a horizontal and/or vertical histogram of an information bar in which information records, each composed of a row of dots of the same height but different widths, are arranged in a plurality of rows shifted from one another by a predetermined number of dots, and cutting out the dots of each row based on the histogram;
A step of taking a vertical histogram of the dots of each cut-out row, measuring the length of each dot from the histogram, obtaining the information indicated by each dot, and obtaining the data of the information record;
A step of undoing the shift of each row by the predetermined number of dots;
The form processing method according to claim 3, further comprising a step of determining the data of the information record by taking a majority vote over the values of the information records.

The superimposition information recognition step includes:
A code cutting step of cutting out the code of the information bar detected in the entry frame detecting step;
A code recognition step for recognizing the cut code;
Determining whether the recognized code is a leading delimiter and searching for the leading delimiter;
Obtaining data of an information record including predetermined information based on a code from a leading delimiter searched in the searching step until a terminal delimiter is recognized;
Determining the data of the information record by taking a majority vote over the data of the obtained information records;
The form processing method according to claim 3 or 4 including:

The form processing method according to claim 11, wherein the code recognition step recognizes the code by histograms in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction.

The form processing method according to claim 11, wherein the code recognition step recognizes the code by sequentially scanning predetermined positions of the cut-out code, or by extracting feature value data of the cut-out code and matching it against the feature value data of a standard pattern of each code stored in advance.

The superimposition information recognition step includes:
Determining the length of the information record by taking a majority vote of the length of each determined information record;
Determining the length of each information record and deleting information records different from the determined length and / or information records whose length is not within a predetermined range;
Checking the parity code of the information record and deleting the incorrect information record;
Further including
The form processing method according to claim 11, wherein a majority vote is taken using data of an information record that has not been deleted, and data of the information record is determined.

The form processing method according to claim 11, wherein the code cutting step cuts out the codes column by column from left to right or from right to left of the information bar and, when the interval between the previous column and the cut-out column is larger than a predetermined width, ignores the following code figures until a delimiter is found.

Detecting a horizontal bar that is included in the form image data and is composed of codes based on a horizontal histogram of the input form image data;
Detecting the position of the code of the detected horizontal bar and determining the distance between the codes based on the position of the code;
Determining the size of the code based on the determined distance between the codes,
The form processing method according to claim 11, wherein the code cutting step cuts out the codes of the information bar detected in the entry frame detection step, ignores a cut-out code whose size is smaller than the determined size, and separates a cut-out code whose size is larger into codes of the determined size or of a predetermined size.

Detecting a horizontal bar in which codes are arranged by the same number of lines as the information bar based on a horizontal histogram of the input form image data;
Determining the number of lines of code of the detected horizontal bar,
The form processing method according to claim 11, wherein the code cutting step cuts out the codes of the information bar detected in the entry frame detection step, obtains the number of codes cut out in the current column and the scanned length, determines whether the number of codes has reached the number of code lines of the horizontal bar and whether the scanned length has reached the length corresponding to the number of code lines of the horizontal bar, and terminates the cutting-out process when these are determined to have been reached.

The form processing method according to claim 3 or 4, wherein, when the attribute of the handwritten data indicated by the superimposition information extracted in the superimposition information recognition step is a mail address, the character recognition step recognizes character type information entered in a character type entry frame for limiting the character types of the mail address, and performs character recognition according to the character type information.

The form processing method, wherein the code represents information by any one of a horizontal line segment passing through a predetermined point, a vertical line segment, and a predetermined number of line segments inclined by a predetermined angle.

The form processing method according to any one of claims 1 to 4, wherein the code represents information by any one of eight line segments passing through the center of a predetermined area: a horizontal line segment, a vertical line segment, and six line segments inclined by a predetermined angle.

The form processing method according to any one of claims 1 to 4, wherein the code includes each side of an n-gon and m line segments of different inclinations passing through the inside of the n-gon, and represents information of 2 to the power (n + m) bits by the presence or absence of each side and each line segment.

The form processing method according to claim 1, wherein the code is either a code that includes each side of a quadrangle and four line segments of different inclinations passing through the center of the quadrangle and expresses information of 2⁸ bits by the presence or absence of each side and each line segment, or a code that includes each side of an octagon and four line segments of different inclinations passing through the center of the octagon and expresses information of 2¹² bits by the presence or absence of each side and each line segment.

A form input step for inputting form image data including entry frame data composed of a code in which information is expressed by an inclination of a line segment or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A step of cutting out each code of the detected information bar;
A step of obtaining histograms of the cut-out code in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction;
A form processing program for causing a computer to execute: a code recognition step of determining a shape feature of the code based on the obtained histograms in the four directions and recognizing the code.

A form input step for inputting form image data including code data in which information is expressed by a dot width, a slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A superimposition information recognition step of recognizing each code of the information bar detected in the entry frame detection step and extracting superimposition information;
A character recognition step of performing character recognition on the handwritten data separated in the entry frame detection step, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
An instruction form creation step of creating instruction form data for causing another application system to perform predetermined processing, based on the character information recognized in the character recognition step and the information indicated by the superimposition information extracted in the superimposition information recognition step;
A form processing program for causing a computer to execute: a step of outputting the instruction form data created in the instruction form creation step, and/or storing the form image data input in the form input step in association with the instruction form data.

A form input step for inputting form image data including entry frame data composed of a code in which information is expressed by an inclination of a line segment or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A step of cutting out each code of the detected information bar;
A step of obtaining histograms of the cut-out code in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction;
A recording medium on which is recorded a form processing program for causing a computer to execute: a code recognition step of determining a shape feature of the code based on the obtained histograms in the four directions and recognizing the code.

A form input step for inputting form image data including code data in which information is expressed by a dot width, a slope of a line segment, or a combination of a plurality of line segments, and written handwritten data;
An entry frame detection step of separating the entry frame data and the handwritten data based on the form image data input in the form input step, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
A superimposition information recognition step of recognizing each code of the information bar detected in the entry frame detection step and extracting superimposition information;
A character recognition step of performing character recognition on the handwritten data separated in the entry frame detection step, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposition information;
An instruction form creation step of creating instruction form data for causing another application system to perform predetermined processing, based on the character information recognized in the character recognition step and the information indicated by the superimposition information extracted in the superimposition information recognition step;
A recording medium on which is recorded a form processing program for causing a computer to execute: a step of outputting the instruction form data created in the instruction form creation step, and/or storing the form image data input in the form input step in association with the instruction form data.

A form input unit for inputting form image data including entry frame data composed of a code in which information is expressed by a slope of a line segment or a combination of a plurality of line segments, and written handwritten data;
A processing unit that detects an entry frame of the input form image data and recognizes a code in the entry frame;
The processing unit has:
An entry frame detecting means for separating the entry frame data and the handwritten data based on the form image data input from the form input unit, and detecting each entry frame data and an information bar in the entry frame data on which predetermined information is superimposed by a code;
Means for cutting out each code of the detected information bar;
Means for obtaining histograms of the cut-out code in four directions: the vertical direction, the horizontal direction, the right diagonal direction, and the left diagonal direction;
A form processing apparatus having code recognition means for recognizing the code by determining a shape feature of the code based on the obtained histograms in the four directions.

A form processing apparatus comprising:
A form input unit for inputting form image data that includes entry frame data, composed of codes in which information is expressed by dot width, by the slope of a line segment, or by a combination of a plurality of line segments, and handwritten data;
A storage unit for storing form image data and/or command data created based on the form image data; and
A processing unit for recognizing the handwritten information entered in the input form image data and the information superimposed on the entry frames, and for executing predetermined processing,
Wherein the processing unit has:
Entry frame detecting means for separating the entry frame data from the handwritten data based on the form image data input from the form input unit, and for detecting each piece of entry frame data and the information bar within it on which predetermined information is superimposed by codes;
Superimposed information recognition means for recognizing each code of the information bar detected by the entry frame detecting means and extracting the superimposed information;
Character recognition means for performing character recognition on the handwritten data separated by the entry frame detecting means, according to the attribute and/or character type of the handwritten data indicated by the extracted superimposed information;
Command creation means for creating command data that causes another application system to perform predetermined processing, based on the character information recognized by the character recognition means and the superimposed information extracted by the superimposed information recognition means; and
Means for outputting the command data created by the command creation means and/or storing, in the storage unit, the form image data input by the form input unit in association with the command data.
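
The way superimposed information steers character recognition and feeds the command (instruction book) data can be sketched as follows. This is only an illustration under stated assumptions: the `SuperimposedInfo` fields mirror the attribute, character type, and processing instruction named in the claims, but the field values, the candidate-list interface of the recognizer, and the dictionary layout of the command data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SuperimposedInfo:
    attribute: str    # kind of information written in the frame (hypothetical values)
    char_type: str    # character type expected in the frame
    instruction: str  # processing requested after reading


# Hypothetical character-type checks; a real recognizer would restrict
# its character set instead of filtering finished strings.
CHAR_TYPE_CHECKS = {"digits": str.isdigit, "letters": str.isalpha}


def pick_candidate(candidates, char_type):
    """Return the first recognition candidate that fits the expected type."""
    check = CHAR_TYPE_CHECKS.get(char_type)
    if check is not None:
        for text in candidates:
            if check(text):
                return text
    # Unknown type or no match: fall back to the top candidate.
    return candidates[0] if candidates else ""


def create_command_data(fields):
    """Build command data from (SuperimposedInfo, candidates) pairs."""
    return [
        {
            "instruction": info.instruction,
            "attribute": info.attribute,
            "value": pick_candidate(candidates, info.char_type),
        }
        for info, candidates in fields
    ]
```

For a frame whose information bar declares digits, the candidates `["l23O", "1230"]` resolve to `"1230"`, which shows why reading the character type from the form itself can reduce misrecognition.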

A form comprising:
An entry frame that is composed of codes in which information is represented by dot width, by the slope of a line segment, or by a combination of a plurality of line segments, and that has an information bar in which information records, on which predetermined information is superimposed by the codes, are arranged over a plurality of rows; and
A frame or a horizontal bar that is composed of the codes and contains information in the information bar.

The form according to claim 29, wherein the codes are dots of the same height and different widths, and
The information records on which the information is superimposed are arranged in a plurality of rows, shifted from row to row by a predetermined number of dots.

The form according to claim 29, wherein the code
Represents information by any one of a horizontal line segment passing through a predetermined point, a vertical line segment, and a predetermined number of line segments inclined by predetermined angles.

The form according to claim 29, wherein the code
Represents information by any one of eight line segments: a horizontal line segment passing through the center of a predetermined area, a vertical line segment, and six line segments inclined by predetermined angles.
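
With eight distinguishable line segments, each code can carry one of eight values, that is, three bits. The sketch below assumes the eight orientations are evenly spaced at 22.5 degrees, with symbol 0 horizontal and symbol 4 vertical. The claim only says "predetermined angles", so the spacing, the symbol numbering, and all function names are assumptions made for illustration.

```python
ANGLE_STEP_DEG = 22.5  # assumption: 180 degrees divided into 8 orientations


def symbol_to_angle(symbol):
    """Map a 3-bit symbol (0..7) to a line orientation in degrees."""
    if not 0 <= symbol < 8:
        raise ValueError("symbol must be in 0..7")
    return symbol * ANGLE_STEP_DEG


def angle_to_symbol(angle_deg):
    """Map a measured orientation to the nearest symbol.

    Orientations repeat every 180 degrees, so 179 degrees is read as
    nearly horizontal (symbol 0).
    """
    return round((angle_deg % 180.0) / ANGLE_STEP_DEG) % 8


def int_to_symbols(value, count):
    """Split an integer into `count` 3-bit symbols, most significant first."""
    if not 0 <= value < 8 ** count:
        raise ValueError("value does not fit in the requested symbols")
    return [(value >> (3 * (count - 1 - i))) & 0b111 for i in range(count)]


if __name__ == "__main__":
    symbols = int_to_symbols(0b101011, 2)
    print(symbols)                                # [5, 3]
    print([symbol_to_angle(s) for s in symbols])  # [112.5, 67.5]
```

Rounding to the nearest orientation gives each symbol a tolerance of about 11 degrees on either side under this spacing, which is the margin a reader would have for skewed scans.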

The form according to claim 29, wherein the code
Includes each side of an n-gon and m line segments that pass through the inside of the n-gon with different inclinations, and represents 2^(n+m) bits of information by the presence or absence of each side and each line segment.

The form according to claim 29, wherein the code is either
A code that includes each side of a quadrangle and four line segments that pass through the center of the quadrangle with different inclinations, and expresses 2^8 bits of information by the presence or absence of each side and each line segment, or a code that includes each side of an octagon and four line segments that pass through the center of the octagon with different inclinations, and expresses 2^12 bits of information by the presence or absence of each side and each line segment.
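
The presence/absence coding of the polygon codes can be sketched as a bit mask. Each of the n sides and m inner line segments is either drawn or omitted, so one code distinguishes 2^(n+m) patterns: 2^8 = 256 for the quadrangle with four segments and 2^12 = 4096 for the octagon with four segments. The bit order (sides first, then segments, least significant bit first) and the function names are assumptions for illustration only.

```python
def encode_elements(value, n_sides, m_segments):
    """Turn an integer into presence flags for n sides and m inner segments."""
    total = n_sides + m_segments
    if not 0 <= value < 2 ** total:
        raise ValueError("value does not fit in n + m presence flags")
    flags = [bool((value >> i) & 1) for i in range(total)]
    return flags[:n_sides], flags[n_sides:]


def decode_elements(sides, segments):
    """Recover the integer from the detected presence flags."""
    flags = list(sides) + list(segments)
    return sum(1 << i for i, present in enumerate(flags) if present)


if __name__ == "__main__":
    sides, segments = encode_elements(0b10000001, 4, 4)
    print(sides)     # [True, False, False, False]
    print(segments)  # [False, False, False, True]
    print(decode_elements(sides, segments))  # 129
```

Note that the all-zero pattern draws nothing at all, so a practical code set would probably reserve it or add a fixed marker; the claims do not say how this case is handled.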