JP3976804B2

JP3976804B2 - Image processing method

Info

Publication number: JP3976804B2
Application number: JP00977495A
Authority: JP
Inventors: 裕章池田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-01-25
Filing date: 1995-01-25
Publication date: 2007-09-19
Anticipated expiration: 2022-09-19
Also published as: JPH08202820A

Description

【０００１】
【産業上の利用分野】
本発明は、アンダーラインや囲み枠等の傍線を含んだ文書画像の文字を正しく、文字単位で切り出すことのできる画像処理方法および画像処理装置に関するものである。
【０００２】
【従来の技術】
従来のＯＣＲで行なわれる認識処理の一例を図２６に示す。
【０００３】
まず、Ｓ２６０１でイメージスキャナ等を用いるなどして文書画像を入力する。
【０００４】
次に、Ｓ２６０２において、入力された文書画像から画像ブロックの抽出を行なう。射影を用いた画像ブロック取り出しの例を横書き文書で示すと、図２７にあるように、左右方向に射影２７０１を取り、行を抽出後、各行に対し行方向と直角に射影２７０２を取ることで画像ブロック２６０３を抽出することができる。
【０００５】
このままでは、１文字が誤って２以上のブロックに分離されてしまった分離ブロック２７０５、２以上の文字が誤って１ブロックとして結合して抽出されてしまった結合ブロック２７０６が存在するため、Ｓ２６０３各行で画像ブロックの幅の平均や、画像ブロックの幅の分布でもっとも頻度が大きいものなどから基準文字サイズを求め、Ｓ２６０４で複数の画像ブロックを結合した時、その幅が基準文字サイズになるならそれらを結合し、画像ブロックが基準文字サイズの整数倍になっていれば、整数等分して文字単位の画像ブロックを決定する。抽出された画像ブロックは、Ｓ２６０５で画像ブロックの位置，幅，高さに関わる情報を記憶しておく。
【０００６】
その後、Ｓ２６０６で各文字画像ブロックについて識別演算を行ない、類似度がもっとも大きいカテゴリ（文字）を認識結果とする。結果は、Ｓ２６０７で記憶しておく。
【０００７】
縦書きの文書についても横と縦を入れ替えるだけで、同様に認識処理が行なわれる。
【０００８】
このように、ＯＣＲは識別演算を文字単位で行なうため、画像入力後、文字切り出しを行なうように構成されている。
【０００９】
【発明が解決しようとする課題】
しかしながら、上記従来例において傍線を含んだ画像ブロックが存在する場合、特に図５にあるような線と文字とが接触している場合、５０１のように、それらが１つの画像ブロックとして取り出されるため、文字画像を正しく切り出せない欠点があった。
【００１０】
また、上記従来の文字認識における文字切り出し方法で図１７のように囲み枠で囲まれた文字を含む場合、１７０１のように囲み枠文字は１つの画像ブロックとして抽出される為、文字単位で画像を正しく切り出せないという欠点があった。
【００１１】
【課題を解決するための手段】
上記課題を解決するために、本発明の画像処理方法は、原画像から射影を取ることにより切出された画像ブロックのうち、文字画像を含む画像ブロックを抽出する画像ブロック抽出ステップと、前記画像ブロック抽出ステップで抽出された画像ブロックについて、各画像ブロックの幅と基準文字サイズとに基づいて、線画像が含まれている可能性があるかどうか判断し、前記線画像が含まれている可能性があると判断された画像ブロックを注目画像ブロックとする判断ステップと、前記判断ステップで判断された注目画像ブロック内の文字画像の傾きを求める傾き算出ステップと、前記注目画像ブロックにおいて、前記傾き算出ステップで求めた傾きに沿った方向のヒストグラムを取るヒストグラム算出ステップと、前記ヒストグラム算出ステップで算出したヒストグラムの頻度が所定の閾値以上であるピークの位置と幅とに基づいて、線画像の情報を抽出する線画像情報抽出ステップと、前記線画像情報抽出ステップにより抽出された線画像の情報に基づいて、前記注目画像ブロックから該線画像を除去した画像情報を抽出する抽出ステップとを有する。
【００１２】
上記課題を解決するために、本発明の画像処理装置は、原画像から射影を取ることにより切出された画像ブロックのうち、文字画像を含む画像ブロックを抽出する画像ブロック抽出手段と、前記画像ブロック抽出手段で抽出された画像ブロックについて、各画像ブロックの幅と基準文字サイズとに基づいて、線画像が含まれている可能性があるかどうか判断し、前記線画像が含まれている可能性があると判断された画像ブロックを注目画像ブロックとする判断手段と、前記判断手段で判断された注目画像ブロック内の文字画像の傾きを求める傾き算出手段と、前記注目画像ブロックにおいて、前記傾き算出手段で求めた傾きに沿った方向のヒストグラムを取るヒストグラム算出手段と、前記ヒストグラム算出手段で算出したヒストグラムの頻度が所定の閾値以上であるピークの位置と幅とに基づいて、線画像の情報を抽出する線画像情報抽出手段と、前記線画像情報抽出手段により抽出された線画像の情報に基づいて、前記注目画像ブロックから該線画像を除去した画像情報を抽出する抽出手段とを有する。
【００２９】
【実施例】
（実施例１）
図１は本発明を実施するための文字認識装置の構成を示すブロック図である。１０１はＲＯＭ１０２に格納されている制御プログラムに従って本装置全体の制御を行うＣＰＵ、１０２はＣＰＵ１０１が実行する後述するフローチャートに示す処理など本装置の制御プログラムなどを格納するＲＯＭ、１０３は文書画像などを記憶するＲＡＭであり、１０４はディスプレイ、１０５はキーボードであり、このキーボード１０５より各種コマンドやディスプレイ１０４に表示された画像やテキストの編集を行う。１０６は原稿の文書画像を光学的に読み取るためのイメージスキャナである。
【００３０】
図１に示す構成の文字認識装置が実行する本実施例の概略を図２のフローチャートを用いて説明する。
【００３１】
まず、Ｓ２０１でイメージスキャナ１０６に設置された原稿の文書画像を入力し、ＲＡＭ１０３に格納する。ＲＡＭ１０３に格納された入力画像をディスプレイ１０４に表示するようにしてもよい。このディスプレイ１０４上で認識処理を行う認識枠が設定されたら、認識枠情報もＲＡＭ１０３に格納する。また、画像の入力はイメージスキャナに限らず、記憶装置（図示せず）から読み出しても、通信手段により他のコンピュータより送信されてきたものでも良い。
【００３２】
次に、Ｓ２０２において、ＲＡＭ１０３に格納されている入力された文書画像から画像ブロックの抽出を従来例のステップＳ１４０２と同様に行い、Ｓ２０３でステップＳ１４０３と同様に基準文字サイズを求める。Ｓ２０２及びＳ２０３で得た情報はＲＡＭ１０３に格納する。
【００３３】
ここで、Ｓ２０２で抽出された各画像ブロックに対し、Ｓ２０４で傍線を含むブロックかどうかの判定を各ブロックについて順次行なう。Ｓ２０５で注目ブロックが傍線を含むブロックなら、Ｓ２０６において、そのブロックから傍線を除いた画像を抽出し、新たな、傍線を含まない画像ブロックとする。
【００３４】
すべての画像ブロックの抽出が終了したら、Ｓ２０７において、Ｓ２０３で求め、ＲＡＭ１０３に格納された基準文字サイズを用いて抽出された画像ブロックを更に分割して文字単位の画像ブロックを決定する。Ｓ２０８では、抽出された文字単位の画像ブロック情報（基準座標，幅，高さ）をＲＡＭ１０３に文字単位で入力画像における位置情報や、出現順（出力順）の情報とともに格納し記憶しておく。Ｓ２０９では、格納された画像ブロック情報を１つずつ読みだし、各文字画像ブロックについてＲＯＭ１０２に格納されている文字パターンの辞書データとの識別演算を行ない、類似度がもっとも大きいカテゴリ（文字）を認識結果とする。結果は、Ｓ２１０でＲＡＭ１０３にその結果を得た文字画像の入力画像における位置情報とともに記憶しておく。認識結果をディスプレイ１０４に表示するようにしてもよい。また、Ｓ２１０では、類似度が最大の文字の他に、上位複数個の文字を候補文字として格納しても良い。
【００３５】
Ｓ２０４の傍線を含むブロックの判定について、図３のフローチャートを用いて詳細に説明を行なう。Ｓ３０１では、注目する画像ブロックの幅の基準文字サイズに対する比Ｒ＝（注目画像ブロックの幅）／（基準文字サイズ）を求める。Ｓ３０２では、あらかじめ設定された閾値ＴとＲを比較し、ＲがＴより大きければＳ３０３に進みそのブロックは傍線を含むブロックとし、ＲがＴより小さければＳ３０４に進みそのブロックは傍線を含まないブロックと判定する。通常傍線は複数文字にまたがって付けられるので、Ｔは２以上にする。また、行全体に傍線があり、画像ブロックの幅の平均や、画像ブロックの幅の分布から基準文字サイズが求められない行の場合は、Ｓ２０３において基準文字サイズを行の高さとしてもよい。Ｓ２０４においては、Ｓ２０２で抽出された全ての画像ブロックに対して図３のフローチャートに示す処理を施し、全てのブロックについて傍線ブロックであるか否かの識別情報を付加する。
【００３６】
Ｓ２０５において、Ｓ２０４で付加された傍線ブロックである旨を表わす識別情報が付加されている画像ブロックを判別し、Ｓ２０６の画像ブロックから傍線を除く処理へ進む。
【００３７】
Ｓ２０６の傍線を除いた画像ブロックの抽出について、図４のフローチャートを用いて詳細な説明を行なう。まず、Ｓ４０１で注目する傍線を含む画像ブロックから画像ブロックの抽出に必要な傍線の情報を抽出する。本実施例では、傍線の情報として、傍線の幅、傍線の傾き、傍線の位置を求めることにする。
【００３８】
傍線の傾きθは、図６にあるように、注目画像ブロックに適当な間隔Ｌをおいて２つの領域６０２を設定し、その領域における射影６０１を求め、射影のずれｄとＬより、ｔａｎθ＝ｄ／Ｌとして傾きθを得られる。この時求める射影は、左右方向に画素をサーチして１つでも黒画素があれば１とする、という方法をとる。また、間隔Ｌは、注目画像ブロックの幅により、幅より適度（例えば１／２文字幅）内側に寄った位置にとれるようにすれば良い。或いは、ブロックの幅が十分に大きければ（例えばＲ＝１０）、適度に定めたＬ（例えば基準文字サイズの２倍）を適当な位置に定めても良い。
【００３９】
傍線の位置とは、文字の下（アンダーライン）、上（オーバライン）あるいは上下両方のことである。傍線の位置を求めるためには、図７にあるように、角度θだけ傾いた注目画素ブロック７０１（ここでは「傍線」の２文字とアンダーラインを含む画像）について、１／ｔａｎθ画素毎に１画素ずつ上下にずらしながら、黒画素のヒストグラムを取る。この結果が図８である。ヒストグラムは例えばアンダーラインなら画像ブロックの下から高さの１／４まで、オーバラインなら画像ブロックの上から高さの１／４までに、多くの場合は存在する。画像ブロックの幅に対し、ヒストグラムの最大値が十分に大きければ（ブロックの幅により定める閾値Ｈ_THより大きければ）、その部分に傍線が存在すると判定し、その結果、ヒストグラムのＨ_THを超える最大値（或いは次のピーク）が上から１／４にあればオーバーライン、下から１／４にあればアンダーラインと、傍線の位置を求めることができる。また、ヒストグラムをとる方法の他に、１／ｔａｎθ画素毎に１画素ずつずらしながら、黒画素の連続性を見ていき、画像ブロックの幅に対し十分に長い黒ランが存在する、あるいは、黒ランが途中で切れていても、切れた部分の幅が非常に狭く、その切れた部分は印刷の精度によるものであり、その部分がつながっていると判断される場合、それがその傍線画像ブロックの幅に対し十分に長いものを傍線と判定する方法もある。
【００４０】
傍線の幅Ｗは、図８に示すように、１／ｔａｎθ画素毎に１画素ずつずらしながら取られた傍線が存在する部分の黒画素のヒストグラムを取り、最も外側にあるピークを含む部分について、あらかじめ定めてある値Ｈ_TH（例えばブロック幅の半分）より大きい頻度が出た部分の幅ｗにより得られる。この幅が小さめになるようであれば、傍線の幅をｗ＋αとしてもよい。
【００４１】
この結果、傍線の情報が得られなければ、Ｓ４０２で画像ブロック抽出処理を終了する。
【００４２】
傍線の情報が得られたら、Ｓ４０３で傍線を含まないように画像ブロックの抽出を行なう。従来例で説明したような上下方向の射影を用いて文字単位の画像ブロックを抽出する場合、図９にあるように、傍線の幅ｗを除き、しかも１／ｔａｎθ画素毎に１画素ずつずらしながら射影を取ることで射影９０１を抽出でき、その結果、「傍」と「線」が分割された、文字単位の画像ブロック９０２が得られる。Ｓ２０６においては、Ｓ２０５で傍線ブロックと判別される全てのブロックに対して図４のフローチャートに示す処理を順次繰り返して施す。
【００４３】
以上説明したように、傍線を含むと判定されたブロックに対し、傍線の情報を抽出し、情報が得られたものに対してのみ画像ブロックの抽出を行なうようにすることで、傍線ブロックの判定精度が高くなくても、傍線を含む画像ブロックから正しく文字画像を抽出する効果がある。
【００４４】
また、傍線の傾きを求めることで、傾きがある画像ブロックに対しても、精度良く文字画像を抽出する効果がある。
【００４５】
なお、本実施例では、本発明を実施するための最低限の構成要件で説明を行なっているが、例えばオペレータが認識領域を指定したり、認識結果を修正する操作ができるように構成されていたり、認識領域をオペレータが介在することなく自動的に決定したり、誤認識を減少させるための処理が加わるように構成されいても何ら問題ない。
【００４６】
また、本実施例では、横書きの文書について説明を行なっているが、縦書きについても行方向と列方向を変えるだけであり、同様に実施することができる。
【００４７】
また、汎用コンピュータに、本発明を実施する処理を行なうプログラムを外部から提供し、ＲＡＭに本装置の制御プログラムを格納するように構成されていてもよいし、記憶装置はＲＡＭに限定されるものでもない。
【００４８】
また、本実施例では、Ｓ２０４及びＳ２０６の処理を、全てのブロックに対して繰り返し行う例について説明したが、１つのブロックについてＳ２０４〜Ｓ２０６の処理を施した後、次のブロックについてＳ２０４〜Ｓ２０６を施すようにループを設定しても良い。
【００４９】
（実施例２）
図２のフローチャートにおいて、Ｓ２０４の傍線を含むブロックの判定とＳ２０６の傍線の画像ブロックの抽出の他の例について、図１０及び図１１を用いて説明する。なお、その他については実施例１と同様であり、説明は省略する。
【００５０】
Ｓ２０４の傍線を含むブロックの判定について、図１０のフローチャートを用いて説明を行なう。Ｓ１００１で上部の射影を取る。例えば、画像ブロックの上から高さの１／４までの射影を取ればよい。Ｓ１００２で上部の射影の頻度の最大幅Ｌ１を求める。Ｓ１００３で下部の射影を取り、Ｓ１００４で下部の射影の頻度の最大幅Ｌ２を求める。傍線があれば射影の幅は画像ブロックの幅に近い値になるので、画像ブロックの幅から得られる閾値をＬとすると、Ｓ１００５でＬ１＞ＬまたはＬ２＞Ｌなら傍線を含むと判定しＳ１００６に進み、そうでなければ傍線を含まないと判定しＳ１０１０に進む。画像ブロックの幅をＷとすると、Ｌ＝０．９×Ｗなどとすればよい。傍線を含むと判定した場合、Ｓ１００６で傍線の情報を抽出する。抽出の方法はＳ４０１と同様でよい。傍線の情報が抽出されなければ、Ｓ１０１０に進み、その画像ブロックは傍線を含まない画像ブロックとする。抽出された傍線の情報は、Ｓ１００８でＲＡＭ１０３に格納し記憶しておき、Ｓ１０００９で注目画像ブロックを傍線を含むブロックとし処理を終了する。なお、Ｌ１とＬ２を求める順序は逆でもよい。また、Ｓ１００６における傍線の情報を抽出する方法として、画像ブロックに対して、アンダーラインであれば下から、オーバラインであれば上から、Ｌの幅でラインにつきあたる点をサーチし、この点の差からｄを求め、θを導出する方法をとることにより、より素早い処理が実現する。
【００５１】
Ｓ２０６の傍線用の画像ブロックの抽出の他の方法について、図１１のフローチャートを用いて説明を行なう。Ｓ１１０１において、ＲＡＭ１０３に格納された傍線の情報を用いて、傍線を含まないように画像ブロックの抽出を行なう。抽出方法はＳ４０３と同様でよい。
【００５２】
以上説明したように、傍線を含むブロックの判定に、画像ブロックの局所的な射影を取り、その幅を用いることで、判定の精度が向上する効果がある。また、その時点で傍線の情報を抽出することで、傍線を含まない画像ブロックに対し、画像ブロックの抽出の処理を施し、無駄に処理時間を費やすことを防ぐ効果がある。
【００５３】
（実施例３）
図１に示す構成の文字認識装置が実行する他の実施例の処理の概略を図１２のフローチャートを用いて説明する。
【００５４】
Ｓ１２０１で文書画像を入力し、Ｓ１２０２において入力された文書画像から画像ブロックの抽出を行ない、Ｓ１２０３で基準文字サイズを求め、Ｓ１２０４で傍線を含むブロックかどうかの判定を行なう。Ｓ１２０５で注目ブロックが傍線を含むブロックなら、Ｓ１２０６において、傍線を除いた画像ブロックを抽出する。ここまでは、実施例１と同様であり、Ｓ２０１〜Ｓ２０６において詳細に説明しているので、ここでは詳細な説明は省略する。Ｓ１２０６で得られた画像ブロックがまだ傍線を含んでいる場合、Ｓ１２１１で再度傍線を除いた画像ブロックの抽出を行なうかどうかを調べ、行なうなら再びＳ１２０６を実行する。このＳ１２１１の判定方法はＳ１２０４と同様なものでよく、傍線抽出の前後で画像ブロックの横幅に変化がない場合にも、再度画像ブロックの抽出を行なうようにしてもよい。
【００５５】
すべての画像ブロックの抽出が終了したら、Ｓ１２０７で基準文字サイズを用いて文字単位の画像ブロックを決定し、Ｓ１２０８で抽出された画像ブロック情報を格納し、Ｓ１２０９で識別演算を行ない、Ｓ１２１０で結果を記憶する。このＳ１２０７〜Ｓ１２１０の処理は、Ｓ２０７〜Ｓ２１０と同様である。
【００５６】
以上説明したように、傍線を含む画像ブロックから、傍線を除いて画像ブロックを抽出した後、再び傍線を除いた画像ブロックの抽出を行なうかどうかの判定をすることで、２重や多重の傍線を含む画像ブロックや、部分的に多重になっている画像ブロックに対しても、文字画像ブロックを抽出することができる効果がある。
【００５７】
（実施例４）
図２のフローチャートにおけるＳ２１０の演算結果の格納について図１３を用いて説明する。図５の識別演算の結果が「これは傍線です」となったとすると、「傍」の前にアンダーライン開始の制御コードＵＩ１３０１、「線」の後ろにアンダーライン終了の制御コードＵＯ１３０２を図１３（ａ）のように入れＲＡＭ１０３に記憶する。ディスプレイ１０４上で傍線が表示可能であるならば、ＵＩからＵＯの文字にアンダーラインを引いて認識結果を表示する。同様に、オーバラインや多重線についても制御コードを定めておけばよい。
【００５８】
また、制御コードではなく、各文字に傍線に関する属性を持つようにすれば、「傍」「線」の属性をアンダーラインとして結果を図１３（ｂ）のように格納してもよい。
【００５９】
なお、結果のデータ構造は図１３に限定されるものではなく、これに候補文字などの情報が含まれていてもよい。
【００６０】
以上説明したように、傍線の付いた文字を記憶することで、入力画像の文字以外の情報を失うことなく再現することができる効果がある。
【００６１】
（実施例５）
上述の実施例１においては、図５のようなアンダーライン等の傍線が付された文字列を含む画像情報から正しく文字単位で切り出す方法について説明したが、本実施例では、図１７のように囲み枠により囲まれた文字列を含む画像から正しく文字単位で切り出す方法について説明する。
【００６２】
本実施例は、実施例１において図１を用いて詳細に説明した構成の文字認識装置で実施されるものである。
【００６３】
図１に示す構成の文字認識装置が実行する本実施例の概略を図１４のフローチャートを用いて説明する。ここで、本実施例の特徴的処理はＳ１４０１の囲み枠ブロックを見つける処理からＳ１４０３の囲み枠を除く処理までであり、ここではこの３ステップについて詳細に説明する。その他の、実施例１において説明した図２のフローチャートに示す処理と同じ処理ステップは同じステップ番号を付した。この処理に関してはここでは説明を省略する。
【００６４】
Ｓ２０３で抽出された各画像ブロックに対し、Ｓ１４０１で囲み枠を含むブロックかどうかの判定を各ブロックについて順次行なう。Ｓ１４０２で注目ブロックが囲み枠を含むブロックなら、Ｓ１４０３において、そのブロックの画像から囲み枠を除いた画像を抽出し、新たな、囲み枠を含まない画像ブロックとする。
【００６５】
Ｓ１４０１の囲み枠を含むブロックの判定について、図１５のフローチャートを用いて詳細に説明を行なう。Ｓ１５０１では、注目する画像ブロックの幅の基準文字サイズに対する比Ｒ＝（注目画像ブロックの幅）／（基準文字サイズ）を求める。Ｓ１５０２では、あらかじめ設定された閾値ＴとＲを比較し、ＲがＴより小さければＳ１５０３に進み、その枠は囲み枠を含まないブロックと判定する。ＲがＴより大きければ、Ｓ１５０４に進み、囲み枠ブロックであるか確認する処理に進む。Ｓ１５０４でその画像ブロックの上部の射影を取る。例えば、画像ブロックの上から高さの１／４までの射影を取ればよい。Ｓ１５０５でＳ１５０４でとった上部の射影の最大幅Ｌ１を求める。囲み枠があれば射影の幅は画像ブロックの幅に近い値になるので、Ｓ１５０６で画像ブロックの幅から得られる閾値をＬ（例えばブロック幅の０．９倍）とし、ＬとＬ１を比較する。Ｌ１＜Ｌならそのブロックは囲み枠ではないと判定しＳ１５０３に進む。同様に、Ｓ１５０７で下部の射影を取り、Ｓ１５０８で下部の射影の最大幅Ｌ２を求める。Ｓ１５０９でＬ２＜Ｌなら囲み枠でははないと判定しＳ１５０３に進み、そうでなければＳ１５１０に進み、囲み枠であると判定する。Ｓ１５０３及びＳ１５１０では、注目の画像ブロックに対して、囲み枠ブロックであるか否か判定し得る識別情報を付加してＲＡＭ１０３に格納する。なお、Ｌ１とＬ２の求める順序は逆でもよい。
【００６６】
Ｓ１４０３の囲み枠用の画像ブロックの抽出について、図１６のフローチャートを用いて詳細に説明を行なう。まず、Ｓ１６０１で注目する画像ブロックから画像ブロックの抽出に必要な囲み枠の情報を抽出する。本実施例では、囲み枠の線幅ｗ、囲み枠の傾きθを求めることにする。
【００６７】
囲み枠の傾きθは、図１８にあるように、注目画像ブロックに適当な間隔Ｌ（Ｌの設定方法は実施例１と同じ）をおいて２つの領域１８００を設定し、各領域の画像データについて射影１８０１を求め、射影のずれｄとＬより、ｔａｎθ＝ｄ／Ｌとしてθが得られる。精度良く求めるならば、射影の上下で傾きを求め、値が近ければその平均を注目画像ブロックの傾きとし、値が近くなければ、囲み枠でないとし、傾きの値を決定せず、Ｓ１６０２で囲み枠の情報が抽出できないとして、その画像ブロックに関しての図１６のフローチャートの処理を終了する。
【００６８】
囲み枠の線幅ｗは、図１９にあるように、注目画像ブロックについて、１／ｔａｎθ画素毎に１画素ずつ上下にずらしながら、黒画素のヒストグラムを取り、その画像ブロックの最も外側にあるピークを含む部分について、あらかじめ定めてある値Ｈ_THより大きい部分の幅ｗにより得られる。この幅が小さめになるようであれば、囲み枠の線幅をｗ＋αとしてもよい。囲み枠の上下で異なるようであれば、枠の線幅を上下両方求めておく。
【００６９】
囲み枠の情報がこの段階で得られなければＳ１６０２で囲み枠の情報が抽出できないとして、その注目画像ブロックに関しての図１６のフローチャートの画像ブロック抽出処理を終了する。Ｓ１６０１で抽出された囲み枠の傾きθ及び囲み枠の線幅ｗはその画像ブロックと関連づけてＲＡＭ１０３に格納する。
【００７０】
囲み枠の情報が得られたら、Ｓ１６０３で囲み枠を含まないように画像ブロックの抽出を行なう。従来例で説明したような上下方向での射影を用いて文字単位の画像ブロックを抽出する場合、図２０にあるように、囲み枠の線幅ｗを上下に含めず、しかも１／ｔａｎθ画素毎に１画素ずらしながら射影を取ることで射影２００１を抽出でき、その結果、画像ブロック２００２が得られる。得られたブロックのうち、左右の端のブロック２００３については、囲み枠の線幅ｗに近い幅であれば囲み枠の一部であるので、その左右端のブロックは文字画像ブロックとはしない。もし、注目画像ブロックの左右に囲み枠の線幅程度の幅を持つブロックが存在しなければ囲み枠を含む画像ブロックではないとする。
【００７１】
（実施例６）
Ｓ１４０１の囲み枠を含むブロックの判定とＳ１４０３の囲み枠の画像ブロックの抽出の他の実施例を説明する。なお、その他の処理ステップについては実施例１と同様であり、説明は省略する。
【００７２】
Ｓ１４０１の囲み枠を含むブロックの判定について、図２１のフローチャートを用いて説明を行なう。Ｓ２１０１では、注目する画像ブロックの幅の基準文字サイズに対する比Ｒを求める。Ｓ２１０２でＲがＴより小さければＳ２１０３に進み囲み枠を含まないブロックと判定する。ＲがＴより大きければ、次にＳ２１０４で注目画像ブロックの最も外側の境界線を追跡していく。境界線が許容範囲を越えたら、Ｓ２１０４で囲み枠ではないと判定しＳ２１０３に進む。許容範囲の例を図２２を用いて説明すると、ブロックの左半分において、境界線のｙ座標があらかじめ定めたｙ１とｙ２の間に存在する場合、ｘ座標があらかじめ定めたｘ１より右に存在すれば、境界線は囲み枠を追跡していないとし、許容範囲を越えたとする。ブロックの上半分なら、ｘ座標がｘ１とｘ２の間に存在する場合、ｙ座標がｙ１より下に存在すれば許容範囲を越えたとする。右半分や下半分についても同様である。
【００７３】
境界線が許容範囲を越えなければ、Ｓ２１０６に進み、囲み枠の情報を抽出する。抽出方法は、実施例５のＳ１６０１と同様でよい。もし、囲み枠の情報が抽出されなければ、Ｓ２１０７でＳ２１０３に進み判定を終了する。抽出された囲み枠の情報はＳ２１０８でＲＡＭ１０３に格納し記憶しておき、Ｓ２１０９に進み、囲み枠であると判定する。
【００７４】
Ｓ１４０３の囲み枠の画像ブロックの抽出について、図２３のフローチャートを用いて説明を行なう。Ｓ２３０１で格納された囲み枠の情報を用いて、囲み枠を含まないように画像ブロックの抽出を行なう。抽出方法は実施例１のＳ１６０３と同様でよい。
【００７５】
以上説明したように、囲み枠のブロックの判定に境界線を用いることで判定の精度が向上する効果がある。また、囲み枠のブロックの判定で囲み枠の情報を抽出するので、囲み枠ではない画像ブロックに対し、画像ブロックの抽出を行なうことを防ぐ効果がある。
【００７６】
（実施例７）
図１に示す構成の文字認識装置が実行する囲み枠を除いた文字単位の文字切り出しの実施例５とは異なる他の実施例の概略を図２４のフローチャートを用いて説明する。ただし実施例５の図１４のフローチャートに示した処理と同じ処理ステップには同じステップ番号を付し、ここでは説明を省略する。
【００７７】
本実施例における特徴的な処理ステップはＳ２４０１であり、Ｓ１４０３で得られた画像ブロックがまだ囲み枠を含んでいる場合、Ｓ２４０１で再度囲み枠を除いた画像ブロックの抽出を行なうかどうかを調べ、行なうなら再びＳ１４０３を実行する。この再判定方法はＳ１４０１と同様なものでよい。
【００７８】
すべての画像ブロックの抽出が終了したら、Ｓ２０７で基準文字サイズを用いて文字単位の画像ブロックを決定し、Ｓ２０８で抽出された画像ブロック情報を格納し、Ｓ１２０９で識別演算を行ない、Ｓ２１０で結果を記憶する。
【００７９】
以上説明したように、囲み枠を含む画像ブロックから、囲み枠を除いて画像ブロックを抽出した後、再び囲み枠を除いた画像ブロックの抽出を行なうかどうかの判定をすることで、２重や多重の囲み枠を含む画像ブロックに対しても、文字画像ブロックを抽出することができる効果がある。
【００８０】
（実施例８）
図１４のフローチャートにおけるＳ１４１０の演算結果の格納について図２５を用いて説明する。図１７の識別演算の結果が「これは囲み枠です」となったとすると、図２５（ａ）のように「囲」の前に囲み枠開始の制御コードＦＩ２５０１、「枠」の後ろに囲み枠終了の制御コードＦＯ２５０２を入れＲＡＭ１０３に記憶する。ディプレイ１０４上で囲み枠が表示可能であるならば、ＦＩからＦＯの文字に囲み枠をして認識結果を表示する。
【００８１】
また、制御コードではなく、各文字に囲み枠に関する属性を持つようにすれば、図２５（ｂ）のように「囲」「み」「枠」の各属性を囲み枠として結果を格納してもよい。
【００８２】
なお、結果のデータ構造は図１３に限定されるものではなく、これに候補文字などの情報が含まれていてもよい。
【００８３】
以上説明したように、囲み枠の付いた文字を記憶することで、入力画像の文字以外の情報を失うことなく再現することができる効果がある。
【００８４】
【発明の効果】
以上説明したように、本発明によれば、文字以外の線画像を含む画像情報からは線画像を除去した画像を抽出するので、原稿上で線を消してから画像を入力するといった手間をかけずに、線の付加された文字も正しく文字単位で切り出すことが可能となる。
【００８５】
以上説明したように、本発明によれば、傍線、例えばアンダーラインやオーバーラインが付加された文字の画像を入力した場合にも、正しく文字単位で画像を切り出すことができる。
【００８６】
以上説明したように、本発明によれば、囲み枠で囲まれた文字の画像を入力した場合にも、正しく文字単位で画像を切り出すことができる。
【００８７】
以上説明したように、本発明によれば、抽出された画像ブロックの幅と基準文字幅との比により、そのブロックが文字以外の線画像を含むか否かを判定するので、簡単な処理で線が付加されている画像ブロックを判定することができる。
【００８９】
以上説明したように、本発明によれば、線画像を除去した画像情報を抽出するので、容易な処理で文字単位に分割することができる。
【００９０】
以上説明したように、本発明によれば、線の幅に係る情報を抽出し、この情報に従って線画像の除去を行なうので、多様な線幅の線画像が文字に付加されていても正確に線画像の除去ができる。
【００９１】
以上説明したように、本発明によれば、線の傾きに係る情報を抽出し、この情報に従って線画像の除去を行なうので、原稿が傾いていたり、文字列が傾いていた場合にも正確に線画像の除去ができる。
【００９２】
以上説明したように、本発明によれば、線の位置に係る情報を抽出し、この情報に従って線画像の除去を行なうので、線画像が文字に対してどの位置に合っても正確に線画像の除去ができる。
【００９３】
以上説明したように、本発明によれば、線画像を除去した画像が線画像を含むか否か再度判断するので、多重に線画像が付加されている文字でも完全に線画像の除去ができる。
【００９４】
以上説明したように、本発明によれば、線画像を除去した後に文字認識を行なうので、文字認識の精度が向上する。
【００９５】
以上説明したように、本発明によれば、文字認識した後もその文字に対して線画像の属性を付加するので、テキストで表示する際にもともと原稿に付加されていた線情報を反映させることができる。
【図面の簡単な説明】
【図１】本発明に係る実施例の文字認識装置の構成を示すブロック図
【図２】実施例１における文字認識処理のフローチャート
【図３】実施例１における傍線ブロック判定の詳細な処理のフローチャート
【図４】実施例１における傍線を除いた画像抽出の詳細な処理のフローチャート
【図５】傍線が付加された文字列を含む画像の画像ブロックの抽出例を示す図
【図６】傍線の傾き検出の方法を説明する図
【図７】傍線の位置を検出する際のヒストグラムのとり方を説明する図
【図８】傍線の幅検出の方法を説明する図
【図９】傍線を除いた画像の抽出及び文字単位の画像の切り出しの説明図
【図１０】実施例２における傍線ブロック判定の詳細な処理のフローチャート
【図１１】実施例２における傍線を除いた画像抽出の詳細な処理のフローチャート
【図１２】実施例３の文字認識処理のフローチャート
【図１３】傍線の属性を付加した文字列の記憶例を示す図
【図１４】実施例５における文字認識処理のフローチャート
【図１５】実施例５における囲み枠ブロック判定の詳細な処理のフローチャート
【図１６】実施例５における囲み枠を除いた画像抽出の詳細な処理のフローチャート
【図１７】囲み枠が付加された文字列を含む画像の画像ブロックの抽出例を示す図
【図１８】囲み枠の傾き検出の方法を説明する図
【図１９】囲み枠の線幅を検出する際のヒストグラムのとり方を説明する図
【図２０】囲み枠の幅検出の方法を説明する図
【図２１】囲み枠を除いた画像の抽出及び文字単位の画像の切り出しの説明図
【図２２】実施例６における囲み枠ブロック判定の詳細な処理のフローチャート
【図２３】実施例６における囲み枠を除いた画像抽出の詳細な処理のフローチャート
【図２４】実施例７の文字認識処理のフローチャート
【図２５】囲み枠の属性を付加した文字列の記憶例を示す図
【図２６】従来の文字認識処理のフローチャート
【図２７】線と文字が離れている画像情報からヒストグラムにより文字画像を分割した例を示す図
[0001]
[Field of Industrial Application]
The present invention relates to an image processing method and an image processing apparatus that can correctly cut out, character by character, the characters of a document image containing side lines such as underlines or enclosing frames.
[0002]
[Prior art]
An example of the recognition processing performed by a conventional OCR system is shown in FIG. 26.
[0003]
First, in S2601, a document image is input using an image scanner or the like.
[0004]
Next, in S2602, image blocks are extracted from the input document image. Taking a horizontally written document as an example of block extraction by projection, as shown in FIG. 27, a projection 2701 is taken in the left-right direction to extract the lines, and then a projection 2702 is taken perpendicular to the line direction for each line, which yields the image blocks 2603.
[0005]
As they stand, these blocks include separated blocks 2705, in which one character has been erroneously split into two or more blocks, and combined blocks 2706, in which two or more characters have been erroneously merged and extracted as one block. Therefore, in S2603 a reference character size is obtained for each line from, for example, the average of the image block widths or the most frequent value in the distribution of the image block widths. In S2604, if combining several image blocks gives a width equal to the reference character size, those blocks are combined, and if the width of an image block is an integral multiple of the reference character size, the block is divided into that many equal parts. This determines the character-unit image blocks. In S2605, information on the position, width, and height of each extracted image block is stored.
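The reference character size and the merge and split rules of S2603 and S2604 can be sketched as follows. This is only an illustrative sketch, not the patented implementation: blocks are assumed to be `(x, width)` tuples sorted by `x`, and the function names and the tolerance `tol` are hypothetical choices that do not appear in the source.

```python
from collections import Counter

def reference_char_size(widths):
    """Most frequent block width in a line (ties go to the smaller width)."""
    counts = Counter(widths)
    best = max(counts.values())
    return min(w for w, c in counts.items() if c == best)

def merge_narrow_blocks(blocks, ref, tol=0.2):
    """Merge two neighbouring blocks when their combined span is close to ref."""
    out, i = [], 0
    while i < len(blocks):
        x, w = blocks[i]
        if i + 1 < len(blocks):
            nx, nw = blocks[i + 1]
            span = nx + nw - x
            if w < ref * (1 - tol) and abs(span - ref) <= tol * ref:
                out.append((x, span))
                i += 2
                continue
        out.append((x, w))
        i += 1
    return out

def split_wide_block(x, width, ref, tol=0.2):
    """Split a block whose width is close to an integral multiple of ref."""
    n = round(width / ref)
    if n < 2 or abs(width - n * ref) > tol * ref:
        return [(x, width)]
    part = width / n
    edges = [round(i * part) for i in range(n + 1)]
    return [(x + edges[i], edges[i + 1] - edges[i]) for i in range(n)]
```

The tolerance is an assumption made here because scanned widths rarely match the reference size exactly; the source only states the exact-match conditions.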
[0006]
Thereafter, in S2606, each character image block is subjected to identification calculation, and the category (character) having the highest similarity is set as the recognition result. The result is stored in S2607.
[0007]
For vertically written documents, the recognition processing is performed in the same manner simply by interchanging the horizontal and vertical directions.
[0008]
In this way, the OCR is configured to perform character segmentation after image input in order to perform identification calculation in units of characters.
[0009]
[Problems to be solved by the invention]
However, in the conventional example above, when an image block contains a side line, and especially when the line and the characters touch each other as shown in FIG. 5, they are extracted together as a single image block, as in 501, so the character images cannot be cut out correctly.
[0010]
Likewise, when the conventional character segmentation method is applied to text that includes characters enclosed in a frame as shown in FIG. 17, the framed characters are extracted as a single image block, as in 1701, so the image cannot be cut out correctly in character units.
[0011]
[Means for Solving the Problems]
To solve the above problems, an image processing method according to the present invention comprises: an image block extraction step of extracting, from image blocks cut out by taking projections of an original image, the image blocks that contain character images; a determination step of determining, for each image block extracted in the image block extraction step, whether the block may contain a line image on the basis of the width of the block and a reference character size, and setting each block so determined as a target image block; an inclination calculation step of obtaining the inclination of the character image in the target image block; a histogram calculation step of taking, in the target image block, a histogram in the direction along the inclination obtained in the inclination calculation step; a line image information extraction step of extracting information on the line image on the basis of the position and width of a peak whose frequency in the histogram is equal to or greater than a predetermined threshold; and an extraction step of extracting, on the basis of the line image information so extracted, image information in which the line image has been removed from the target image block.
[0012]
To solve the above problems, an image processing apparatus according to the present invention comprises: image block extraction means for extracting, from image blocks cut out by taking projections of an original image, the image blocks that contain character images; determination means for determining, for each image block extracted by the image block extraction means, whether the block may contain a line image on the basis of the width of the block and a reference character size, and setting each block so determined as a target image block; inclination calculation means for obtaining the inclination of the character image in the target image block; histogram calculation means for taking, in the target image block, a histogram in the direction along the inclination obtained by the inclination calculation means; line image information extraction means for extracting information on the line image on the basis of the position and width of a peak whose frequency in the histogram is equal to or greater than a predetermined threshold; and extraction means for extracting, on the basis of the line image information so extracted, image information in which the line image has been removed from the target image block.
[0029]
[Examples]
(Example 1)
FIG. 1 is a block diagram showing the configuration of a character recognition apparatus for carrying out the present invention. 101 is a CPU that controls the entire apparatus according to a control program stored in the ROM 102; 102 is a ROM that stores the control programs of the apparatus, including the processing shown in the flowcharts described below that the CPU 101 executes; 103 is a RAM that stores document images and other data; 104 is a display; and 105 is a keyboard, from which various commands are entered and the images and text shown on the display 104 are edited. 106 is an image scanner that optically reads the document image of an original.
[0030]
An outline of this embodiment, as executed by the character recognition apparatus having the configuration shown in FIG. 1, will be described with reference to the flowchart of FIG. 2.
[0031]
First, in S201, the document image of an original placed on the image scanner 106 is input and stored in the RAM 103. The input image stored in the RAM 103 may be displayed on the display 104. When a recognition frame in which recognition processing is to be performed is set on the display 104, the recognition frame information is also stored in the RAM 103. Image input is not limited to the image scanner; the image may be read from a storage device (not shown) or received from another computer via communication means.
[0032]
Next, in S202, image blocks are extracted from the input document image stored in the RAM 103 in the same manner as in step S2602 of the conventional example, and in S203 the reference character size is obtained in the same manner as in step S2603. The information obtained in S202 and S203 is stored in the RAM 103.
[0033]
Then, in S204, each image block extracted in S202 is examined in turn to determine whether it contains a side line. If S205 finds that the block of interest contains a side line, S206 extracts from that block the image with the side line removed and treats it as a new image block that contains no side line.
[0034]
When all the image blocks have been extracted, S207 further divides the extracted image blocks, using the reference character size obtained in S203 and stored in the RAM 103, to determine the character-unit image blocks. In S208, the information on each extracted character-unit image block (reference coordinates, width, height) is stored in the RAM 103 character by character, together with its position in the input image and its order of appearance (output order). In S209, the stored image block information is read out one block at a time, an identification calculation is performed between each character image block and the character pattern dictionary data stored in the ROM 102, and the category (character) with the highest similarity is taken as the recognition result. In S210, the result is stored in the RAM 103 together with the position, in the input image, of the character image that produced it. The recognition result may be displayed on the display 104. In S210, the top several candidates may also be stored as candidate characters in addition to the character with the highest similarity.
[0035]
The determination in S204 of whether a block contains a side line will be described in detail using the flowchart of FIG. 3. In S301, the ratio of the width of the target image block to the reference character size, R = (width of target image block) / (reference character size), is obtained. In S302, R is compared with a preset threshold T. If R is greater than T, the process proceeds to S303 and the block is judged to contain a side line; if R is smaller than T, the process proceeds to S304 and the block is judged not to contain a side line. Because a side line usually spans several characters, T is set to 2 or more. For a line in which a side line runs along the entire line, so that the reference character size cannot be obtained from the average or the distribution of the image block widths, the line height may be used as the reference character size in S203. In S204, the processing shown in the flowchart of FIG. 3 is applied to all the image blocks extracted in S202, and identification information indicating whether or not it is a side line block is attached to every block.
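The ratio test of S301 to S304 amounts to a single comparison. A minimal sketch follows; the function name is hypothetical, and the case R = T, which the source leaves open, is treated here as "no side line" by assumption.

```python
def may_contain_side_line(block_width, ref_size, t=2.0):
    """S301-S304: flag a block as a side line candidate when R = width / ref_size exceeds T."""
    r = block_width / ref_size
    return r > t

def reference_size_or_line_height(ref_size, line_height):
    """Fallback described for S203: use the line height when no reference size was found."""
    return line_height if ref_size is None else ref_size
```

For example, a block 100 pixels wide in a line whose reference character size is 20 pixels gives R = 5 and is passed on to S206, while a 30-pixel block gives R = 1.5 and is left alone.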
[0036]
In S205, the image blocks carrying the identification information added in S204 that marks them as side line blocks are identified, and the process proceeds to S206, which removes the side line from each such image block.
[0037]
The extraction in S206 of the image block with the side line removed will be described in detail using the flowchart of FIG. 4. First, in S401, the side line information needed for extracting the image block is extracted from the target image block containing the side line. In this embodiment, the width, the inclination, and the position of the side line are obtained as the side line information.
[0038]
As shown in FIG. 6, the inclination θ of the side line is obtained by setting two regions 602 in the target image block, separated by a suitable interval L, taking the projection 601 in each region, and using the offset d between the two projections together with L: tan θ = d / L. The projection used here is obtained by scanning the pixels in the left-right direction and setting the value to 1 if there is even one black pixel. The interval L may be chosen according to the width of the target image block so that the two regions lie slightly inside the block edges (for example, by half a character width). Alternatively, if the block is wide enough (for example, R = 10), a suitably chosen L (for example, twice the reference character size) may be placed at any suitable position.
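The relation tan θ = d / L can be illustrated with a short sketch. It assumes a binary image stored as a list of rows (1 = black pixel, y increasing downward) and measures the offset d at the lowest black row of each region, which suits an underline; the source does not say which feature of the two projections is compared, so that choice, the strip width, and all names are assumptions.

```python
import math

def row_projection(img, x0, x1):
    """Binary projection: 1 for every row with at least one black pixel in columns x0..x1-1."""
    return [1 if any(row[x0:x1]) else 0 for row in img]

def lowest_black_row(proj):
    """Index of the lowest row marked 1, or None if the projection is empty."""
    for y in range(len(proj) - 1, -1, -1):
        if proj[y]:
            return y
    return None

def estimate_tilt(img, xa, xb, strip=2):
    """Angle theta (radians) from the offset d between two regions spaced L = xb - xa apart."""
    pa = lowest_black_row(row_projection(img, xa, xa + strip))
    pb = lowest_black_row(row_projection(img, xb, xb + strip))
    if pa is None or pb is None:
        return None
    d = pb - pa
    return math.atan2(d, xb - xa)
```

With this sign convention a positive angle means the line descends toward the right of the block.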
[0039]
The position of the side line is below the characters (underline), above them (overline), or both. To obtain the position, as shown in FIG. 7, a histogram of black pixels is taken over the target image block 701, which is inclined by the angle θ (here, an image containing the two characters "傍線" and an underline), while shifting up or down by one pixel every 1/tan θ pixels. The result is shown in FIG. 8. In most cases the histogram peak lies within the bottom 1/4 of the block height for an underline and within the top 1/4 for an overline. If the maximum value of the histogram is sufficiently large relative to the width of the image block (that is, larger than a threshold H_TH determined from the block width), a side line is judged to exist at that position. The position of the side line is then obtained as follows: if the maximum value (or the next peak) exceeding H_TH lies in the top 1/4, the line is an overline; if it lies in the bottom 1/4, the line is an underline. Besides the histogram method, there is also a method that examines the continuity of black pixels while shifting by one pixel every 1/tan θ pixels and judges a black run to be a side line if it is sufficiently long relative to the width of the image block. A run that is broken partway is treated as connected when the gap is very narrow and can be attributed to printing accuracy, and it is likewise judged to be a side line if it is sufficiently long relative to the block width.
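The inclined histogram and the 1/4-height position test can be sketched as below. This is an illustrative sketch only: the image is assumed to be a list of rows of 0/1 values, the threshold H_TH is taken as half the block width (the example value given for the width measurement), and the function names and return labels are hypothetical.

```python
def skewed_row_histogram(img, tan_theta):
    """Count black pixels along lines of slope tan_theta; hist[y] is indexed by the row at x = 0."""
    h, w = len(img), len(img[0])
    hist = [0] * h
    for y in range(h):
        for x in range(w):
            yy = y + int(round(x * tan_theta))  # one-pixel shift every 1/tan(theta) columns
            if 0 <= yy < h and img[yy][x]:
                hist[y] += 1
    return hist

def side_line_position(hist, block_width, ratio=0.5):
    """Return 'over', 'under', 'both', or None from peaks in the top and bottom quarters."""
    h_th = ratio * block_width
    quarter = max(1, len(hist) // 4)
    over = max(hist[:quarter]) > h_th
    under = max(hist[-quarter:]) > h_th
    if over and under:
        return "both"
    if over:
        return "over"
    if under:
        return "under"
    return None
```

Because the count follows the line's slope, a tilted underline still produces one tall, narrow peak instead of being smeared over several rows.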
[0040]
As shown in FIG. 8, the width W of the side line is obtained by taking the histogram of black pixels over the portion where the side line exists, shifting by one pixel every 1/tan θ pixels, and measuring, for the portion containing the outermost peak, the width w over which the frequency exceeds a predetermined value H_TH (for example, half the block width). If this width tends to come out too small, the side line width may be taken as w + α.
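Measuring the width w of the outermost peak can be sketched as a run search over the histogram. The function name and the `from_bottom` flag are hypothetical; the margin α mentioned in the text would simply be added to the returned width.

```python
def side_line_rows(hist, h_th, from_bottom=True):
    """Return (first_row, width) of the outermost run of rows whose count exceeds h_th."""
    if from_bottom:
        order = range(len(hist) - 1, -1, -1)
    else:
        order = range(len(hist))
    run = []
    for y in order:
        if hist[y] > h_th:
            run.append(y)
        elif run:
            break  # the outermost peak has ended
    if not run:
        return None
    return min(run), len(run)
```

For a histogram such as `[0, 3, 12, 9, 2, 0, 38, 40, 39, 1]` with H_TH = 20, the underline occupies rows 6 to 8, so the measured width is 3.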
[0041]
If no side line information is obtained as a result, the image block extraction processing ends at S402.
[0042]
When the side line information has been obtained, S403 extracts image blocks so that they do not contain the side line. When character-unit image blocks are extracted using a vertical projection as described for the conventional example, the projection is taken, as shown in FIG. 9, with the side line width w excluded and with a shift of one pixel every 1/tan θ pixels. This gives the projection 901 and, as a result, the character-unit image blocks 902, in which the characters "傍" and "線" are separated. In S206, the processing shown in the flowchart of FIG. 4 is repeated in turn for every block identified as a side line block in S205.
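The projection of S403, taken with the side line rows excluded, can be sketched as follows. The sketch assumes a binary image as a list of rows, a side line described by its top row at x = 0 and its width, and a hypothetical function name; it returns `(x, width)` runs of columns that still contain black pixels once the line is ignored.

```python
def segment_without_line(img, line_top, line_width, tan_theta=0.0):
    """Column projection that skips the side line rows, then splits it into runs."""
    h, w = len(img), len(img[0])
    col = [0] * w
    for x in range(w):
        shift = int(round(x * tan_theta))  # follow the inclination of the line
        top = line_top + shift
        for y in range(h):
            if top <= y < top + line_width:
                continue  # ignore pixels that belong to the side line
            if img[y][x]:
                col[x] = 1
                break
    blocks, start = [], None
    for x, v in enumerate(col + [0]):  # trailing 0 closes the last run
        if v and start is None:
            start = x
        elif not v and start is not None:
            blocks.append((start, x - start))
            start = None
    return blocks
```

Characters that touch the underline are separated because only the rows above (or below) the line contribute to the projection.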
[0043]
As described above, side line information is extracted from each block judged to contain a side line, and image blocks are extracted only from the blocks for which that information was obtained. As a result, character images can be extracted correctly from image blocks containing side lines even when the accuracy of the side line block determination is not high.
[0044]
Further, because the inclination of the side line is obtained, character images can be extracted accurately even from an inclined image block.
[0045]
This embodiment has been described with the minimum configuration needed to carry out the present invention. However, there is no problem if the apparatus is also configured so that, for example, an operator can specify the recognition area or correct the recognition result, the recognition area is determined automatically without operator intervention, or processing for reducing misrecognition is added.
[0046]
In this embodiment, a horizontally written document has been described. However, vertical writing can be performed in the same manner by only changing the row direction and the column direction.
[0047]
A program that performs the processing of the present invention may also be supplied externally to a general-purpose computer, with the control program of the apparatus stored in the RAM; nor is the storage device limited to a RAM.
[0048]
This embodiment has described an example in which S204 and S206 are each repeated for all the blocks. Alternatively, a loop may be set up so that S204 to S206 are applied to one block and then to the next block.
[0049]
(Example 2)
Another example of the determination in S204 of whether a block contains a side line, and of the extraction in S206 of the image block with the side line removed, in the flowchart of FIG. 2 will be described with reference to FIGS. 10 and 11. The rest is the same as in Example 1, and its description is omitted.
[0050]
The determination in S204 of whether a block contains a side line will be described using the flowchart of FIG. 10. In S1001, a projection of the upper part of the block is taken; for example, the projection from the top of the image block down to 1/4 of its height. In S1002, the maximum width L1 of the upper projection is obtained. In S1003 a projection of the lower part is taken, and in S1004 the maximum width L2 of the lower projection is obtained. If a side line is present, the projection width is close to the width of the image block. Therefore, with L as a threshold derived from the width of the image block, S1005 judges that a side line is included if L1 > L or L2 > L and proceeds to S1006; otherwise it judges that no side line is included and proceeds to S1010. With W as the width of the image block, L may be set to, for example, L = 0.9 × W. When a side line is judged to be included, the side line information is extracted in S1006 by the same method as in S401. If no side line information is extracted, the process proceeds to S1010 and the image block is treated as containing no side line. The extracted side line information is stored in the RAM 103 in S1008, and in S1009 the target image block is marked as a block containing a side line and the processing ends. L1 and L2 may be obtained in the reverse order. As a faster way of extracting the side line information in S1006, the image block may be searched, from the bottom for an underline or from the top for an overline, at a spacing of L for the points where the line is first hit; d is obtained from the difference between these points, and θ is derived from it.
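The local projection test of S1001 to S1005 can be sketched as below. The source does not define "maximum width of the projection" precisely; this sketch assumes it means the longest unbroken run of columns that contain a black pixel within the top or bottom quarter of the block. The function names are hypothetical, and the image is a list of rows of 0/1 values.

```python
def longest_run(bits):
    """Length of the longest consecutive run of truthy values."""
    best = cur = 0
    for b in bits:
        cur = cur + 1 if b else 0
        best = max(best, cur)
    return best

def band_max_width(img, top):
    """Longest run of occupied columns in the top (or bottom) quarter of the block."""
    h, w = len(img), len(img[0])
    q = max(1, h // 4)
    rows = img[:q] if top else img[h - q:]
    cols = [1 if any(r[x] for r in rows) else 0 for x in range(w)]
    return longest_run(cols)

def contains_side_line(img, ratio=0.9):
    """S1005: side line if L1 > L or L2 > L, with L = ratio * block width."""
    limit = ratio * len(img[0])
    l1 = band_max_width(img, top=True)
    l2 = band_max_width(img, top=False)
    return l1 > limit or l2 > limit
```

Characters alone rarely fill 90% of the block width without a gap inside a quarter-height band, which is why this test is more selective than the width ratio R used in Example 1.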
[0051]
Another method of extracting the image block excluding the side line in S206 will be described with reference to the flowchart of FIG. 11. In S1101, image blocks are extracted so as not to include the side line, using the side line information stored in the RAM 103. The extraction method may be the same as in S403.
[0052]
As described above, using local projections of the image block and their widths for the determination of a block including a side line improves the determination accuracy. Further, because the side line information is extracted at that time, image block extraction is not attempted on image blocks that do not include a side line, so processing time is not wasted.
[0053]
(Example 3)
An outline of the processing of another embodiment executed by the character recognition apparatus having the configuration shown in FIG. 1 will be described with reference to the flowchart of FIG. 12.
[0054]
In step S1201, a document image is input. In step S1202, an image block is extracted from the input document image. In step S1203, a reference character size is obtained. In step S1204, it is determined whether the block includes a side line. If the target block is a block including a side line in S1205, an image block excluding the side line is extracted in S1206. The steps so far are the same as those in the first embodiment and have been described in detail in S201 to S206, and thus detailed description thereof is omitted here. If the image block obtained in S1206 still contains a side line, it is checked again in S1211 whether or not to extract the image block excluding the side line, and if so, S1206 is executed again. The determination method in S1211 may be the same as that in S1204, and the image block may be extracted again even when the horizontal width of the image block does not change before and after the side line extraction.
[0055]
When all the image blocks have been extracted, image blocks in character units are determined using the reference character size in S1207, the extracted image block information is stored in S1208, an identification operation is performed in S1209, and the result is stored in S1210. The processing of S1207 to S1210 is the same as that of S207 to S210.
[0056]
As described above, after an image block excluding the side line has been extracted from an image block including a side line, it is determined whether to extract an image block excluding a side line once more. As a result, character image blocks can be extracted even from image blocks that include double or multiple side lines, or side lines that are partially multiplexed.
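The repeated check of S1211 followed by S1206 amounts to a simple loop, sketched below. The detector and the removal routine are passed in as functions because the text allows them to be the same as in S1204 and S1206; the names `has_side_line`, `remove_side_line` and the safety limit `max_passes` are assumptions introduced only for this illustration.

```python
def remove_all_side_lines(block, has_side_line, remove_side_line, max_passes=8):
    """Repeat the removal (S1206) while the re-check (S1211) still finds a line.

    max_passes is a hypothetical guard so the loop always terminates.
    Returns the cleaned block and the number of removal passes performed.
    """
    passes = 0
    while passes < max_passes and has_side_line(block):
        block = remove_side_line(block)
        passes += 1
    return block, passes


if __name__ == "__main__":
    # Toy stand-ins: a "side line" is a bottom row that is entirely black,
    # and removing it simply drops that row.
    def bottom_row_is_line(b):
        return bool(b) and all(b[-1])

    def drop_bottom_row(b):
        return b[:-1]

    double_underline = [[0, 1, 0], [1, 1, 1], [1, 1, 1]]
    print(remove_all_side_lines(double_underline, bottom_row_is_line, drop_bottom_row))
```

With the toy functions shown, a double underline is removed in two passes, which is the situation this embodiment is aimed at.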
[0057]
(Example 4)
Storage of the identification result in S210 in the flowchart of FIG. 2 will be described with reference to FIG. 13. If the result of the identification operation for FIG. 5 is “This is a side line”, an underline start control code UI 1301 is placed before “side” and an underline end control code UO 1302 is placed after “line”, and the data is stored in the RAM 103 as shown in FIG. 13(a). If a side line can be displayed on the display 104, the recognition result is displayed with an underline drawn under the characters from UI to UO. Control codes may similarly be defined for overlines and multiple lines.
[0058]
If each character has an attribute related to a side line instead of a control code, the attributes of “side” and “line” may be set to underlined, and the result may be stored as shown in FIG. 13.
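The two storage forms described above, control codes around the underlined run and a per-character attribute, can be compared with a small sketch. The marker strings `<UI>` and `<UO>`, the dictionary layout and the function names are hypothetical stand-ins; the actual code values and the data structure of FIG. 13 are not specified here.

```python
UI = "<UI>"  # hypothetical stand-in for the underline start control code (1301)
UO = "<UO>"  # hypothetical stand-in for the underline end control code (1302)


def with_control_codes(chars, underlined):
    """Insert UI before and UO after every run of underlined characters."""
    out = []
    inside = False
    for ch, flagged in zip(chars, underlined):
        if flagged and not inside:
            out.append(UI)
        elif not flagged and inside:
            out.append(UO)
        inside = flagged
        out.append(ch)
    if inside:                 # close a run that reaches the end of the text
        out.append(UO)
    return out


def with_attributes(chars, underlined):
    """Store one record per character carrying an underline attribute."""
    return [{"char": ch, "underline": flagged}
            for ch, flagged in zip(chars, underlined)]


if __name__ == "__main__":
    chars = ["a", "b", "c", "d"]
    flags = [False, True, True, False]
    print(with_control_codes(chars, flags))
    print(with_attributes(chars, flags))
```

The control-code form keeps the character stream compact, while the attribute form lets further fields such as candidate characters be attached to each record.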
[0059]
Note that the data structure of the result is not limited to that shown in FIG. 13, and information such as candidate characters may be included therein.
[0060]
As described above, by storing characters with a side line, there is an effect that information other than characters of the input image can be reproduced without loss.
[0061]
(Example 5)
In the first embodiment described above, a method of correctly cutting out characters from image information including a character string with a side line such as an underline, as shown in FIG. 5, was described. In this embodiment, a method of correctly cutting out characters from an image including a character string surrounded by an enclosing frame, as shown in FIG. 17, will be described.
[0062]
The present embodiment is implemented by the character recognition device having the configuration described in detail with reference to FIG. 1 in the first embodiment.
[0063]
An outline of this embodiment executed by the character recognition apparatus having the configuration shown in FIG. 1 will be described with reference to the flowchart of FIG. 14. The characteristic processing of this embodiment runs from the determination of an enclosing frame block in S1401 to the removal of the enclosing frame in S1403, and these three steps are described in detail here. Processing steps that are the same as those in the flowchart of FIG. 2 described in the first embodiment are denoted by the same step numbers, and their description is omitted.
[0064]
For each image block extracted in S203, whether or not it is a block including an enclosing frame is sequentially determined for each block in S1401. If the target block is a block including an enclosing frame in S1402, an image obtained by removing the enclosing frame from the image of the block is extracted in S1403 to obtain a new image block that does not include the enclosing frame.
[0065]
The determination of a block including an enclosing frame in S1401 will be described in detail with reference to the flowchart of FIG. 15. In S1501, the ratio R = (width of target image block) / (reference character size) of the width of the target image block to the reference character size is obtained. In S1502, R is compared with a preset threshold value T. If R is smaller than T, the process proceeds to S1503, and the block is determined not to include an enclosing frame. If R is larger than T, the process proceeds to S1504 to confirm whether the block is an enclosing frame block. In S1504, the projection of the upper part of the image block is taken. For example, the projection from the top of the image block down to ¼ of its height may be taken. In S1505, the maximum width L1 of the upper projection taken in S1504 is obtained. If there is an enclosing frame, the projection width is close to the width of the image block. In S1506, L1 is compared with a threshold L obtained from the width of the image block (for example, 0.9 times the block width). If L1 < L, it is determined that the block is not an enclosing frame block, and the process proceeds to S1503. Similarly, the projection of the lower part is taken in S1507, and the maximum width L2 of the lower projection is obtained in S1508. If L2 < L in S1509, it is determined that the block is not an enclosing frame block, and the process proceeds to S1503. Otherwise, the process proceeds to S1510, and the block is determined to be an enclosing frame block. In S1503 and S1510, identification information indicating whether or not the image block is an enclosing frame block is added to the target image block and stored in the RAM 103. Note that the order of obtaining L1 and L2 may be reversed.
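The decision sequence of S1501 to S1510 can be sketched as follows. This is an illustration under assumptions, not the implementation of the embodiment: the block is a list of rows of 0/1 values, the "maximum width" of a projection is taken to be the largest black-pixel count of a single row in the band, and the default value of T is arbitrary because the text gives none. The case R = T is not specified in the text and is treated here as passing the first test.

```python
def max_row_count(rows):
    """Largest number of black pixels (1s) found in any single row."""
    return max((sum(row) for row in rows), default=0)


def is_enclosing_frame_block(block, ref_char_size, t=2.0, ratio=0.9, band=0.25):
    """Follow S1501-S1510 for a binary block given as rows of 0/1.

    t     : threshold T for R = width / reference character size (assumed value)
    ratio : L = ratio * block width (0.9 is the example in the text)
    band  : fraction of the height used for the upper and lower projections
    """
    height = len(block)
    width = len(block[0]) if height else 0
    if width / ref_char_size < t:                          # S1501-S1502
        return False                                       # S1503
    limit = ratio * width                                  # threshold L
    band_rows = max(1, int(height * band))
    if max_row_count(block[:band_rows]) < limit:           # S1504-S1506 (L1 < L)
        return False
    if max_row_count(block[height - band_rows:]) < limit:  # S1507-S1509 (L2 < L)
        return False
    return True                                            # S1510


if __name__ == "__main__":
    framed = [[1] * 20] + [[1] + [0] * 18 + [1] for _ in range(6)] + [[1] * 20]
    print(is_enclosing_frame_block(framed, ref_char_size=5))
```

Unlike the side line determination, both the upper and the lower projection must reach the threshold, so a block with only an underline is rejected.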
[0066]
The extraction of the image block excluding the enclosing frame in S1403 will be described in detail with reference to the flowchart of FIG. 16. First, in S1601, information on the enclosing frame that is necessary for extracting an image block is extracted from the target image block. In this embodiment, the line width w of the enclosing frame and the inclination θ of the enclosing frame are obtained.
[0067]
As shown in FIG. 18, the inclination θ of the enclosing frame is obtained by setting two regions 1800 in the target image block at an appropriate interval L (L is set in the same way as in the first embodiment), obtaining a projection 1801 of the image data of each region, and computing θ from the deviation d between the projections and L as tan θ = d / L. To obtain it accurately, the inclination is calculated at both the top and the bottom. If the two values are close, their average is taken as the inclination of the target image block. If the values are not close, the inclination is left undetermined; since the frame information then cannot be extracted, the processing of the flowchart of FIG. 16 for that image block is ended.
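The relation tan θ = d / L can be made concrete with the following sketch. It assumes a binary block stored as rows of 0/1 values, uses narrow column strips as the two regions, and takes the row with the most black pixels in each strip as the position of the frame line. The strip width, the tolerance used to decide that the top and bottom estimates are "close", and all function names are assumptions for illustration.

```python
import math


def peak_row(block, x0, x1, rows):
    """Row (among `rows`) with the most black pixels in columns x0..x1-1."""
    return max(rows, key=lambda y: sum(block[y][x0:x1]))


def estimate_inclination(block, x_left, x_right, rows, strip=3):
    """Angle theta with tan(theta) = d / L, from two narrow regions."""
    d = (peak_row(block, x_right, x_right + strip, rows)
         - peak_row(block, x_left, x_left + strip, rows))  # deviation d
    interval = x_right - x_left                            # interval L
    return math.atan2(d, interval)


def frame_inclination(block, x_left, x_right, tol=math.radians(2.0)):
    """Average of the top and bottom estimates, or None if they disagree."""
    half = len(block) // 2
    top = estimate_inclination(block, x_left, x_right, range(half))
    bottom = estimate_inclination(block, x_left, x_right,
                                  range(half, len(block)))
    if abs(top - bottom) > tol:
        return None            # inclination undetermined
    return (top + bottom) / 2.0


if __name__ == "__main__":
    # Top and bottom frame lines that both step down one pixel at x = 10.
    tilted = [[1 if y in (1 + x // 10, 9 + x // 10) else 0 for x in range(20)]
              for y in range(12)]
    theta = frame_inclination(tilted, 2, 12)
    print(math.degrees(theta))
```

Returning `None` corresponds to the case in which the frame information cannot be extracted and the processing for the block is ended.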
[0068]
As shown in FIG. 19, the line width w of the enclosing frame is obtained by taking a histogram of black pixels while shifting the target image block up or down by one pixel every 1 / tan θ pixels, and, for the portion containing the outermost peak of the image block, taking the width w of the part where the histogram exceeds a predetermined threshold. If the width obtained in this way tends to be too small, the line width of the enclosing frame may be set to w + α. If the line width differs between the upper and lower sides of the enclosing frame, both the upper and lower line widths are obtained.
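A sketch of the sheared histogram and the line width measurement is given below. It assumes a binary block stored as rows of 0/1 values, takes tan θ directly as a parameter, and rounds the vertical shift to the nearest pixel, which yields a one-pixel step roughly every 1 / tan θ columns. The threshold value and the function names are hypothetical, and the peak test uses "greater than or equal to", following the wording of the claims.

```python
import math


def sheared_histogram(block, tan_theta):
    """Count black pixels along lines inclined by theta.

    Entry y of the result counts the pixels on the inclined line that
    starts at row y in column 0.
    """
    height = len(block)
    width = len(block[0]) if height else 0
    hist = [0] * height
    for x in range(width):
        shift = math.floor(x * tan_theta + 0.5)   # vertical offset at column x
        for y in range(height):
            src = y + shift
            if 0 <= src < height and block[src][x]:
                hist[y] += 1
    return hist


def outermost_peak(hist, threshold, from_top=True):
    """(first_row, width w) of the outermost run with hist >= threshold."""
    order = range(len(hist)) if from_top else range(len(hist) - 1, -1, -1)
    start, run = None, 0
    for y in order:
        if hist[y] >= threshold:
            if start is None:
                start = y
            run += 1
        elif start is not None:
            break
    if start is None:
        return None
    return (start if from_top else start - run + 1), run


if __name__ == "__main__":
    # A 2-pixel-thick line that drops one row every four columns.
    line = [[1 if y - (x + 2) // 4 in (2, 3) else 0 for x in range(16)]
            for y in range(10)]
    print(outermost_peak(sheared_histogram(line, 0.25), 0.9 * 16))
```

Taking the histogram without the shear spreads the same line over several rows, so no row reaches the threshold; this is the reason the inclination is obtained first.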
[0069]
If the information on the enclosing frame has not been obtained at this stage, it is determined in S1602 that the information cannot be extracted, and the image block extraction processing in the flowchart of FIG. 16 is ended. The frame inclination θ and the frame line width w extracted in S1601 are stored in the RAM 103 in association with the image block.
[0070]
If the information about the enclosing frame has been obtained, image blocks are extracted in S1603 so as not to include the enclosing frame. When image blocks in character units are extracted using the projection in the vertical direction as described in the conventional example, the projection 2001 is taken, as shown in FIG. 20, so that the line width w of the enclosing frame is excluded in the vertical direction and while shifting by one pixel every 1 / tan θ pixels; as a result, image blocks 2002 are obtained. Of the obtained blocks, the left and right end blocks 2003 are part of the enclosing frame if their width is close to the line width w of the enclosing frame. These left and right end blocks are therefore not treated as character image blocks. If there is no block with a width approximately equal to the line width of the enclosing frame at the left and right ends of the target image block, it is determined that the image block does not include an enclosing frame.
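The extraction in S1603 can be sketched as a column projection that skips the top and bottom frame lines, followed by a check of the end blocks. The sketch assumes a binary block stored as rows of 0/1 values and a frame whose line width is the same on all sides; the tolerance `slack` used for "close to the line width w" and the function names are hypothetical.

```python
import math


def column_projection(block, w, tan_theta=0.0):
    """Black-pixel count per column, skipping w rows at the top and bottom.

    The skipped bands follow the inclination: the row window moves by one
    pixel roughly every 1 / tan(theta) columns.
    """
    height = len(block)
    width = len(block[0]) if height else 0
    proj = []
    for x in range(width):
        shift = math.floor(x * tan_theta + 0.5)
        rows = range(w + shift, height - w + shift)
        proj.append(sum(block[y][x] for y in rows if 0 <= y < height))
    return proj


def runs(proj):
    """Column ranges (start, end), end exclusive, where the projection is > 0."""
    out, start = [], None
    for x, value in enumerate(proj + [0]):   # trailing 0 closes the last run
        if value and start is None:
            start = x
        elif not value and start is not None:
            out.append((start, x))
            start = None
    return out


def character_blocks(block, w, tan_theta=0.0, slack=1):
    """Blocks between the frame sides, or None if no frame sides are found."""
    blocks = runs(column_projection(block, w, tan_theta))

    def is_frame_side(b):
        return abs((b[1] - b[0]) - w) <= slack

    if len(blocks) < 2 or not (is_frame_side(blocks[0])
                               and is_frame_side(blocks[-1])):
        return None            # treated as a block without an enclosing frame
    return blocks[1:-1]


if __name__ == "__main__":
    framed = [[1 if (y in (0, 8) or x in (0, 12)
                     or (3 <= y <= 5 and (2 <= x <= 4 or 6 <= x <= 9))) else 0
               for x in range(13)] for y in range(9)]
    print(character_blocks(framed, w=1))
```

The blocks returned here still have to be merged or divided with the reference character size in the later step, as in the conventional processing.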
[0071]
(Example 6)
Another embodiment of determining a block including a surrounding frame in S1401 and extracting an image block of the surrounding frame in S1403 will be described. Other processing steps are the same as those in the first embodiment, and a description thereof will be omitted.
[0072]
The determination of a block including an enclosing frame in S1401 will be described with reference to the flowchart of FIG. 21. In S2101, the ratio R of the width of the target image block to the reference character size is obtained. If R is smaller than T in S2102, the process proceeds to S2103, where it is determined that the block does not include an enclosing frame. If R is larger than T, the outermost boundary line of the target image block is traced in S2104. If the boundary line exceeds the permissible range, it is determined in S2104 that the block is not an enclosing frame, and the process proceeds to S2103. An example of the permissible range will be described with reference to FIG. 22. In the left half of the block, if the x coordinate of the boundary line lies to the right of a predetermined x1 while its y coordinate lies between predetermined values y1 and y2, the boundary line is regarded as not tracing the enclosing frame and as exceeding the permissible range. In the upper half of the block, the permissible range is exceeded if the y coordinate lies below y1 while the x coordinate lies between x1 and x2. The same applies to the right half and the lower half.
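The permissible range test can be sketched as a check over the traced boundary points. The sketch assumes image coordinates with y increasing downward, a boundary given as a list of (x, y) points, and that the right half and the lower half mirror the stated rules using x2 and y2; that mirroring and the function name are assumptions, since the text only says that the same applies.

```python
def boundary_within_range(points, width, height, x1, x2, y1, y2):
    """Return True if no traced boundary point leaves the permissible range."""
    half_w, half_h = width / 2, height / 2
    for x, y in points:
        if y1 < y < y2:
            if x < half_w and x > x1:      # left half: too far to the right
                return False
            if x >= half_w and x < x2:     # right half: too far to the left
                return False
        if x1 < x < x2:
            if y < half_h and y > y1:      # upper half: too far down
                return False
            if y >= half_h and y < y2:     # lower half: too far up
                return False
    return True


if __name__ == "__main__":
    # Outline of a 20 x 10 rectangle, as traced along an enclosing frame.
    perimeter = [(x, y) for x in range(20) for y in range(10)
                 if x in (0, 19) or y in (0, 9)]
    print(boundary_within_range(perimeter, 20, 10, 3, 16, 2, 7))
    # A boundary that dips into the middle of the block, as character strokes do.
    print(boundary_within_range(perimeter + [(8, 4)], 20, 10, 3, 16, 2, 7))
```

Under these assumptions, and provided x1 and x2 lie on either side of the horizontal midline position and y1 and y2 on either side of the vertical one, the four rules together state that the boundary must stay out of the inner rectangle bounded by x1, x2, y1 and y2.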
[0073]
If the boundary line does not exceed the permissible range, the process advances to S2106 to extract information on the enclosing frame. The extraction method may be the same as S1601 in the fifth embodiment. If the information of the enclosing frame cannot be extracted, the process proceeds from S2107 to S2103 and the determination ends. The extracted frame information is stored in the RAM 103 in S2108, and the process advances to S2109, where the block is determined to be an enclosing frame block.
[0074]
The extraction of the image block excluding the enclosing frame in S1403 will be described with reference to the flowchart of FIG. 23. In S2301, image blocks are extracted so as not to include the frame, using the stored frame information. The extraction method may be the same as S1603 in the fifth embodiment.
[0075]
As described above, using the boundary line for the determination of an enclosing frame block improves the determination accuracy. Further, since the information of the enclosing frame is extracted during that determination, image block extraction is not attempted on image blocks that are not enclosing frame blocks.
[0076]
(Example 7)
An outline of another embodiment of character segmentation excluding the enclosing frame, different from the fifth embodiment and executed by the character recognition apparatus having the configuration shown in FIG. 1, will be described with reference to the flowchart of FIG. 24. The same processing steps as those shown in the flowchart of FIG. 14 of the fifth embodiment are denoted by the same step numbers, and their description is omitted here.
[0077]
A characteristic processing step in the present embodiment is S2401, and if the image block obtained in S1403 still includes an enclosing frame, it is checked in S2401 whether or not the image block excluding the enclosing frame is extracted again. If so, S1403 is executed again. This redetermination method may be the same as that in S1401.
[0078]
When all image blocks have been extracted, image blocks in character units are determined using the reference character size in S207, the extracted image block information is stored in S208, an identification operation is performed in S209, and the result is stored in S210.
[0079]
As described above, after an image block excluding the enclosing frame has been extracted from an image block including an enclosing frame, it is determined whether to extract an image block excluding an enclosing frame once more. As a result, character image blocks can be extracted even from image blocks that include multiple enclosing frames.
[0080]
(Example 8)
Storage of the calculation result in S1410 in the flowchart of FIG. 14 will be described with reference to FIG. 25. Assuming that the result of the identification operation for FIG. 17 is “this is a box”, a control code FI 2501 marking the start of the enclosing frame is placed before the first character of the enclosed string and a control code FO 2502 marking its end is placed after its last character, “frame”, as shown in FIG. 25A, and the result is stored in the RAM 103. If an enclosing frame can be displayed on the display 104, the recognition result is displayed with the characters from FI to FO enclosed in a frame.
[0081]
In addition, if each character has an attribute related to an enclosing frame instead of a control code, the result may be stored with the attribute of each of “enclosed”, “mi”, and “frame” set to enclosing frame, as shown in FIG. 25.
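The two storage forms carry the same information, which the following sketch illustrates by converting a stream with frame control codes into per-character attributes. The marker strings `<FI>` and `<FO>`, the record layout and the function name are hypothetical stand-ins for the codes 2501 and 2502 and for the data structure of FIG. 25.

```python
FI = "<FI>"  # hypothetical stand-in for the frame start control code (2501)
FO = "<FO>"  # hypothetical stand-in for the frame end control code (2502)


def codes_to_attributes(stream):
    """Turn a character stream containing FI/FO into per-character records."""
    records = []
    framed = False
    for item in stream:
        if item == FI:
            framed = True
        elif item == FO:
            framed = False
        else:
            records.append({"char": item, "framed": framed})
    return records


if __name__ == "__main__":
    print(codes_to_attributes(["a", FI, "b", "c", FO, "d"]))
```

A display routine could use either form to decide which characters to draw inside a frame.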
[0082]
Note that the data structure of the result is not limited to that shown in FIG. 13, and information such as candidate characters may be included therein.
[0083]
As described above, by storing characters with a surrounding frame, there is an effect that information other than characters of the input image can be reproduced without loss.
[0084]
[Effects of the invention]
As described above, according to the present invention, an image from which the line image has been removed is extracted from image information including line images other than characters. This saves the time and effort of erasing the lines on the document before inputting the image, and characters to which lines are added can be correctly cut out in character units.
[0085]
As described above, according to the present invention, an image can be correctly cut out in units of characters even when an image of a character to which a side line such as an underline or overline is added is input.
[0086]
As described above, according to the present invention, even when an image of a character surrounded by a surrounding frame is input, the image can be correctly cut out in units of characters.
[0087]
As described above, according to the present invention, whether or not a block includes a line image other than characters is determined based on the ratio between the width of the extracted image block and the reference character width, so that an image block to which a line is added can be identified.
[0089]
As described above, according to the present invention, the image information from which the line image is removed is extracted, so that it can be divided into character units with easy processing.
[0090]
As described above, according to the present invention, information related to the line width is extracted and the line image is removed according to this information, so that line images can be removed even when line images of various line widths are added to the characters.
[0091]
As described above, according to the present invention, information related to the inclination of the line is extracted and the line image is removed according to this information, so that line images can be removed accurately even when the document or the character string is inclined.
[0092]
As described above, according to the present invention, information related to the position of the line is extracted and the line image is removed according to this information, so that the line image can be removed.
[0093]
As described above, according to the present invention, since it is determined again whether or not the image from which the line image has been removed still includes a line image, the line images can be removed completely even from characters to which multiple line images are added.
[0094]
As described above, according to the present invention, character recognition is performed after the line image is removed, so that the accuracy of character recognition is improved.
[0095]
As described above, according to the present invention, the line image attribute is attached to the characters even after they have been recognized, so that the line information originally added to the document can be reflected when the text is displayed.
[Brief description of the drawings]
FIG. 1 is a block diagram showing the configuration of a character recognition apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart of character recognition processing in the first embodiment.
FIG. 3 is a flowchart of detailed processing of side line block determination according to the first embodiment.
FIG. 4 is a flowchart of detailed processing of image extraction excluding side lines in the first embodiment.
FIG. 5 is a diagram illustrating an example of extracting an image block of an image including a character string to which a side line is added.
FIG. 6 is a diagram for explaining a method for detecting the inclination of a side line.
FIG. 7 is a diagram for explaining how to take a histogram when detecting the position of a side line.
FIG. 8 is a diagram illustrating a method for detecting the width of a side line.
FIG. 9 is an explanatory diagram of image extraction excluding side lines and image segmentation in character units.
FIG. 10 is a flowchart of detailed processing of side line block determination according to the second embodiment.
FIG. 11 is a flowchart of detailed processing of image extraction excluding side lines in the second embodiment.
FIG. 12 is a flowchart of character recognition processing according to the third embodiment.
FIG. 13 is a diagram showing a storage example of a character string to which an attribute of a side line is added.
FIG. 14 is a flowchart of character recognition processing in the fifth embodiment.
FIG. 15 is a flowchart of detailed processing for enclosing frame block determination according to the fifth embodiment.
FIG. 16 is a flowchart of detailed processing of image extraction excluding a surrounding frame in the fifth embodiment.
FIG. 17 is a diagram illustrating an example of extracting an image block of an image including a character string to which a surrounding frame is added.
FIG. 18 is a diagram for explaining a method for detecting the inclination of a surrounding frame.
FIG. 19 is a diagram for explaining how to take a histogram when detecting the line width of an enclosing frame.
FIG. 20 is a diagram for explaining a method for detecting the width of a surrounding frame.
FIG. 21 is an explanatory diagram of extracting an image excluding a surrounding frame and extracting an image in character units.
FIG. 22 is a flowchart of detailed processing for enclosing frame block determination according to the sixth embodiment.
FIG. 23 is a flowchart of detailed processing of image extraction excluding a surrounding frame in the sixth embodiment.
FIG. 24 is a flowchart of character recognition processing according to the seventh embodiment.
FIG. 25 is a diagram showing a storage example of a character string to which an attribute of a frame is added.
FIG. 26 is a flowchart of conventional character recognition processing.
FIG. 27 is a diagram showing an example in which a character image is divided by a histogram from image information where lines and characters are separated from each other.

Claims

An image block extraction step for extracting an image block including a character image from among image blocks cut out by taking a projection from the original image;
A determination step of determining, for each image block extracted in the image block extraction step, whether there is a possibility that a line image is included, based on the width of the image block and the reference character size, and setting an image block determined to possibly include the line image as a target image block;
An inclination calculating step for determining an inclination of the character image in the target image block determined in the determining step;
In the target image block, a histogram calculation step that takes a histogram in a direction along the inclination obtained in the inclination calculation step;
A line image information extraction step for extracting line image information based on the position and width of a peak at which the frequency of the histogram calculated in the histogram calculation step is equal to or greater than a predetermined threshold;
An image processing method comprising: an extraction step of extracting image information obtained by removing the line image from the target image block based on information of the line image extracted by the line image information extraction step.

The image processing method according to claim 1, wherein, where the inclination angle obtained in the inclination calculation step is θ, the histogram calculation step, when taking the histogram in the direction along the inclination in the target image block, takes a histogram of black pixels in the horizontal direction while shifting the position at which the histogram is taken by one pixel in the vertical direction every 1 / tan θ pixels.

The image processing method according to claim 1, wherein the line image includes at least one of a side line and a frame of a surrounding frame.

The image processing method according to claim 1, wherein the line image includes at least one of an underline and an overline.

The image processing method according to claim 1, further comprising a dividing step of dividing the image information extracted in the extracting step into character units.

The image processing method according to claim 1, wherein, in the line image information extraction step, information relating to the width, inclination, and position of the line image is extracted as the information of the line image, and
in the extraction step, image information obtained by removing the line image from the image is extracted based on the extracted information on the width, inclination, and position of the line image.

The image processing method according to claim 1, wherein the processing by the histogram calculation step, the line image information extraction step, and the extraction step is performed again on the image information extracted in the extraction step.

The image processing method according to claim 1, further comprising a character recognition processing step of performing character recognition processing on the image information extracted in the extraction step to obtain a character code as a recognition result.

The image processing method according to claim 8, wherein, in the character recognition step, an attribute related to the line information is added to and stored with the recognition result obtained by performing character recognition processing on the image information from which the line image was removed, in the extraction step, from the image determined to include the line image.

The image processing method according to claim 8, wherein in the character recognition step, a code related to the line information is inserted and stored in a character code string of a recognition result obtained by performing character recognition processing.

The image processing method according to claim 1, wherein, in the inclination calculating step, two small areas are set with a predetermined interval in the image, a projection is obtained in each small area, and the inclination is obtained based on the deviation between the projections of the small areas and the predetermined interval.

Image block extraction means for extracting an image block including a character image from image blocks cut out by taking a projection from the original image;
Determination means for determining, for each image block extracted by the image block extraction means, whether there is a possibility that a line image is included, based on the width of the image block and the reference character size, and setting an image block determined to possibly include the line image as a target image block;
An inclination calculating means for obtaining an inclination of the character image in the target image block determined by the determining means;
Histogram calculation means for taking, in the target image block, a histogram in a direction along the inclination obtained by the inclination calculating means;
Line image information extraction means for extracting line image information based on the position and width of a peak whose histogram frequency calculated by the histogram calculation means is equal to or greater than a predetermined threshold;
An image processing apparatus comprising: extraction means for extracting image information obtained by removing the line image from the target image block based on the line image information extracted by the line image information extraction means.