JP4995507B2 - Image processing system, character recognition system, and image processing program

Info

Publication number: JP4995507B2
Application number: JP2006205784A
Authority: JP
Inventors: 明廣瀬; 純平小山; 雅弘加藤
Original assignee: Fuji Xerox Co Ltd; University of Tokyo NUC; Fujifilm Business Innovation Corp
Current assignee: University of Tokyo NUC; Fujifilm Business Innovation Corp
Priority date: 2006-07-28
Filing date: 2006-07-28
Publication date: 2012-08-08
Anticipated expiration: 2026-07-28
Also published as: JP2008033604A

Description

The present invention relates to an image processing system, a character recognition system, and an image processing program.

OCR (Optical Character Recognition) is used as a technique for extracting character information by scanning a paper document. Characters can be classified into handwritten characters and printed characters, and a dedicated OCR method must be used for each to obtain high recognition accuracy. In particular, recognizing handwritten characters requires the areas containing characters to be designated in advance, so its use has been limited to fixed-format documents such as forms.
In contrast, many of the documents people handle contain a mixture of printed and handwritten characters. Application forms and questionnaires are typical examples. To fully automate the process from scanning to character information extraction, printed characters and handwritten characters must be distinguished automatically.
To achieve this, conventional techniques accurately cut out each character from the image, create a circumscribed rectangle for each cut-out character, and measure the lengths of its vertical and horizontal sides. They then distinguish handwritten from printed characters by evaluating quantities such as the aspect ratio, the variance of the side lengths, and the variation in the distance between circumscribed rectangles.
However, these methods presuppose that a circumscribed rectangle can reliably be created for each printed character and that the characters are evenly arranged. In practice this premise is often not satisfied, so these criteria frequently lead to misjudgments. These methods are also strongly language dependent.

As related techniques, for example, Patent Document 1 focuses on the fact that the pitch of handwritten characters varies more than that of printed characters, and discloses a technique that cuts out character lines and examines the pitch variation within each cut-out line.
Patent Document 2 relates to determination based on the aspect ratio of the circumscribed rectangle of a character, and discloses a technique that makes the determination from the arrangement pattern of black pixels adjacent to white pixels (horizontal/vertical pixel ratio, curved/straight pixel ratio, and so on).
Patent Document 3 discloses a technique that obtains the distance between the feature parameters of a character and those obtained from a dictionary, and makes the determination from the average variation of that distance.
Patent Document 4 focuses on the fact that handwritten characters contain relatively few line segments extending in the horizontal direction, and discloses a technique that makes the determination from the proportion of horizontal-component pixels among the pixels constituting a character.
Patent Document 5 discloses a technique that makes a comprehensive determination based on the variance of various feature amounts (character height, pitch, area, and so on).
Patent Document 6 discloses a technique that determines whether the centroids of the labels in a character line lie on an approximately straight line; if they do, determines whether the long-side lengths of the circumscribed rectangles of the labels are approximately equal; and, if they are, judges the line to be a printed character string.
Patent Document 7 relates to the extraction of texture regions. It focuses on the observation that in a texture image the spatial frequencies with dominant power are scattered regularly in the spatial frequency domain, whereas the power spectrum of a non-texture image shows no such regularity. It discloses a technique that, moving outward from an already extracted texture region, examines the change in an image feature amount of the texture image, judges the place where the change becomes large to be an edge, and adopts a one-dimensional power spectrum as the image feature amount.
Patent Document 8 relates to the determination of character regions. It focuses on the observation that the surroundings of a character are usually plain so that the character is easy to read, whereas a background pattern that is not a character has no such tendency. It discloses a technique that divides the image into blocks of a predetermined size, applies an orthogonal transform to obtain the spatial frequency distribution, and selects blocks whose power exceeds a predetermined threshold as character region candidates. A candidate is judged to be a character region if, for the blocks in its surrounding neighborhood, the power in the high spatial frequency region is below a predetermined threshold and the powers in the low spatial frequency region are equal to one another.
Patent Document 1: JP 57-111679 A
Patent Document 2: JP 58-037775 A
Patent Document 3: JP 60-138692 A
Patent Document 4: JP 63-103389 A
Patent Document 5: JP 01-014682 A
Patent Document 6: JP 2000-339404 A
Patent Document 7: JP 07-121699 A
Patent Document 8: JP 09-186861 A

The present invention has been made against this background. Its object is to provide an image processing system, a character recognition system, and an image processing program that can determine handwritten character regions or printed character regions in an image for which it is not known which regions are handwritten and which are printed.

The gist of the present invention for achieving this object lies in the inventions of the following items.
[1] An image processing system comprising:
feature extraction means for extracting a predetermined feature from an image;
texture space generation means for generating, according to the feature extracted by the feature extraction means, a texture space in which texture information of each pixel or each region is mapped;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace that is a handwritten character region or a subspace that is a printed character region; and
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result.

[2] An image processing system comprising:
feature extraction means for extracting a predetermined feature from an image;
texture space generation means for generating, according to the feature extracted by the feature extraction means, a texture space in which texture information of each pixel or each region is mapped;
index setting means for setting, in the texture space generated by the texture space generation means, an index representing the likelihood of a handwritten character region or of a printed character region;
subspace estimation means for estimating, in the texture space generated by the texture space generation means and according to the index set by the index setting means, a subspace that is a handwritten character region or a subspace that is a printed character region; and
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result.

[3] An image processing system comprising:
feature extraction means for extracting a predetermined feature from an image;
texture space generation means for generating, according to the feature extracted by the feature extraction means, a texture space in which texture information of each pixel or each region is mapped;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace that is a handwritten character region or a subspace that is a printed character region;
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means; and
layout analysis means for analyzing a layout from the image using the handwritten character region or the printed character region determined by the region determination means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result.

[4] An image processing system comprising:
feature extraction means for extracting a predetermined feature from an image;
texture space generation means for generating, according to the feature extracted by the feature extraction means, a texture space in which texture information of each pixel or each region is mapped;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace that is a handwritten character region or a subspace that is a printed character region;
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means; and
layout analysis means for analyzing a layout from the image,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result, and
the region determination means determines the handwritten character region or the printed character region also using the layout analyzed by the layout analysis means.

[5] A character recognition system comprising handwritten character recognition means or printed character recognition means for recognizing the handwritten character region or the printed character region determined by the region determination means of the image processing system according to [1], [2], [3] or [4].

[6] The character recognition system according to [5], wherein the region determination means determines the handwritten character region or the printed character region also using the likelihood of the characters recognized by the handwritten character recognition means or of the characters recognized by the printed character recognition means.

[7] An image processing program causing a computer to realize:
a feature extraction function for extracting a predetermined feature from an image;
a texture space generation function for generating, according to the feature extracted by the feature extraction function, a texture space in which texture information of each pixel or each region is mapped;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function, a subspace that is a handwritten character region or a subspace that is a printed character region; and
a region determination function for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation function,
wherein the feature amount extracted by the feature extraction function is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result.

[8] An image processing program causing a computer to realize:
a feature extraction function for extracting a predetermined feature from an image;
a texture space generation function for generating, according to the feature extracted by the feature extraction function, a texture space in which texture information of each pixel or each region is mapped;
an index setting function for setting, in the texture space generated by the texture space generation function, an index representing the likelihood of a handwritten character region or of a printed character region;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function and according to the index set by the index setting function, a subspace that is a handwritten character region or a subspace that is a printed character region; and
a region determination function for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation function,
wherein the feature amount extracted by the feature extraction function is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature, and
is a sum of spectral intensities weighted by the distance from a predetermined axis in the frequency plane of a two-dimensional Fourier transform result.

Compared with a case where the present configuration is not provided, the image processing system, character recognition system, and image processing program according to the present invention can achieve any of the following effects.
(1) Conventional character segmentation is unnecessary. Because individual characters do not have to be cut out accurately from the image, the determination does not depend on character segmentation, which was the factor that made the determination difficult.
(2) Empirical data is not necessarily required. For example, neither prior training using a neural network nor data prepared in advance, such as templates, is needed.
(3) Language dependency is low. Because the method handles the fluctuation that arises when a person writes, it does not need to handle the characteristics of each language and works regardless of the language.

First, an outline of the present embodiment will be described.
The present embodiment focuses on the observation that, regardless of language, the line segments that make up handwritten characters show more "fluctuation" than the line segments of printed characters. A feature amount that accurately expresses this fluctuation has also been devised. The feature amount is obtained for each pixel or each region of the image, and more than one feature amount may be used. A texture information space is generated with the obtained feature amounts as its texture elements. In this space, a boundary is found based on a predetermined index, and the texture information space is divided into a plurality of subspaces along that boundary. The region of the actual image corresponding to the subspace considered to represent handwritten characters is determined to be a handwritten character region. Alternatively, the region of the actual image corresponding to the subspace considered to represent printed characters is determined to be a printed character region.
According to the present embodiment, individual characters do not have to be cut out accurately, so printed character regions and handwritten character regions can be determined without depending on character segmentation, which was the factor that made the determination vary. Empirical data is not necessarily required either: neither prior training using a neural network nor data prepared in advance, such as templates, is needed. Furthermore, because the method focuses on the "line segment fluctuation" inherent in handwriting, it can handle a variety of languages (scripts).
Note that a texture is a grain or surface pattern that is quantified by statistical indices of pixel values. It refers, for example, to a pattern that is not completely periodic but is formed by repeated arrangement under some statistical property.
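
The overview above can be sketched as follows. This is only an illustration under stated assumptions, not the patented implementation: it assumes a one-dimensional texture space (one feature value per region), and it assumes between-class variance (Otsu's criterion) as the "predetermined index" used to place the boundary, since the text does not name the index. The function names `split_texture_space` and `label_regions` and the sample feature values are hypothetical.

```python
from statistics import mean

def split_texture_space(features):
    """Return the boundary that best splits 1-D feature values into two
    subspaces, using between-class variance as the (assumed) index."""
    values = sorted(set(features))
    if len(values) < 2:
        raise ValueError("need at least two distinct feature values")
    best_boundary, best_score = None, -1.0
    # Candidate boundaries are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        boundary = (lo + hi) / 2.0
        low = [f for f in features if f <= boundary]
        high = [f for f in features if f > boundary]
        w_low = len(low) / len(features)
        w_high = len(high) / len(features)
        score = w_low * w_high * (mean(low) - mean(high)) ** 2
        if score > best_score:
            best_boundary, best_score = boundary, score
    return best_boundary

def label_regions(region_features):
    """Map each region of the actual image back from the texture space.
    Assumption: a larger feature value means more fluctuation, so the
    upper subspace is taken to be handwriting."""
    boundary = split_texture_space(list(region_features.values()))
    return {region: ("handwritten" if f > boundary else "printed")
            for region, f in region_features.items()}

# Hypothetical feature values for four regions, keyed by (row, col).
regions = {(0, 0): 0.10, (0, 1): 0.12, (1, 0): 0.85, (1, 1): 0.90}
labels = label_regions(regions)
```

With several feature amounts per region the texture space becomes multi-dimensional, and the boundary search would be replaced by some form of clustering; that choice is likewise left open by the text.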

A document of the kind targeted by the present embodiment, in which printed and handwritten characters are mixed, will be described with reference to FIG. 2. Many business documents have a fixed form that is filled in by hand; the quotation reply shown in FIG. 2 is one example. In this document the items to be filled in, such as "quantity / unit", "unit price excluding tax (yen)", and "total amount excluding tax (yen)", are determined in advance, and the fields are filled in with handwritten characters 21. To a human reader, the difference between the printed and the handwritten characters is obvious at a glance. The reason is character fluctuation: regardless of language, handwritten characters show a great deal of fluctuation caused by the act of writing. Character fluctuation refers to the deviation of a line segment from a straight line or smooth curve, non-uniformity in character size, misalignment of the characters, and the like.
The present embodiment therefore exploits the fact that printed characters contain many straight line segments, whereas handwritten characters contain few.

The outline of the processing procedure of the present embodiment will be described with reference to FIG. 3. The procedure consists of two modules. The first is a Fourier transform module 31, which performs a two-dimensional Fourier transform to analyze the texture of the input image (document data). The second is a handwritten character degree evaluation module 32, which calculates an evaluation amount of fluctuation from the frequency domain image (texture feature amount) output by the Fourier transform module 31 and, according to that evaluation amount, determines whether the characters are handwritten or printed.
As shown in FIG. 3, the Fourier transform module 31 receives an input image 33 and creates a frequency domain image (texture feature amount) 34. The handwritten character degree evaluation module 32 then receives the frequency domain image (texture feature amount) 34 and finally outputs the handwritten character degree (evaluation value) E 35.

The Fourier transform module 31 applies a two-dimensional Fourier transform to the image (document image data) to obtain a frequency domain image of the original data. If, for example, a character image is regarded as a superposition of waves of various frequencies, the frequency domain image is the image that shows the spectral intensity of each of the frequencies making up that character.
The defining formula of the two-dimensional Fourier transform is given in Equation 1.
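
Equation 1 itself is not reproduced in this text. For reference, a common form of the discrete two-dimensional Fourier transform of an N × N image f(x, y) is shown below. The normalization and sign convention are assumptions and may differ from those of the patent's Equation 1.

```latex
F(u, v) = \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y)\, e^{-j 2\pi (ux + vy)/N},
\qquad \text{spectral intensity at } (u, v):\ |F(u, v)|
```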

For example, the image data of the input image 33 and the frequency domain image (texture feature amount) 34 are both 64 × 64 pixels in size. In the image data, the origin is at the lower left of the image, the horizontal axis is the x-axis, and the vertical axis is the y-axis. In the frequency domain image, the spectrum of the input image in the x-axis direction is plotted on the u-axis and the spectrum in the y-axis direction on the v-axis, and the image is centered on the origin, where both frequencies are zero. The range of spectral intensities obtained by applying the two-dimensional Fourier transform to the image data is rendered as a grayscale bitmap image in which high spectral intensity appears black and low spectral intensity appears white (see the frequency domain image (texture feature amount) 34 in FIG. 3).
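
As a rough illustration of this step, the sketch below computes a centered spectral-intensity image with a naive discrete Fourier transform, using only the Python standard library. It is a sketch under assumptions: the function name `centered_spectrum` is hypothetical, an 8 × 8 image is used instead of 64 × 64 to keep the naive transform fast, and the row index is used as y without flipping to a lower-left origin (the flip does not change the intensities). A real system would use an FFT routine instead.

```python
import cmath

def centered_spectrum(image):
    """Naive 2-D DFT magnitude of a square image given as a list of rows.
    The result is indexed [v][u], with zero frequency at index n // 2
    on both axes (the center of the frequency domain image)."""
    n = len(image)
    half = n // 2
    spectrum = [[0.0] * n for _ in range(n)]
    for vi in range(n):
        for ui in range(n):
            u, v = ui - half, vi - half          # signed frequencies
            total = 0j
            for y in range(n):
                for x in range(n):
                    if image[y][x]:
                        phase = -2j * cmath.pi * (u * x + v * y) / n
                        total += image[y][x] * cmath.exp(phase)
            spectrum[vi][ui] = abs(total)        # spectral intensity
    return spectrum

# 8 x 8 test image containing one perfectly horizontal stroke.
N = 8
image = [[0] * N for _ in range(N)]
image[3] = [1] * N
spec = centered_spectrum(image)
```

For this perfectly horizontal stroke, all of the spectral intensity falls in the column u = 0, that is, on the v-axis, and every other entry is zero up to rounding error.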

The handwritten character degree evaluation module 32 calculates an evaluation amount of fluctuation from the frequency domain image and, according to that evaluation amount, distinguishes handwritten characters from printed characters.
The fluctuation peculiar to handwritten characters is evaluated as follows.
For example, when a person draws a horizontal straight line, the result is not a perfectly straight line but a line that contains fluctuation. For a perfectly horizontal straight line, the change in spectral intensity appears on the v-axis of the frequency domain image. When fluctuation is present, the change in spectral intensity appears not only on the v-axis but also spreads into its surroundings. The description here mainly targets kanji, which consist largely of horizontal and vertical line components, and explains how to extract the components of the image or document data other than the vertical and horizontal straight-line components. An example is shown in Equation 2.

E is called the handwritten character degree.
In this extraction method, the spectral intensities of the pixels around the axes of the frequency domain image obtained by the two-dimensional Fourier transform are excluded (their weights are set to 0), and the sum of the weighted intensities is taken. Excluding the intensities of the pixels around the axes means excluding the spectral intensity that originates from the horizontal and vertical straight-line components of the original character image data. What remains therefore captures the spread (non-concentration) of spectral intensity caused by the fluctuation of handwritten characters. This makes it possible to distinguish handwritten kanji from printed kanji.
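
Equation 2 is not reproduced in this text, so the following is only a sketch of one weighting that is consistent with the description: each spectral intensity is weighted by its distance to the nearer of the u- and v-axes, and the weight is 0 within a band of width `margin` around the axes. The exact weight function, the `margin` parameter, and the names `centered_spectrum` and `handwriting_degree` are assumptions made for illustration.

```python
import cmath

def centered_spectrum(image):
    """Naive 2-D DFT magnitude, indexed [v][u], zero frequency at n // 2."""
    n = len(image)
    half = n // 2
    return [[abs(sum(image[y][x]
                     * cmath.exp(-2j * cmath.pi * ((ui - half) * x + (vi - half) * y) / n)
                     for y in range(n) for x in range(n)))
             for ui in range(n)]
            for vi in range(n)]

def handwriting_degree(spectrum, margin=0):
    """E = sum over (u, v) of w(u, v) * |F(u, v)|, where w is 0 within
    `margin` pixels of the u- or v-axis and otherwise equals the distance
    to the nearer axis (an assumed weight function)."""
    half = len(spectrum) // 2
    e = 0.0
    for vi, row in enumerate(spectrum):
        for ui, intensity in enumerate(row):
            dist = min(abs(ui - half), abs(vi - half))
            weight = dist if dist > margin else 0
            e += weight * intensity
    return e

N = 8
straight = [[0] * N for _ in range(N)]
straight[3] = [1] * N                       # perfectly horizontal stroke

wobbly = [[0] * N for _ in range(N)]
for x, y in enumerate([3, 3, 4, 3, 2, 3, 4, 3]):
    wobbly[y][x] = 1                        # same stroke with vertical jitter

e_straight = handwriting_degree(centered_spectrum(straight))
e_wobbly = handwriting_degree(centered_spectrum(wobbly))
```

In this toy case the straight stroke yields an E of essentially zero, because all of its energy lies on an axis, while the jittered stroke spreads energy off the axes and yields a clearly positive E. How E is then thresholded or placed in the texture space is a separate design choice.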

FIG. 4 shows examples of applying the two-dimensional Fourier transform to printed and handwritten characters. FIG. 4(A) shows the result (right side of FIG. 4(A)) of applying the two-dimensional Fourier transform to the printed character 富 (left side of FIG. 4(A)). FIG. 4(C) shows the result (right side of FIG. 4(C)) of applying the two-dimensional Fourier transform to the handwritten character 富 (left side of FIG. 4(C)). These show that for the printed character the spectrum is concentrated on the u- and v-axes, whereas for the handwritten character the spectrum is dispersed to positions away from the u- and v-axes.
The character 富 consists of vertical and horizontal line segments, but similar results are obtained for characters that contain diagonal line segments, such as 研 (see FIG. 4(B) and FIG. 4(D)).

FIG. 1 is a conceptual module configuration diagram of one embodiment.
A module generally refers to a logically separable component such as software or hardware. Accordingly, a module in the present embodiment means not only a module in a program but also a module in a hardware configuration, and the present embodiment therefore also serves as a description of a program, a system, and a method. Modules correspond almost one-to-one to functions, but in an implementation one module may consist of one program, a plurality of modules may consist of one program, or conversely one module may consist of a plurality of programs. A plurality of modules may be executed by one computer, or one module may be executed by a plurality of computers in a distributed or parallel environment. In the following, "connection" includes logical connection as well as physical connection.
A system includes a configuration in which a plurality of computers, hardware units, devices, and the like are connected by a network or the like, and also includes a configuration realized by a single computer.

As shown in FIG. 1, this embodiment has an image inclination correction module 101, a resolution conversion module 102, a handwritten character feature extraction module 103, a texture space generation module 104, a subspace estimation module 105, a character type determination module 106, an image binarization module 107, a layout analysis module 108, a handwritten character recognition module 109, and a printed character recognition module 110. In FIG. 1, the flow of images is indicated by black arrows and the flow of information by arrows. Specifically, images are passed from the image inclination correction module 101 to the resolution conversion module 102, from the image inclination correction module 101 to the image binarization module 107, from the resolution conversion module 102 to the handwritten character feature extraction module 103, from the texture space generation module 104 to the subspace estimation module 105, and from the image binarization module 107 to the layout analysis module 108, the handwritten character recognition module 109, and the printed character recognition module 110. Information is passed between the other modules.

The image inclination correction module 101 is connected to the resolution conversion module 102 and the image binarization module 107. It receives document image data, detects the inclination of the document image, and corrects the inclination by an affine transformation. When a document is input with a scanner, some inclination generally occurs. Because the inclination adversely affects later processing, it is corrected as preprocessing. One way to detect the inclination is, for example, to extract a straight line in the document and, if its inclination is slight, rotate the whole image in the direction that makes the line horizontal or vertical. The input image here is, for example, a color image in which one pixel is represented by 24 bits. The image after inclination correction is passed to the resolution conversion module 102 and the image binarization module 107.
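The inclination correction described above can be sketched as follows, assuming that a straight line has already been extracted and is given by two endpoints. The function names and the choice to rotate about an arbitrary centre are illustrative assumptions; the text only states that an affine transformation is used.

```python
import math

def inclination_angle(p0, p1):
    """Angle in radians of the line p0 -> p1 relative to the horizontal axis."""
    return math.atan2(p1[1] - p0[1], p1[0] - p0[0])

def correction_matrix(center, angle):
    """2x3 affine matrix that rotates by -angle about center, undoing the inclination."""
    c, s = math.cos(-angle), math.sin(-angle)
    cx, cy = center
    return [[c, -s, cx - c * cx + s * cy],
            [s,  c, cy - s * cx - c * cy]]

def apply_affine(m, p):
    """Apply a 2x3 affine matrix to a point (x, y)."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])

# A ruled line that should be horizontal but rises 2 pixels over 100.
angle = inclination_angle((0, 0), (100, 2))
matrix = correction_matrix((0, 0), angle)
corrected_end = apply_affine(matrix, (100, 2))
```

Applying the same matrix to every pixel coordinate (with resampling) would rotate the whole image; the sketch only transforms the line endpoint to show that it lands on the horizontal axis.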

The resolution conversion module 102 is connected to the image inclination correction module 101 and the handwritten character feature extraction module 103. It receives document image data from the image inclination correction module 101, converts the resolution, and passes the resulting image data to the handwritten character feature extraction module 103. The resolution conversion (enlargement or reduction) is performed to normalize the target image, so that image data with a resolution that is necessary and sufficient for the feature extraction in the handwritten character feature extraction module 103 is passed on.
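As a rough sketch of such a normalization, the following resamples an image to a target resolution with nearest-neighbour interpolation. The interpolation method, the function names, and the dpi values are assumptions for illustration; the text does not specify how the enlargement or reduction is carried out.

```python
def resize_nearest(img, new_w, new_h):
    """Resample a 2D image (list of rows) to new_w x new_h by nearest neighbour."""
    h, w = len(img), len(img[0])
    return [[img[min(h - 1, y * h // new_h)][min(w - 1, x * w // new_w)]
             for x in range(new_w)]
            for y in range(new_h)]

def normalize_resolution(img, src_dpi, dst_dpi):
    """Scale an image scanned at src_dpi so that it corresponds to dst_dpi."""
    h, w = len(img), len(img[0])
    new_w = max(1, round(w * dst_dpi / src_dpi))
    new_h = max(1, round(h * dst_dpi / src_dpi))
    return resize_nearest(img, new_w, new_h)

small = [[1, 2],
         [3, 4]]
enlarged = normalize_resolution(small, src_dpi=100, dst_dpi=200)
```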

The handwritten character feature extraction module 103 is connected to the resolution conversion module 102 and the texture space generation module 104. It receives image data from the resolution conversion module 102, extracts a feature amount representing the fluctuation of handwritten characters (specifically, the handwritten character degree E calculated by Equation 2 above), and passes it to the texture space generation module 104.

The texture space generation module 104 is connected to the handwritten character feature extraction module 103 and the subspace estimation module 105. It receives the feature amount of handwritten characters from the handwritten character feature extraction module 103, generates a texture space, and passes that texture space to the subspace estimation module 105. Here, the texture space generation module 104 maps the texture information of each pixel or each area into the texture space according to the feature extracted by the handwritten character feature extraction module 103. The texture space in which the information of each pixel or each area has been mapped is then passed to the subspace estimation module 105.

The subspace estimation module 105 is connected to the texture space generation module 104 and the character type determination module 106. It receives from the texture space generation module 104 the texture space in which the information of each pixel or each area has been mapped, and estimates the subspace that corresponds to handwritten character areas. Estimation here does not mean a human mental process; it means that this embodiment provisionally determines that something is highly likely to be the subspace of a handwritten character area. For this estimation, an index representing how likely an area is to be a handwritten character area is first set, and the subspace corresponding to handwritten character areas is then estimated, according to that index, in the texture space generated by the texture space generation module 104. Information on the range estimated to be highly likely to be a handwritten character area is passed to the character type determination module 106.
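One possible reading of the texture space and subspace estimation is sketched below. It assumes that the image is divided into square blocks, that a caller-supplied feature function returns one value per block (standing in for the handwritten character degree E), and that the index is a simple threshold on that value. All names, the block layout, and the threshold rule are hypothetical.

```python
def build_texture_space(img, block, feature):
    """Map each block of the image to a feature value.

    Keys are (x, y, width, height) rectangles in image coordinates, so the
    estimated subspace can later be related back to areas of the real image.
    """
    h, w = len(img), len(img[0])
    space = {}
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = [row[bx:bx + block] for row in img[by:by + block]]
            key = (bx, by, min(block, w - bx), min(block, h - by))
            space[key] = feature(tile)
    return space

def estimate_handwriting_subspace(space, index_threshold):
    """Provisionally select the blocks whose index reaches the threshold."""
    return sorted(area for area, value in space.items() if value >= index_threshold)

def mean_value(tile):
    """Placeholder feature used only for this example."""
    return sum(map(sum, tile)) / (len(tile) * len(tile[0]))

image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 1]]
space = build_texture_space(image, block=2, feature=mean_value)
candidates = estimate_handwriting_subspace(space, index_threshold=0.5)
```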

The image binarization module 107 is connected to the image inclination correction module 101, the layout analysis module 108, the handwritten character recognition module 109, and the printed character recognition module 110. It binarizes the document image data passed from the image inclination correction module 101, because a binary image rather than a multi-valued image is suitable for the processing in the layout analysis module 108, the handwritten character recognition module 109, and the printed character recognition module 110. Binarization sets a threshold and classifies each pixel as black or white; the threshold may also be determined adaptively. The binarized image (one pixel represented by one bit) is passed to the layout analysis module 108, the handwritten character recognition module 109, and the printed character recognition module 110.
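A minimal sketch of the binarization step follows. It converts 24-bit RGB pixels to gray levels, and uses Otsu's method as one example of an adaptively determined threshold. Otsu's method and the luminance weights are choices made for this illustration; the text only says that the threshold may be set adaptively.

```python
def to_gray(rgb_img):
    """Convert rows of (r, g, b) tuples, 8 bits per channel, to gray levels 0..255."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in rgb_img]

def otsu_threshold(gray):
    """Pick the threshold that maximises the between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in gray:
        for p in row:
            hist[p] += 1
    total = sum(hist)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of their gray levels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(gray, threshold):
    """Return a 1-bit image: 1 for black (dark) pixels, 0 for white pixels."""
    return [[1 if p <= threshold else 0 for p in row] for row in gray]

gray = [[10, 12, 200],
        [11, 210, 205]]
t = otsu_threshold(gray)
binary = binarize(gray, t)
```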

The layout analysis module 108 is connected to the image binarization module 107, the character type determination module 106, the handwritten character recognition module 109, and the printed character recognition module 110. It receives the binarized image from the image binarization module 107, performs document layout analysis on it, and passes the result to the handwritten character recognition module 109, the printed character recognition module 110, or the character type determination module 106. Layout analysis relies on the fact that a document has a certain structure, that is, it can be divided into character areas, graphic areas, image areas, and so on, and estimates the approximate content from the appearance of titles, paragraphs, notes, and the like within them. An area judged to be a character area by the layout analysis is passed to the handwritten character recognition module 109 or the printed character recognition module 110. In addition, passing information on the areas judged to be character areas to the layout analysis module 108 allows it to be used as a reference for the character type processing by the character type determination module 106.
Conversely, character type information for an area can be obtained from the character type determination module 106 and used as a reference for the layout analysis. For example, when layout analysis alone gives an area no high likelihood of being a character area but cannot clearly rule it out either, and the character type determination module 106 finds that the area is likely to be a handwritten character area, the area may be judged to be a character area.

The handwritten character recognition module 109 is connected to the image binarization module 107, the layout analysis module 108, and the character type determination module 106. It receives the binarized image from the image binarization module 107, the coordinates of character areas from the layout analysis module 108, and the coordinates of handwritten character areas from the character type determination module 106. It then performs character recognition suited to handwritten characters on those areas and outputs the result (text codes) as handwritten character information. The certainty (reliability) of the character recognition result may also be output to the layout analysis module 108 or the character type determination module 106. For example, the character reliability value described in Japanese Patent No. 2991779 can be applied.

The printed character recognition module 110 is connected to the image binarization module 107, the layout analysis module 108, and the character type determination module 106. It receives the binarized image from the image binarization module 107, the coordinates of character areas from the layout analysis module 108, and the coordinates of printed character areas from the character type determination module 106. It then performs character recognition suited to printed characters on those areas and outputs the result (text codes) as printed character information. The certainty (reliability) of the character recognition result may also be output to the layout analysis module 108 or the character type determination module 106.

The character type determination module 106 is connected to the subspace estimation module 105, the layout analysis module 108, the handwritten character recognition module 109, and the printed character recognition module 110. It receives from the subspace estimation module 105 the information on the range estimated to be highly likely to be a handwritten character area, receives the coordinates of character areas from the layout analysis module 108, and determines whether each area is a handwritten character area or a printed character area. The module may also receive the certainty of the character recognition results from the handwritten character recognition module 109 and the printed character recognition module 110, and use this information to revise the threshold for deciding between handwritten and printed characters.
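The decision made by the character type determination module can be sketched as an overlap test between a character area from layout analysis and the ranges estimated as handwritten. The rectangle format (x, y, width, height), the coverage ratio, and the default threshold of 0.5 are assumptions for illustration, and the estimated ranges are assumed not to overlap one another.

```python
def intersection_area(a, b):
    """Area of overlap between two rectangles given as (x, y, width, height)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    w = min(ax + aw, bx + bw) - max(ax, bx)
    h = min(ay + ah, by + bh) - max(ay, by)
    return max(0, w) * max(0, h)

def determine_character_type(char_area, handwritten_ranges, threshold=0.5):
    """Label a character area by how much of it the handwritten ranges cover."""
    covered = sum(intersection_area(char_area, r) for r in handwritten_ranges)
    ratio = covered / (char_area[2] * char_area[3])
    return "handwritten" if ratio >= threshold else "printed"

area = (0, 0, 10, 10)
label_a = determine_character_type(area, [(0, 0, 10, 6)])
label_b = determine_character_type(area, [(0, 0, 10, 4)])
```

Raising or lowering `threshold` in response to low recognition certainty is one way the revision mentioned in the text could be realized.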

A hardware configuration example of this embodiment is described with reference to FIG. 5. The configuration shown in FIG. 5 is an image processing system built from, for example, a personal computer (PC), and is a hardware configuration example that includes a data reading unit 417 such as a scanner and a data output unit 418 such as a printer.

The CPU (Central Processing Unit) 401 is a control unit that executes processing according to a computer program describing the execution sequence of the modules described in the above embodiment, namely the image inclination correction module 101, the resolution conversion module 102, the handwritten character feature extraction module 103, the texture space generation module 104, and so on.

The ROM (Read Only Memory) 402 stores programs, operation parameters, and the like used by the CPU 401. The RAM (Random Access Memory) 403 stores programs used in execution by the CPU 401 and parameters that change as appropriate during that execution. These are connected to one another by a host bus 404 consisting of a CPU bus or the like.

The host bus 404 is connected via a bridge 405 to an external bus 406 such as a PCI (Peripheral Component Interconnect/Interface) bus.

The keyboard 408 and the pointing device 409, such as a mouse, are input devices operated by an operator. The display 410 consists of a liquid crystal display, a CRT (Cathode Ray Tube), or the like, and displays various kinds of information as text or image information.

The HDD (Hard Disk Drive) 411 contains a hard disk, drives it, and records or reproduces programs executed by the CPU 401 and information. The hard disk stores input images, binarized images, feature amounts, character recognition result data, and the like. It also stores various computer programs such as other data processing programs.

The drive 412 reads data or programs recorded on a mounted removable recording medium 413, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and supplies the data or programs to the RAM 403, which is connected via the interface 407, the external bus 406, the bridge 405, and the host bus 404. The removable recording medium 413 can also be used as a data recording area in the same way as the hard disk.

The connection port 414 is a port for connecting an external connection device 415 and has a connection unit such as USB or IEEE 1394. The connection port 414 is connected to the CPU 401 and other components via the interface 407, the external bus 406, the bridge 405, the host bus 404, and so on. The communication unit 416 is connected to a network and performs data communication with the outside. The data reading unit 417 is, for example, a scanner and performs document reading. The data output unit 418 is, for example, a printer and performs document data output.

The hardware configuration shown in FIG. 5 is only one example; this embodiment is not limited to the configuration in FIG. 5, and any configuration that can execute the modules described in this embodiment will do. For example, some modules may be implemented in dedicated hardware (such as an ASIC), some modules may reside in an external system and be connected by a communication line, and several of the systems shown in FIG. 5 may be connected to one another by communication lines and operate in cooperation. The embodiment may also be built into a copier, a fax machine, a scanner, a printer, a multifunction machine (also called a multifunction copier, which has the functions of a scanner, printer, copier, fax machine, and so on), or the like.

Next, the operation is described.
The processing for recognizing a document image is described with reference to FIG. 6.
In step S601, the image inclination correction module 101 corrects the inclination of the input document image. In addition to inclination correction, other image processing that serves as preprocessing for character recognition, such as noise removal, may be performed.
In step S602, the resolution conversion module 102 converts the resolution of the document image corrected in step S601.
In step S603, the handwritten character feature extraction module 103 extracts, from the document image whose resolution was converted in step S602, the fluctuation feature amount that characterizes handwritten characters.

In step S604, the texture space generation module 104 generates a texture space using the feature amount extracted in step S603.
In step S605, the subspace estimation module 105 estimates, within the texture space generated in step S604, the subspace that corresponds to handwritten character areas.
In step S606, the character type determination module 106 determines the handwritten character area estimated in step S605. If the result is later No in step S611 or No in step S614, this processing is repeated. In that case, the handwritten character area estimated in step S605 is revised and determined using also the result of the layout analysis in step S608 and the certainty of the character recognition result (steps S611 and S614).

In step S607, the image binarization module 107 binarizes the document image corrected in step S601.
In step S608, the layout analysis module 108 receives the image binarized in step S607 and analyzes the layout of the whole image. The character type determined in step S606 is also used in this layout analysis, which makes the analysis more accurate than layout analysis alone.
In step S609, for a character area found by the layout analysis in step S608, it is determined whether the area is a handwritten character area (the determination may instead be whether it is a printed character area). If Yes, the process proceeds to step S610; if No, it proceeds to step S613.

In step S610, the handwritten character recognition module 109 performs handwritten character recognition on the image of the area judged to be a handwritten character area as a result of the layout analysis in step S608.
In step S611, it is determined whether the certainty of the handwritten character recognition result from step S610 is equal to or greater than a predetermined threshold. If Yes, the process proceeds to step S612; if No, it returns to step S606.
In step S612, it is determined whether character recognition has been performed on the characters in all character areas found by the layout analysis in step S608. If Yes, the process ends; if No, it returns to step S609.

In step S613, the printed character recognition module 110 performs printed character recognition on the image of the area judged to be a printed character area as a result of the layout analysis in step S608.
In step S614, it is determined whether the certainty of the printed character recognition result from step S613 is equal to or greater than a predetermined threshold. If Yes, the process proceeds to step S615; if No, it returns to step S606.
In step S615, it is determined whether character recognition has been performed on the characters in all character areas found by the layout analysis in step S608. If Yes, the process ends; if No, it returns to step S609.
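The branching in steps S609 to S615 can be sketched as a loop over character areas with a retry when the recognition certainty is too low. The classifier and the two recognizers are passed in as callables; the retry limit and the way the decision threshold is shifted on a retry are assumptions added so that the sketch terminates, and are not stated in the flowchart description.

```python
def recognize_document(areas, classify, recognize_handwritten, recognize_printed,
                       min_certainty, max_retries=3, shift_step=0.1):
    """Recognize every character area, revising the type decision on low certainty."""
    results = []
    for area in areas:                              # repeated until S612 / S615 say "all done"
        shift = 0.0                                 # hypothetical threshold revision
        for _ in range(max_retries + 1):
            kind = classify(area, shift)            # S606 / S609: handwritten or printed?
            if kind == "handwritten":
                text, certainty = recognize_handwritten(area)   # S610
            else:
                text, certainty = recognize_printed(area)       # S613
            if certainty >= min_certainty:          # S611 / S614: Yes
                break
            # S611 / S614: No, so return to S606 with a revised decision threshold
            shift += shift_step if kind == "handwritten" else -shift_step
        results.append((area, kind, text, certainty))
    return results

# Stub components for demonstration only.
scores = {"A": 0.9, "B": 0.55}                      # hypothetical handwriting indices

def classify(area, shift):
    return "handwritten" if scores[area] >= 0.5 + shift else "printed"

def recognize_handwritten(area):
    return ("hw-text", 0.95 if area == "A" else 0.2)

def recognize_printed(area):
    return ("pr-text", 0.9)

results = recognize_document(["A", "B"], classify, recognize_handwritten,
                             recognize_printed, min_certainty=0.6)
```

In this run, area "B" is first treated as handwritten, recognized with low certainty, and then reclassified as printed on the second pass.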

The embodiment above used kanji as examples, but characters of other languages may also be used.
The embodiment above also showed that character fluctuation is used mainly to identify handwritten character areas. Conversely, little fluctuation means that the characters are printed. That is, when the handwritten character degree calculated for an area judged to be a character area is low, the area can be judged to be a printed character area. Printed character areas can therefore be determined in the same way as handwritten character areas.
In the embodiment above, fluctuation peculiar to handwritten characters is detected for the horizontal and vertical line segments of the input image as the characteristic of handwritten characters, but fluctuation of diagonal line segments may additionally be detected.

The program described above can also be stored on a recording medium, in which case it can also be understood as, for example, the following inventions.
A computer-readable recording medium on which is recorded an image processing program that causes a computer to realize:
a feature extraction function for extracting a predetermined feature from an image;
a texture space generation function for generating a texture space in which the texture information of each pixel or each area is mapped according to the feature extracted by the feature extraction function;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function, a subspace corresponding to handwritten character areas or a subspace corresponding to printed character areas; and
an area determination function for determining a handwritten character area or a printed character area on the real image according to the subspace estimated by the subspace estimation function.

A computer-readable recording medium on which is recorded an image processing program that causes a computer to realize:
a feature extraction function for extracting a predetermined feature from an image;
a texture space generation function for generating a texture space in which the texture information of each pixel or each area is mapped according to the feature extracted by the feature extraction function;
an index setting function for setting, in the texture space generated by the texture space generation function, an index representing the likelihood of a handwritten character area or the likelihood of a printed character area;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function and according to the index set by the index setting function, a subspace corresponding to handwritten character areas or a subspace corresponding to printed character areas; and
an area determination function for determining a handwritten character area or a printed character area on the real image according to the subspace estimated by the subspace estimation function.

A "computer-readable recording medium on which a program is recorded" means a computer-readable recording medium holding a program, used for installing, executing, or distributing the program.
Examples of the recording medium include digital versatile discs (DVD), namely "DVD-R, DVD-RW, DVD-RAM, etc.", which are standards established by the DVD Forum, and "DVD+R, DVD+RW, etc.", which are standards established by DVD+RW; compact discs (CD), namely read-only memory (CD-ROM), CD recordable (CD-R), CD rewritable (CD-RW), and the like; magneto-optical disks (MO); flexible disks (FD); magnetic tape; hard disks; read-only memory (ROM); electrically erasable and rewritable read-only memory (EEPROM); flash memory; and random access memory (RAM).
The program or a part of it can be recorded on such a recording medium for storage, distribution, and so on. It can also be transmitted by communication, using a transmission medium such as a wired network used for a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), the Internet, an intranet, an extranet, or the like, a wireless communication network, or a combination of these, and it can also be carried on a carrier wave.
Furthermore, the program may be a part of another program, or may be recorded on a recording medium together with a separate program.

A block diagram showing a configuration example of an embodiment of the image processing system.
An explanatory diagram showing a document in which printed characters and handwritten characters are mixed.
A block diagram explaining the outline of the embodiment.
An explanatory diagram showing examples of Fourier transforms of a printed character image and a handwritten character image.
An explanatory diagram of an example of the hardware configuration of a computer that realizes the embodiment.
A flowchart showing the processing for recognizing a document image.

Explanation of symbols

21 ... Handwritten characters
31 ... Fourier transform module
32 ... Handwritten character degree evaluation module
33 ... Input image
34 ... Frequency domain image (texture feature)
35 ... Handwritten character degree (evaluation value) E
101 ... Image inclination correction module
102 ... Resolution conversion module
103 ... Handwritten character feature extraction module
104 ... Texture space generation module
105 ... Subspace estimation module
106 ... Character type determination module
107 ... Image binarization module
108 ... Layout analysis module
109 ... Handwritten character recognition module
110 ... Printed character recognition module

Claims

1. An image processing system comprising:
feature extraction means for extracting a predetermined feature amount from an image;
texture space generation means for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction means;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region; and
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result.
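
The feature amount recited in the claim above can be illustrated with a minimal sketch. The claim does not say which axis is the "predetermined axis" or how the distance is measured, so the code assumes the horizontal and vertical frequency axes and uses min(|u|, |v|) as the distance; the function names are hypothetical, and a naive DFT from the standard library stands in for whatever transform an actual implementation would use.

```python
import cmath
import math


def dft2(image):
    """Naive 2-D discrete Fourier transform of a list-of-lists image."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2.0 * math.pi * (u * y / rows + v * x / cols)
                    acc += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc
    return out


def signed_freq(k, n):
    """Map a DFT index k in [0, n) to a signed frequency centred on zero."""
    return k if k <= n // 2 else k - n


def axis_weighted_spectral_sum(image):
    """Sum of spectral magnitudes weighted by distance from the frequency axes.

    Assumption (not stated in the claim): the predetermined axis is taken to
    be the pair of axes u = 0 and v = 0, and the distance is min(|u|, |v|).
    """
    rows, cols = len(image), len(image[0])
    spectrum = dft2(image)
    total = 0.0
    for u in range(rows):
        for v in range(cols):
            dist = min(abs(signed_freq(u, rows)), abs(signed_freq(v, cols)))
            total += dist * abs(spectrum[u][v])
    return total


# Two synthetic 8x8 textures: stripes keep their energy on a frequency axis,
# while a checkerboard puts energy far from both axes.
stripes = [[y % 2 for x in range(8)] for y in range(8)]
checker = [[(x + y) % 2 for x in range(8)] for y in range(8)]
print(axis_weighted_spectral_sum(stripes))
print(axis_weighted_spectral_sum(checker))
```

Under these assumptions the striped texture scores approximately zero and the checkerboard scores about 128, so textures whose spectral energy lies away from the axes receive larger values. Real character images would need windowing and normalization that this sketch omits.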

2. An image processing system comprising:
feature extraction means for extracting a predetermined feature amount from an image;
texture space generation means for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction means;
index setting means for setting, in the texture space generated by the texture space generation means, an index representing likeness to a handwritten character region or likeness to a printed character region;
subspace estimation means for estimating, in the texture space generated by the texture space generation means and according to the index set by the index setting means, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region; and
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result.
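
One possible reading of the index setting, subspace estimation, and region determination steps in the claim above is sketched below. It assumes a one-dimensional texture space in which the index is the handwritten character degree E of each region (larger meaning more handwriting-like), and it estimates the two subspaces with a simple two-cluster split. The clustering method, the function names, and the region boxes are illustrative assumptions, not taken from the claim.

```python
def estimate_threshold(indices, iterations=20):
    """Split 1-D index values into two clusters and return the boundary.

    Assumption: a basic two-means split stands in for the subspace estimation.
    """
    lo, hi = min(indices), max(indices)
    for _ in range(iterations):
        threshold = (lo + hi) / 2.0
        low = [e for e in indices if e <= threshold]
        high = [e for e in indices if e > threshold]
        if not low or not high:
            break  # all values fall on one side; no second cluster exists
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # cluster centres have stopped moving
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2.0


def determine_regions(region_index):
    """Map each region box back to a character type on the actual image."""
    threshold = estimate_threshold(list(region_index.values()))
    return {
        box: ("handwritten" if e > threshold else "printed")
        for box, e in region_index.items()
    }


# Hypothetical regions given as (x, y, width, height) with their index E.
regions = {
    (0, 0, 64, 64): 1.0,
    (64, 0, 64, 64): 1.2,
    (0, 64, 64, 64): 9.0,
    (64, 64, 64, 64): 10.0,
}
for box, label in determine_regions(regions).items():
    print(box, label)
```

With these values the two low-index regions are labelled printed and the two high-index regions handwritten. If every region has the same index, this sketch labels them all as printed, which a practical system would have to handle separately.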

3. An image processing system comprising:
feature extraction means for extracting a predetermined feature amount from an image;
texture space generation means for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction means;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region;
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means; and
layout analysis means for analyzing a layout from the image using the handwritten character region or the printed character region determined by the region determination means,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result.

4. An image processing system comprising:
feature extraction means for extracting a predetermined feature amount from an image;
texture space generation means for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction means;
subspace estimation means for estimating, in the texture space generated by the texture space generation means, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region;
region determination means for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation means; and
layout analysis means for analyzing a layout from the image,
wherein the feature amount extracted by the feature extraction means is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result, and
wherein the region determination means determines the handwritten character region or the printed character region using the layout analyzed by the layout analysis means.

5. A character recognition system comprising handwritten character recognition means or printed character recognition means for recognizing characters in the handwritten character region or the printed character region determined by the region determination means of the image processing system according to claim 1, 2, 3, or 4.

6. The character recognition system according to claim 5, wherein the region determination means determines the handwritten character region or the printed character region also using the likelihood of the characters recognized by the handwritten character recognition means or by the printed character recognition means.
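
The claim above does not specify how the recognition likelihood enters the determination. As one hedged illustration, the sketch below lets the texture index decide when it is clearly on one side of the threshold and falls back to comparing the two recognizers' likelihoods when the index is ambiguous. The margin parameter, the fallback rule, and the function name are all assumptions made for this example.

```python
def determine_character_type(index, threshold, margin,
                             handwritten_likelihood, printed_likelihood):
    """Decide the character type of one region.

    Assumption: the texture index decides when it lies more than `margin`
    away from `threshold`; otherwise the recognizer reporting the higher
    likelihood decides.
    """
    if abs(index - threshold) > margin:
        return "handwritten" if index > threshold else "printed"
    if handwritten_likelihood >= printed_likelihood:
        return "handwritten"
    return "printed"


# Clear texture evidence: the recognizer likelihoods are ignored.
print(determine_character_type(9.0, 5.0, 1.0, 0.1, 0.9))
# Ambiguous texture evidence: the likelihoods break the tie.
print(determine_character_type(5.2, 5.0, 1.0, 0.3, 0.8))
```

Other combinations, such as a weighted sum of the index and the likelihoods, would fit the claim wording equally well.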

7. An image processing program causing a computer to realize:
a feature extraction function for extracting a predetermined feature amount from an image;
a texture space generation function for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction function;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region; and
a region determination function for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation function,
wherein the feature amount extracted by the feature extraction function is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result.

8. An image processing program causing a computer to realize:
a feature extraction function for extracting a predetermined feature amount from an image;
a texture space generation function for generating a texture space onto which texture information of each pixel or each region is mapped according to the feature amount extracted by the feature extraction function;
an index setting function for setting, in the texture space generated by the texture space generation function, an index representing likeness to a handwritten character region or likeness to a printed character region;
a subspace estimation function for estimating, in the texture space generated by the texture space generation function and according to the index set by the index setting function, a subspace corresponding to a handwritten character region or a subspace corresponding to a printed character region; and
a region determination function for determining a handwritten character region or a printed character region on the actual image according to the subspace estimated by the subspace estimation function,
wherein the feature amount extracted by the feature extraction function is a feature amount representing a handwritten character feature or a feature amount representing a printed character feature,
and is a sum of spectral intensities weighted by distance from a predetermined axis on the frequency plane of a two-dimensional Fourier transform result.