JPH0490082A

JPH0490082A - Device for detecting character direction in document

Info

Publication number: JPH0490082A
Application number: JP2203983A
Authority: JP
Inventors: Yutaka Nakamura (中村 豊)
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1990-08-02
Filing date: 1990-08-02
Publication date: 1992-03-24
Anticipated expiration: 2010-07-19
Also published as: TW206290B; KR940009712B1; KR920005021A; JPH0766413B2

Abstract

PURPOSE: To accurately detect the character direction of a document, even for an original that mixes halftones or graphics with text or that has a complex column layout, by finding unit areas (circumscribed rectangles), measuring the length (run length) of the blank areas (white runs) between the unit areas, and deciding the character direction from the distribution of appearance frequency. CONSTITUTION: A picture element area forming means 1 detects, as a unit area, each group of mutually related picture elements in an input image, e.g. the picture element group constituting one character or the picture element group constituting one table or graphic. A run length detecting means 2 finds the vertical and horizontal run lengths of the blank areas between adjacent unit areas. A run length frequency distribution feature extracting means 3 finds the distribution of run length appearance frequency in each of the vertical and horizontal directions from the run lengths found by the run length detecting means 2, and extracts a feature of each distribution, e.g. the peak value of the frequency. A direction deciding means 4 compares the vertical and horizontal features (peak values) found by the means 3 and decides whether the input document is written vertically or horizontally. The character direction of a document can thereby be detected accurately even for an original that mixes halftones or graphics or has a complex column layout.

Description

[Detailed description of the invention]

With conventional detection based on the peripheral distribution, the character direction could be determined for a simple column layout containing only characters. However, for a document containing halftones, figures and tables as shown in FIG. 13(b), or a document with a complicated column layout as shown in FIG. 13(c), it was difficult to determine the character direction from the peripheral distribution.

The present invention aims to solve these problems of the prior art. That is, its object is to provide a document character direction detection device that can accurately detect the character direction not only of a simple document containing only text, but also of a document in which halftones, figures and tables are mixed and of a document with a complicated column layout.

[Means for solving the problem]

In order to achieve the above object, the present invention comprises, as shown in FIG. 1: a pixel region forming means 1 for obtaining mutually related pixel groups in a binary image as unit areas; a run length detection means 2 for obtaining the vertical and horizontal run lengths of the blank areas between adjacent unit areas; a run length frequency distribution feature extraction means 3 for extracting features of the appearance frequency distributions of the vertical and horizontal run lengths; and a direction determining means 4 for determining the direction of the characters in the document by comparing the extracted features of the vertical and horizontal run length appearance frequency distributions.

[Operation]

FIG. 9 is an example of a document image; the document is composed of characters, halftones, figures, and the like. Focusing on the characters: characters form lines, and in horizontal writing the horizontal spacing between characters is narrower than the vertical spacing between characters (lines). Conversely, in vertical writing the vertical spacing between characters is narrower than the horizontal spacing between characters (lines). The present invention focuses on this characteristic and detects the character direction of a document by measuring character spacing.

The pixel region forming means 1 detects, as a unit area, each group of mutually related pixels in the input image, for example the pixel group constituting one character or the pixel group constituting one table or figure. One method of extracting a unit area from a pixel group is circumscribed rectangle processing, in which the circumscribed rectangle of the pixel group is obtained and used as the unit area. FIG. 10 shows an example of the result of converting pixel groups into unit areas (circumscribed rectangles).

Next, the run length detection means 2 obtains the vertical and horizontal run lengths of the blank areas between adjacent unit areas.

The run length frequency distribution feature extraction means 3 obtains, from the run lengths found by the run length detection means 2, the distribution of run length appearance frequency in each of the vertical and horizontal directions, and obtains a feature of each distribution, for example the peak value of the frequency.

The direction determining means 4 compares the vertical and horizontal appearance frequency features (peak values) obtained by the run length frequency distribution feature extraction means 3 and determines whether the input document is written vertically or horizontally. FIGS. 11 and 12 show examples of the appearance frequency distributions for horizontal and vertical writing, respectively. As is clear from these figures, the magnitude relationship between the frequency peaks of the horizontal and vertical white run lengths is reversed between horizontally written and vertically written documents, so the direction determining means 4 can make the determination by examining this relationship.

According to the present invention, the character direction can be detected accurately not only for a simple document containing only text, but also for a document in which halftones, figures and tables are mixed and for a document with a complicated column layout.
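The unit-area step described above can be illustrated with a minimal Python sketch. It is not the method of the patent: it uses a plain breadth-first flood fill over 8-connected black pixels as a stand-in for any conventional circumscribed-rectangle procedure. The function name `circumscribed_rectangles`, the 0/1 list-of-lists image format, and the choice of 8-connectivity are assumptions made for illustration.

```python
from collections import deque


def circumscribed_rectangles(image):
    """Return one (min_x, min_y, max_x, max_y) box per 8-connected group
    of black pixels (value 1). Coordinates are inclusive."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not image[y][x] or seen[y][x]:
                continue
            # Start a new pixel group and grow it with a flood fill.
            seen[y][x] = True
            queue = deque([(x, y)])
            min_x = max_x = x
            min_y = max_y = y
            while queue:
                cx, cy = queue.popleft()
                min_x, max_x = min(min_x, cx), max(max_x, cx)
                min_y, max_y = min(min_y, cy), max(max_y, cy)
                for ny in range(cy - 1, cy + 2):
                    for nx in range(cx - 1, cx + 2):
                        inside = 0 <= nx < w and 0 <= ny < h
                        if inside and image[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x, max_y))
    return boxes


# Hypothetical 5x4 page with three separate pixel groups.
page = [
    [1, 1, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
boxes = circumscribed_rectangles(page)
```

Each returned box plays the role of one unit area; the white gaps between such boxes are what the run length detection means measures.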

[Example]

Hereinafter, the present invention will be described in detail with reference to the embodiment shown in the drawings.

FIG. 2 shows the overall configuration of a document image processing apparatus according to an embodiment of the present invention. The apparatus comprises an image scanner 21 that scans and inputs a document image as a digital image, an image memory 22 that stores the input image for processing, a character direction detection unit 23 that detects the character direction of a document according to the present invention, a CPU 24 that controls the entire apparatus, a data bus 25, a monitor 26 that displays images and messages, a keyboard 27, an external storage device 28 that stores images and the like, and a printer 29 that outputs images.

FIG. 3 shows the configuration of the character direction detection unit 23 of FIG. 2. The character direction detection unit 23 comprises a circumscribed rectangle processing unit 30, a white run detection processing unit 31, a frequency distribution detection processing unit 32, a peak detection processing unit 33, and a character direction determination unit 34.

The circumscribed rectangle processing unit 30 is an example of the black pixel region forming means 1 of FIG. 1. It generates a rectangular area (unit area) that circumscribes a group of mutually related black pixels in the binary image, that is, a region consisting of a plurality of connected black pixels (for example, the region of each character). Any conventional method can be used for this rectangularization process. One conventional example encloses a black pixel group in a rectangle by tracing the contour of the image. In this method, tracing the connected components of black pixels in a black region yields the minimum X and Y coordinates and the maximum X and Y coordinates of the element containing the black pixel group, so that each black pixel group, whatever its structure, can be enclosed in a rectangle. The present inventor has proposed a method that improves on this conventional method, and the present applicant has filed an application for it (Japanese Patent Application No. Hei 1-87039). This improved technique is preferably used for the rectangularization process of the present invention.

An example of the rectangularization process of that application is briefly described below, using an image of the character "辺" as an example. FIG. 4 shows the character "辺" pixel by pixel.

First, black pixels are connected from the upper left to the lower right of this image using the mask pattern shown in FIG. 5(a): if the pixel above and the pixel to the left of the pixel of interest are both black, the pixel of interest is converted to a black pixel. FIG. 5(b) shows the result of this mask processing; an image in which black pixels are connected toward the right is obtained.

Next, black pixel connection processing is performed from the upper right to the lower left using the mask pattern shown in FIG. 6(a). The result is shown in FIG. 6(b). Conversely to the previous pass, an image in which black pixels are connected toward the left is obtained.

Next, black pixel connection processing is performed from the lower left to the upper right using the mask pattern shown in FIG. 7(a). The result is shown in FIG. 7(b). It can be seen that the whole character is now almost enclosed in a rectangular area.

Finally, black pixel connection processing is performed from the lower right to the upper left using the mask pattern shown in FIG. 8(a). The result is shown in FIG. 8(b).

By performing this series of four passes, the character area can be enclosed in a rectangle. Although the process has been explained using a character as an example, the same result holds for figures, tables, and halftones. FIG. 9 shows an example of a horizontally written document image, and FIG. 10 shows the image of FIG. 9 after circumscribed rectangle processing.
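The four mask passes can be sketched in Python as follows. Only the first mask is spelled out in the text (pixel above and pixel to the left both black); the masks of FIGS. 6(a) to 8(a) are not reproduced here, so the neighbour pairs used for passes two to four are assumed by symmetry with the scan direction. The names `propagate` and `four_pass_fill` and the 0/1 list-of-lists image format are likewise hypothetical.

```python
def propagate(img, row_order, col_order, dr, dc):
    """One in-place mask pass: a pixel becomes black when its vertical
    neighbour (row offset dr) and its horizontal neighbour (column
    offset dc) are both black."""
    h, w = len(img), len(img[0])
    for r in row_order:
        for c in col_order:
            rr, cc = r + dr, c + dc
            if 0 <= rr < h and 0 <= cc < w and img[rr][c] and img[r][cc]:
                img[r][c] = 1


def four_pass_fill(image):
    """Apply the four directional passes to a copy of a 0/1 image."""
    img = [row[:] for row in image]
    h, w = len(img), len(img[0])
    down, up = range(h), range(h - 1, -1, -1)
    right, left = range(w), range(w - 1, -1, -1)
    propagate(img, down, right, -1, -1)  # upper left to lower right: above + left
    propagate(img, down, left, -1, +1)   # upper right to lower left: above + right (assumed)
    propagate(img, up, right, +1, -1)    # lower left to upper right: below + left (assumed)
    propagate(img, up, left, +1, +1)     # lower right to upper left: below + right (assumed)
    return img


# Hypothetical L-shaped glyph; its circumscribed rectangle is the full 3x3 box.
glyph = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
filled = four_pass_fill(glyph)
```

Because each pass updates the image in place while scanning, a pixel blackened earlier in a pass can trigger its neighbours later in the same pass, which is what lets a concavity fill in one sweep.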
The white run detection processing unit 31 corresponds to the run length detection means 2 of FIG. 1. It scans the image obtained as described above in the horizontal and vertical directions and obtains the run lengths of the white runs. Any conventional method may be chosen for obtaining the run lengths.

The frequency distribution detection processing unit 32 obtains the distribution of appearance frequency from the white run lengths obtained by the white run detection processing unit 31. FIG. 11 is a graph of the run frequency distribution for a horizontally written document: FIG. 11(a) shows the frequency of the horizontal white run lengths, and FIG. 11(b) shows the frequency of the vertical white run lengths. FIG. 12 is a graph of the run frequency distribution for a vertically written document: FIG. 12(a) shows the frequency of the horizontal white run lengths, and FIG. 12(b) shows the frequency of the vertical white run lengths. As these figures show, for both vertically and horizontally written documents the distributions of the vertical and horizontal white run lengths differ, and moreover the distributions are reversed between vertically written and horizontally written documents. This difference makes it possible to determine whether a document is written vertically or horizontally.

In this embodiment, the feature of each distribution is captured as the run length of the white run at which the appearance frequency peaks. The peak detection processing unit 33 obtains, by comparing frequencies, the white run length that has the peak frequency. The processing section consisting of the frequency distribution detection processing unit 32 and the peak detection processing unit 33 corresponds to the run length frequency distribution feature extraction means 3 of FIG. 1.

The character direction determination unit 34 corresponds to the direction determining means 4 of FIG. 1.
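The white run measurement and peak extraction performed by units 31 to 33 can be sketched in Python as below. This is an illustrative sketch, not the implementation of the patent. Two choices are assumptions: white runs that touch the image border are ignored because they are margins rather than gaps between unit areas, and a tie between equally frequent run lengths is resolved toward the shorter one. All function names are hypothetical.

```python
from collections import Counter


def white_runs(line):
    """Lengths of white (0) runs that have a black (1) pixel on both sides."""
    runs, length, seen_black = [], 0, False
    for px in line:
        if px:
            if seen_black and length:
                runs.append(length)
            seen_black, length = True, 0
        elif seen_black:
            length += 1
    return runs


def run_length_histograms(image):
    """Appearance frequency of white run lengths, scanned along rows
    (horizontal) and along columns (vertical)."""
    horizontal = Counter(n for row in image for n in white_runs(row))
    vertical = Counter(n for col in zip(*image) for n in white_runs(col))
    return horizontal, vertical


def peak_run_length(histogram):
    """Run length with the highest frequency, or None for an empty histogram."""
    if not histogram:
        return None
    return min(histogram, key=lambda n: (-histogram[n], n))


# Hypothetical page after rectangle processing: 2x2 boxes separated by
# 1-pixel gaps within a line and 3-pixel gaps between lines.
box_row = [1, 1, 0, 1, 1, 0, 1, 1]
blank = [0] * 8
page = [box_row, box_row, blank, blank, blank, box_row, box_row]
h_hist, v_hist = run_length_histograms(page)
h_peak = peak_run_length(h_hist)  # most frequent horizontal gap
v_peak = peak_run_length(v_hist)  # most frequent vertical gap
```

On this toy page the horizontal peak is the shorter of the two, which is the pattern the description associates with horizontal writing.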
The character direction determination unit 34 compares the most frequent horizontal and vertical run lengths obtained by the peak detection processing unit 33. When (horizontal run length < vertical run length), it determines that the document is written horizontally; when (horizontal run length > vertical run length), it determines that the document is written vertically.

Because this embodiment is based on the white runs between characters, it is not affected by the column layout. Even a document with a complicated column layout such as that of FIG. 13(c), whose direction could not be determined by conventional peripheral distribution detection, can easily be judged. Moreover, since the white runs associated with halftones, figures and tables are longer than the normal character spacing, their frequencies lie away from the peak in the appearance frequency distribution and do not affect the vertical/horizontal determination. The character direction of the document can therefore be detected with high accuracy.

In the embodiment, white runs between rectangles are obtained for all circumscribed rectangles. However, as preprocessing before the white runs are obtained, rectangles above a certain size may be excluded from processing. This preprocessing avoids the influence of halftones, figures and tables on the character direction determination and collects only data that is effective for line direction detection, so measurement accuracy also improves.
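The decision rule and the optional size filter can be sketched in Python as follows. The text does not say what happens when the two peak run lengths are equal, nor what size counts as too large, so the "undetermined" result and the `max_width` / `max_height` parameters are assumptions, as are the function names.

```python
def decide_direction(horizontal_peak, vertical_peak):
    """Compare the most frequent horizontal and vertical white run lengths.

    A shorter horizontal peak means characters sit closer together along
    a row, i.e. horizontal writing; the opposite means vertical writing.
    """
    if horizontal_peak is None or vertical_peak is None:
        return "undetermined"  # assumed handling: no runs were measured
    if horizontal_peak == vertical_peak:
        return "undetermined"  # assumed handling: the text leaves ties open
    return "horizontal" if horizontal_peak < vertical_peak else "vertical"


def drop_large_rectangles(boxes, max_width, max_height):
    """Keep only boxes that fit within the given limits.

    Boxes are (min_x, min_y, max_x, max_y) with inclusive coordinates.
    """
    return [
        b for b in boxes
        if b[2] - b[0] + 1 <= max_width and b[3] - b[1] + 1 <= max_height
    ]


# Hypothetical values: a character-sized box and a figure-sized box.
boxes = [(0, 0, 1, 1), (0, 5, 9, 14)]
text_boxes = drop_large_rectangles(boxes, max_width=4, max_height=4)
direction = decide_direction(horizontal_peak=1, vertical_peak=3)
```

In practice the size limits would have to be tuned to the scan resolution and expected character size; the values above are placeholders.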

[Effect of the invention]

According to the present invention, unit areas (circumscribed rectangles) are obtained, the lengths (run lengths) of the blank areas (white runs) between the unit areas are measured, and the character direction is determined from the appearance frequency distribution. The character direction of a document can therefore be detected accurately not only for a simple document containing only text, but also for a document in which halftones, figures and tables are mixed and for a document with a complicated column layout.

[Brief explanation of drawings]

FIG. 1 is a block diagram showing the basic configuration of the present invention.
FIG. 2 is a diagram showing the configuration of a document image processing apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram showing the configuration of the character direction detection unit in FIG. 2.
FIG. 4 is a diagram showing an example of a pixel pattern of a character.
FIG. 5(a) is a diagram showing the mask and scanning direction when mask processing is performed from the upper left to the lower right, and FIG. 5(b) is a diagram showing the result of that mask processing.
FIG. 6(a) is a diagram showing the mask and scanning direction when mask processing is performed from the upper right to the lower left, and FIG. 6(b) is a diagram showing the result of that mask processing.
FIG. 7(a) is a diagram showing the mask and scanning direction when mask processing is performed from the lower left to the upper right, and FIG. 7(b) is a diagram showing the result of that mask processing.
FIG. 8(a) is a diagram showing the mask and scanning direction when mask processing is performed from the lower right to the upper left, and FIG. 8(b) is a diagram showing the result of that mask processing.
FIG. 9 is a diagram showing an example of a document image.
FIG. 10 is a diagram showing the image after circumscribed rectangle processing.
FIG. 11 is a diagram showing the run frequency distribution for a horizontally written document, where (a) shows the frequency distribution of horizontal white run lengths and (b) shows the frequency distribution of vertical white run lengths.
FIG. 12 is a diagram showing the run frequency distribution for a vertically written document, where (a) shows the frequency distribution of horizontal white run lengths and (b) shows the frequency distribution of vertical white run lengths.
FIG. 13 is a diagram for explaining the conventional method of detecting document character direction by peripheral distribution detection, where (a) shows the case of a simple column layout containing only characters, (b) shows the case of a document containing a halftone image, and (c) shows the case of a document with a complicated column layout.

1: pixel region forming means; 2: run length detection means; 3: frequency distribution feature extraction means; 4: direction determining means.

Patent applicant: Fuji Xerox Co., Ltd. Agents: patent attorneys 岩上昇[…], […]中隆秀, and Yoji Onodera (小野寺洋二).

Claims

[Scope of Claims] 1. A document character direction detection device comprising: black pixel region forming means for obtaining mutually related black pixel groups in a binary image as unit regions; run length detection means for obtaining the vertical and horizontal run lengths of the blank areas between adjacent unit regions; run length frequency distribution feature extraction means for extracting features of the appearance frequency distributions of the vertical and horizontal run lengths; and direction determination means for determining the direction of the characters in a document by comparing the extracted features of the vertical and horizontal run length appearance frequency distributions.