JPH05290166A - Segment recognition system

Info

Publication number: JPH05290166A
Application number: JP4120162A
Authority: JP
Inventors: Goro Bessho; 吾朗別所
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1992-04-13
Filing date: 1992-04-13
Publication date: 1993-11-05
Anticipated expiration: 2016-03-07
Also published as: JP3142950B2

Abstract

PURPOSE: To extract more accurate line segment data by recognizing the thickness of each line segment. CONSTITUTION: Only black runs of at least a certain length are extracted and stored in a black run memory 4. A rectangle integration part 5 integrates black runs that lie within certain ranges, set separately for the main scanning direction and the sub-scanning direction, into a rectangle enclosing all of those black runs, and stores the rectangle in a rectangle memory 6. A line segment extraction part 7 recognizes a line segment as a new rectangle enclosing all rectangles that lie within certain ranges, again set separately for the main scanning direction and the sub-scanning direction, and stores it in a line segment memory 8. A line segment thickness recognition part 9 outputs the thickness of the line segment based on the length of the extracted line segment rectangle in the direction perpendicular to the black runs.

Description

Detailed Description of the Invention

[0001]

[Technical Field] The present invention relates to a line segment recognition method, and more particularly to a line segment recognition method used in document image processing, for example in recognizing line segments in a binary image of a document such as a business form.

[0002]

[Prior Art] There is strong demand for recognizing line segments in a document image and inputting them as data for reuse. For example, when a table document output by a word processor is input, not only are the characters inside the table input by OCR (Optical Character Reader), but the ruled lines that form the table may also be input and reused at the same time. For this reason, methods have been adopted that recognize the ruled lines and input them as ruled-line data in addition to reading the characters in the table. These methods, however, focus on finding the coordinate values at which ruled lines exist and cannot obtain any information about line thickness, such as whether a line is thick, medium, or thin. Consequently, for documents containing lines of different thicknesses, the differences cannot be distinguished, and the demand for more faithful reproduction of the document cannot be met. The present invention therefore obtains the thickness of each line (thick, medium, thin, and so on) and incorporates this information so that line segments can be reproduced more faithfully to the original document.

[0003]

[Object] The present invention has been made in view of the circumstances described above. Its object is to provide a line segment recognition method that, when a document image processing apparatus extracts line segments from a document, recognizes the thickness of each line segment and thereby enables more accurate line segment data to be extracted.

[0004]

[Constitution] To achieve the above object, the present invention is characterized by the following.

(1) A line segment extraction method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; and line segment thickness recognition means for recognizing the thickness of the line segment from the length of the line segment rectangle in the direction perpendicular to the black runs.

(2) A line segment extraction method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; black pixel counting means for counting the number of black pixels inside the line segment rectangle; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by the length of the rectangle in the direction parallel to the black runs.

(3) A line segment extraction method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; black pixel counting means for counting the number of black pixels while the black runs are being integrated into the rectangle; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

(4) A line segment extraction method for extracting line segments from a binary document image, in which rectangles each enclosing a range of connected black runs are obtained, comprising: rectangle counting means for counting the values of the width and height of every rectangle and of the distance to its adjacent rectangle; line segment extraction means for recognizing a line segment by integrating the rectangles when the respective cumulative values exceed a certain value; black pixel counting means for counting the number of black pixels inside the line segment rectangle; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by a corrected value of the length in the direction parallel to the black runs.

(5) A line segment extraction method for extracting line segments from a binary document image, in which rectangles each enclosing a range of connected black runs are obtained, comprising: black pixel counting means for counting the number of black pixels while the black runs are being integrated into a rectangle; rectangle counting means for counting the values of the width and height of every rectangle and of the distance to its adjacent rectangle; line segment extraction means for recognizing a line segment by integrating the rectangles when the respective cumulative values exceed a certain value; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by a corrected value of the length in the direction parallel to the black runs.

(6) In (1), the line type is determined by thresholding the length of the line segment rectangle in the direction perpendicular to the black runs.

(7) In (2), the line type is determined by thresholding the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

(8) In (3), the line type is determined by thresholding the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

(9) In (4), the line type is determined by thresholding the value obtained by dividing the number of black pixels by the corrected value of the length in the direction parallel to the black runs.

(10) In (5), the line type is determined by thresholding the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

The invention is described below on the basis of embodiments.

[0005] FIG. 1 is a block diagram for explaining one embodiment (Embodiment 1) of the line segment recognition method according to the present invention. In the figure, 1 is a binary image input unit, 2 is a binary image memory, 3 is a black run extraction unit, 4 is a black run memory, 5 is a rectangle integration unit, 6 is a rectangle memory, 7 is a line segment extraction unit, 8 is a line segment memory, and 9 is a line segment thickness recognition unit.

A document or form is read with a binary image input device such as a scanner and stored in the binary image memory 2. From the read binary image, the black run extraction unit 3 extracts only black runs of at least a certain length and stores them in the black run memory 4 (thick solid lines in FIG. 6). In the rectangle integration unit 5, if the extracted black runs lie within certain ranges of one another, set separately for the main scanning direction or the sub-scanning direction, they are integrated into a rectangle enclosing all of those black runs, and the rectangle is stored in the rectangle memory 6 (dotted rectangles in FIG. 6). In the line segment extraction unit 7, if the extracted rectangles lie within certain ranges of one another, set separately for the main scanning direction or the sub-scanning direction, a line segment is recognized as a new rectangle enclosing all of those rectangles and is stored in the line segment memory 8 (solid rectangle in FIG. 6). The line segment thickness recognition unit 9 recognizes the thickness of the line segment from the length of the extracted line segment rectangle in the direction perpendicular to the black runs, and outputs that value as the line segment thickness.
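The flow of Embodiment 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the list-of-lists image layout (1 = black), the choice of horizontal runs, and the rule that both the horizontal and the vertical gap must be within their thresholds are all assumptions made for illustration. The threshold values are hypothetical.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive. Hypothetical layout.
Rect = Tuple[int, int, int, int]


def extract_black_runs(image: List[List[int]], min_length: int) -> List[Rect]:
    """Return each horizontal black run of at least min_length pixels as a one-row rectangle."""
    runs = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x] != 1:
                x += 1
                continue
            start = x
            while x < len(row) and row[x] == 1:
                x += 1
            if x - start >= min_length:
                runs.append((start, y, x, y + 1))
    return runs


def gap(a0: int, a1: int, b0: int, b1: int) -> int:
    """Pixels separating intervals [a0, a1) and [b0, b1); 0 if they touch or overlap."""
    return max(0, max(a0, b0) - min(a1, b1))


def merge_rects(rects: List[Rect], max_gap_x: int, max_gap_y: int) -> List[Rect]:
    """Merge rectangles whose gaps are within both thresholds into their bounding box.

    Assumption: a pair is merged only when BOTH gaps are within their thresholds.
    """
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                a, b = rects[i], rects[j]
                if (gap(a[0], a[2], b[0], b[2]) <= max_gap_x
                        and gap(a[1], a[3], b[1], b[3]) <= max_gap_y):
                    rects[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del rects[j]
                    changed = True
                    break
            if changed:
                break
    return rects


def perpendicular_thickness(segment: Rect) -> int:
    """Thickness as the rectangle's extent perpendicular to the (horizontal) runs."""
    return segment[3] - segment[1]


# A 3-pixel-thick horizontal line plus one short run that the length filter drops.
image = [
    [0] * 12,
    [0] + [1] * 10 + [0],
    [0] + [1] * 10 + [0],
    [0] + [1] * 10 + [0],
    [0, 1, 1] + [0] * 9,
]
runs = extract_black_runs(image, min_length=5)                 # black run memory
rectangles = merge_rects(runs, max_gap_x=2, max_gap_y=1)       # rectangle memory
segments = merge_rects(rectangles, max_gap_x=4, max_gap_y=1)   # line segment memory
print(segments, [perpendicular_thickness(s) for s in segments])
```

The same merging routine is reused for both stages here only to keep the sketch short; the text describes the two stages as separate units with separately set ranges.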

[0006] FIG. 2 is a diagram showing another embodiment (Embodiment 2) of the line segment recognition method according to the present invention. In the figure, 10 is a black pixel counting unit and 11 is a black pixel count memory; parts that function in the same way as in FIG. 1 are given the same reference numerals. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory is the same as in FIG. 1.

With reference to the coordinate values of the extracted line segment in the line segment memory 8, the number of black pixels is counted from the binary image memory 2 and stored in the black pixel count memory 11 (that is, the black pixels inside the line segment rectangle in FIG. 7 are counted). The line segment thickness recognition unit 9 refers to the line segment memory 8 and the black pixel count memory 11 and calculates the line segment thickness as follows. The thickness Thick is obtained from the black pixel count Pixel obtained above and the length Width of the line segment rectangle in the direction parallel to the black runs:

Thick = Pixel / Width

The value of Thick is output as the thickness of the line segment. When Thick is used, a value close to the actual line thickness, as in FIG. 7(b), is obtained even for a skewed document such as that in FIG. 7(a).
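To see why Thick = Pixel / Width helps on a skewed document, the following Python sketch builds a small synthetic example. The image, the rectangle layout (left, top, right, bottom with exclusive right and bottom), and the function names are assumptions for illustration only.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


def count_black_pixels(image: List[List[int]], rect: Rect) -> int:
    """Count black pixels (value 1) inside the rectangle."""
    left, top, right, bottom = rect
    return sum(image[y][x] for y in range(top, bottom) for x in range(left, right))


def area_thickness(image: List[List[int]], rect: Rect) -> float:
    """Thick = Pixel / Width, with Width measured parallel to the (horizontal) runs."""
    left, _, right, _ = rect
    return count_black_pixels(image, rect) / (right - left)


# A 2-pixel-thick line, 20 pixels long, that drops by one row halfway (simulated skew).
width, height = 20, 5
image = [[0] * width for _ in range(height)]
for x in range(width):
    top = 1 if x < 10 else 2
    image[top][x] = 1
    image[top + 1][x] = 1

segment = (0, 1, 20, 4)                     # bounding rectangle of the skewed line
bounding_height = segment[3] - segment[1]   # 3: overestimates the thickness
thick = area_thickness(image, segment)      # 2.0: matches the drawn thickness
print(bounding_height, thick)
```

In this toy case the bounding rectangle is 3 rows high because of the skew, while the pixel count divided by the width recovers the 2-pixel thickness that was actually drawn.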

[0007] FIG. 3 is a diagram showing yet another embodiment (Embodiment 3) of the line segment recognition method according to the present invention. From the read binary image, the black run extraction unit 3 extracts only black runs of at least a certain length and stores them in the black run memory 4. For each black run that can be integrated into a rectangle, the black pixel counting unit 10 accumulates its number of black pixels in the black pixel count memory 11. The rectangle integration unit 5 integrates the black runs and stores the result in the rectangle memory 6. The line segment thickness recognition unit 9 obtains the line segment thickness Thick from the black pixel count Pixel obtained above and the length Width of the line segment rectangle in the direction parallel to the black runs:

Thick = Pixel / Width

The value of Thick is output as the thickness of the line segment.
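The idea of counting black pixels while runs are being integrated, so that the image does not need to be scanned a second time, might look like the following Python sketch. The class name `SegmentAccumulator` and its fields are hypothetical, and the runs are assumed to be horizontal and given as (row, start, end) with an exclusive end.

```python
from dataclasses import dataclass


@dataclass
class SegmentAccumulator:
    """Bounding rectangle plus a running black pixel count (hypothetical helper)."""
    left: int
    top: int
    right: int    # exclusive
    bottom: int   # exclusive
    pixels: int

    @classmethod
    def from_run(cls, row: int, start: int, end: int) -> "SegmentAccumulator":
        """Start a rectangle from a single black run covering columns [start, end)."""
        return cls(start, row, end, row + 1, end - start)

    def add_run(self, row: int, start: int, end: int) -> None:
        """Grow the rectangle to enclose the run and add the run length to the count."""
        self.left = min(self.left, start)
        self.top = min(self.top, row)
        self.right = max(self.right, end)
        self.bottom = max(self.bottom, row + 1)
        self.pixels += end - start

    def thickness(self) -> float:
        """Thick = Pixel / Width using the accumulated count."""
        return self.pixels / (self.right - self.left)


# Runs of the same skewed 2-pixel line: 10 + 20 + 10 black pixels over a width of 20.
acc = SegmentAccumulator.from_run(1, 0, 10)
acc.add_run(2, 0, 20)
acc.add_run(3, 10, 20)
print(acc.pixels, acc.thickness())
```

One consequence of this design is that only pixels belonging to runs that passed the length filter are counted, so the result can differ slightly from a count taken over the whole rectangle.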

[0008] FIG. 4 is a diagram showing yet another embodiment (Embodiment 4) of the line segment recognition method according to the present invention. In the figure, 12 is a rectangle counting unit and 13 is a histogram memory; parts that function in the same way as in FIG. 2 are given the same reference numerals.

A document or form is read with a binary image input device such as a scanner and stored in the binary image memory 2. The black run extraction unit 3 extracts black runs from the read binary image and stores them in the black run memory 4. In the rectangle integration unit 5, extracted black runs that are connected to one another are integrated into a rectangle enclosing all of them, and the rectangle is stored in the rectangle memory 6. The rectangle counting unit 12 counts the values of the width and height of every rectangle and of the distance to its adjacent rectangle, obtains the cumulative value of each, and stores them in the histogram memory 13. In the line segment extraction unit 7, when the histogram obtained above exceeds a certain value, the rectangles are integrated to extract a line segment, which is stored in the line segment memory 8. With reference to the coordinate values of the extracted line segment in the line segment memory 8, the number of black pixels is counted from the binary image memory 2 and stored in the black pixel count memory 11. The line segment thickness recognition unit 9 refers to the line segment memory 8 and the black pixel count memory 11 and calculates the line segment thickness as follows. The thickness Thick is obtained from the black pixel count Pixel obtained above and the length Width of the line segment rectangle in the direction parallel to the black runs:

Thick = Pixel / (Width - CONST) (CONST is a constant)

This correction is for interrupted line segments such as dotted and broken lines. The value of Thick is output as the thickness of the line segment.
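The effect of the correction Thick = Pixel / (Width - CONST) can be checked with a short Python sketch. The text does not say how CONST is chosen; in this example it is simply set to the total gap length of a synthetic dashed line, which is an assumption made purely for illustration, as are the function name and the numbers.

```python
def corrected_thickness(pixels: int, width: int, const: int) -> float:
    """Thick = Pixel / (Width - CONST); const stands in for the constant CONST."""
    effective_width = width - const
    if effective_width <= 0:
        raise ValueError("const must be smaller than width")
    return pixels / effective_width


# Synthetic dashed line: three dashes of 6 pixels, two gaps of 3 pixels, 2 pixels thick.
dash_length, dash_count, gap_length, drawn_thickness = 6, 3, 3, 2
width = dash_length * dash_count + gap_length * (dash_count - 1)   # 24
pixels = dash_length * dash_count * drawn_thickness                # 36

uncorrected = pixels / width                                       # 1.5: too thin
corrected = corrected_thickness(pixels, width, const=6)            # 2.0
print(uncorrected, corrected)
```

Without the correction, the gaps in the dashed line dilute the pixel count and the line appears thinner than it was drawn; subtracting a suitable constant from the width compensates for that.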

[0009] FIG. 5 is a diagram showing yet another embodiment (Embodiment 5) of the line segment recognition method according to the present invention. The black run extraction unit 3 extracts black runs from the read binary image and stores them in the black run memory 4. For each black run that can be integrated into a rectangle, the black pixel counting unit 10 accumulates its number of black pixels in the black pixel count memory 11. The rectangle integration unit 5 integrates the black runs and stores the result in the rectangle memory 6. The line segment thickness recognition unit 9 obtains the line segment thickness Thick from the black pixel count Pixel obtained above and the length Width of the line segment rectangle in the direction parallel to the black runs:

Thick = Pixel / (Width - CONST) (CONST is a constant)

This correction is for interrupted line segments such as dotted and broken lines. The value of Thick is output as the thickness of the line segment.

[0010] Embodiment 6 is described next. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory 8 is the same as in Embodiment 1. The line segment thickness recognition unit 9 classifies the length of the extracted line segment rectangle in the direction perpendicular to the black runs into three line types (thick, medium, and thin) using, for example, two threshold levels, and outputs the line type.
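Classification with two threshold levels can be sketched in Python as follows. The threshold values and the function name are hypothetical; the text only states that two levels are used to separate thick, medium, and thin lines.

```python
def classify_line(thickness: float, thin_max: float, medium_max: float) -> str:
    """Map a thickness value to a line type using two thresholds (values are assumptions)."""
    if thin_max >= medium_max:
        raise ValueError("thin_max must be smaller than medium_max")
    if thickness <= thin_max:
        return "thin"
    if thickness <= medium_max:
        return "medium"
    return "thick"


# Hypothetical thresholds in pixels; real values would depend on the scan resolution.
THIN_MAX, MEDIUM_MAX = 1.5, 3.5
for value in (1, 2.0, 5):
    print(value, classify_line(value, THIN_MAX, MEDIUM_MAX))
```

The same function applies whether the input is the perpendicular rectangle length or a Thick value computed from the black pixel count.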

[0011] Embodiment 7 is described next. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory 8 is the same as in Embodiment 2. The value of the line segment thickness Thick is classified into three line types (thick, medium, and thin) using, for example, two threshold levels, and the line type is output.

[0012] Embodiment 8 is described next. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory 8 is the same as in Embodiment 3. The value of the line segment thickness Thick is classified into three line types (thick, medium, and thin) using, for example, two threshold levels, and the line type is output.

[0013] Embodiment 9 is described next. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory 8 is the same as in Embodiment 4. The value of the line segment thickness Thick is classified into three line types (thick, medium, and thin) using, for example, two threshold levels, and the line type is output.

[0014] Embodiment 10 is described next. The processing from reading the document with a binary image input device such as a scanner to extracting line segments and storing them in the line segment memory 8 is the same as in Embodiment 5. The value of the line segment thickness Thick is classified into three line types (thick, medium, and thin) using, for example, two threshold levels, and the line type is output.

[0015]

[Effects] As is clear from the above description, the present invention has the following effects.

(1) Embodiment 1: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image, so line segment data that is more accurate with respect to the original document can be extracted.

(2) Embodiment 2: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image; in particular, a more accurate value can be obtained for a skewed document.

(3) Embodiment 3: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image; in particular, a more accurate value can be obtained for a skewed document. In addition, because the black pixels are counted while the runs are being integrated, faster processing is possible.

(4) Embodiment 4: the line segment thickness can be obtained even for interrupted line segments such as dotted and broken lines, which widens the range of line types that can be handled.

(5) Embodiment 5: the line segment thickness can be obtained even for interrupted line segments such as dotted and broken lines, which widens the range of line types that can be handled. In addition, because the black pixels are counted while the runs are being integrated, faster processing is possible.

(6) Embodiment 6: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image, so line segment data that is more accurate with respect to the original document can be expressed.

(7) Embodiment 7: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image, so line segment data that is more accurate with respect to the original document can be expressed. In particular, more accurate line segment data can be expressed for a skewed document.

(8) Embodiment 8: when line segments are extracted from a document, the thickness of each line segment can be output as an absolute value on the actual image, so line segment data that is more accurate with respect to the original document can be expressed. In particular, more accurate line segment data can be expressed for a skewed document. In addition, because the black pixels are counted while the runs are being integrated, faster processing is possible.

(9) Embodiment 9: the line segment thickness can be obtained even for interrupted line segments such as dotted and broken lines, which widens the range of line types that can be handled.

(10) Embodiment 10: the line segment thickness can be obtained even for interrupted line segments such as dotted and broken lines, which widens the range of line types that can be handled. In addition, because the black pixels are counted while the runs are being integrated, faster processing is possible.

[Brief description of drawings]

FIG. 1 is a block diagram for explaining one embodiment of the line segment recognition method according to the present invention.

FIG. 2 is a diagram showing another embodiment of the line segment recognition method according to the present invention.

FIG. 3 is a diagram showing yet another embodiment of the line segment recognition method according to the present invention.

FIG. 4 is a diagram showing yet another embodiment of the line segment recognition method according to the present invention.

FIG. 5 is a diagram showing yet another embodiment of the line segment recognition method according to the present invention.

FIG. 6 is an explanatory diagram of the line segment thickness in FIG. 1.

FIG. 7 is an explanatory diagram of the line segment thickness in FIG. 2.

[Explanation of symbols]

1: binary image input unit, 2: binary image memory, 3: black run extraction unit, 4: black run memory, 5: rectangle integration unit, 6: rectangle memory, 7: line segment extraction unit, 8: line segment memory, 9: line segment thickness recognition unit.

Claims

[Claims]

1. A line segment recognition method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; and line segment thickness recognition means for recognizing the thickness of the line segment from the length of the line segment rectangle in the direction perpendicular to the black runs.

2. A line segment recognition method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; black pixel counting means for counting the number of black pixels inside the line segment rectangle; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

3. A line segment recognition method for extracting line segments from a binary document image, comprising: black run extraction means for extracting black runs of at least a certain length; integration means for integrating the black runs into a rectangle enclosing all of them when the black runs lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; black pixel counting means for counting the number of black pixels while the black runs are being integrated into the rectangle; line segment extraction means for recognizing a line segment as a new rectangle enclosing all of the rectangles when the rectangles lie within certain ranges set separately for the main scanning direction or the sub-scanning direction; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by the length in the direction parallel to the black runs.

4. A line segment recognition method for extracting line segments from a binary document image, in which rectangles each enclosing a range of connected black runs are obtained, comprising: rectangle counting means for counting the values of the width and height of every rectangle and of the distance to its adjacent rectangle; line segment extraction means for recognizing a line segment by integrating the rectangles when the respective cumulative values exceed a certain value; black pixel counting means for counting the number of black pixels inside the line segment rectangle; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by a corrected value of the length in the direction parallel to the black runs.

5. A line segment recognition method for extracting line segments from a binary document image, in which rectangles each enclosing a range of connected black runs are obtained, comprising: black pixel counting means for counting the number of black pixels while the black runs are being integrated into a rectangle; rectangle counting means for counting the values of the width and height of every rectangle and of the distance to its adjacent rectangle; line segment extraction means for recognizing a line segment by integrating the rectangles when the respective cumulative values exceed a certain value; and line segment thickness recognition means for recognizing the thickness of the line segment from the value obtained by dividing the number of black pixels by a corrected value of the length in the direction parallel to the black runs.