JPS627590B2

Info

Publication number: JPS627590B2
Application number: JP54068416A
Authority: JP
Inventors: Takahiko Chuma; Toshiaki Katahira
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1979-05-31
Filing date: 1979-05-31
Publication date: 1987-02-18
Also published as: JPS55162176A

Description

[Detailed Description of the Invention] The object of the present invention is to provide a novel image extraction method that can extract line figures and characters separately from a document on which line figures, such as vertical and horizontal lines, and characters are drawn, as seen in slips and similar forms, even when the line figures and characters intersect each other.

In the following, the characters, symbols, and numerals drawn on a document are collectively called "characters" to distinguish them from other line figures such as frame lines.

Methods for extracting line figures and characters, or characters only, from the image information of a document such as a slip on which both are drawn already exist. These methods, however, presuppose that the line figures and characters on the document do not intersect. Relying on that condition, they distinguish line figures from characters by the area occupied by the connected region at the point where the continuity of the black-level signal is interrupted, so they cannot be applied to documents in which line figures and characters intersect. In practice, however, cases such as characters written beyond the frame lines of a slip occur frequently, and this has been a very difficult problem for image extraction processing.

The present invention is an image extraction method that can extract line figures and characters separately even where they intersect on the document. One embodiment of a configuration for realizing the invention is described below with reference to the drawings.

In Fig. 1, 1 is a scanning unit, which scans the document line-sequentially, extracts the image information on the document pixel by pixel as a binary digital signal, and stores it in the first storage unit 2 described later. Hereinafter, the direction of the line along which image information is extracted in a single scan is called the main scanning direction, and the direction orthogonal to it is called the sub-scanning direction.

2 is the first storage unit, which stores the binary signals extracted by scanning unit 1 in scanning order and has a capacity of at least one page of the document.

3 is a black run detection unit, which checks the binary signals in first storage unit 2 and finds black runs that may be part of a vertical or horizontal line.

4 is a line extraction unit. It checks the binary signals around the black runs that black run detection unit 3 has found as possible parts of vertical or horizontal lines, leaving information in the second storage unit 5 described later as it proceeds. It extracts the vertical and horizontal lines and stores information about them in the third storage unit 8 described later.

5 is the second storage unit, which stores the information that line extraction unit 4 is currently processing.

6 is a memory correction unit. It erases from first storage unit 2 the image signals corresponding to the vertical and horizontal lines extracted by line extraction unit 4, and replenishes in first storage unit 2 the image signals of the portions where those lines had intersected characters and other images.

7 is a character extraction unit, which extracts the remaining characters and the like from the image signals in first storage unit 2 after memory correction unit 6 has processed them, and stores information about them in the third storage unit 8 described later.

8 is the third storage unit, which stores information about the vertical and horizontal lines extracted by line extraction unit 4 and the characters and the like extracted by character extraction unit 7.

Fig. 2 is an example of image information on a document, used to explain this embodiment. The grid squares represent pixel boundaries; a₁ to a₃₅ are addresses in the main scanning direction and b₁ to b₃₅ are addresses in the sub-scanning direction, although only every fifth address is labeled to keep the figure uncluttered.

The processing method of this embodiment is described concretely below.

In this embodiment, the following restrictions are placed on the properties of the image on the document. Vertical and horizontal lines are 2 to 3 pixels thick, and each pixel column or pixel row forming such a line is at least 10 pixels long. In addition, the lower limit on the length of a vertical line in the sub-scanning direction and of a horizontal line in the main scanning direction, and the upper limit on the size of a character in both the main and sub-scanning directions, are set to 20 pixels. These restrictions are set provisionally in order to explain the processing concretely and do not limit the invention; in operation they should be set according to the properties of the actual documents, the performance of the scanning unit, and so on.
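As an illustration only, the limits stated for this embodiment could be gathered into a small configuration object so that they can be tuned per document type. The class name `LineConstraints` and its field names are hypothetical and do not appear in the patent; only the numeric values come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineConstraints:
    """Hypothetical container for the embodiment's limits (all in pixels)."""
    min_thickness: int = 2        # thinnest vertical or horizontal line
    max_thickness: int = 3        # thickest vertical or horizontal line
    min_element_length: int = 10  # shortest pixel column/row that can form a line
    min_line_length: int = 20     # lower limit on overall line length
    max_char_size: int = 20       # upper limit on character size, either direction

    def is_line_thickness(self, n: int) -> bool:
        """True if a run of n pixels could be the cross-section of a line."""
        return self.min_thickness <= n <= self.max_thickness

    def is_line_element(self, n: int) -> bool:
        """True if a run of n pixels is long enough to be a line element."""
        return n >= self.min_element_length


limits = LineConstraints()
```

Keeping the values in one place reflects the text's remark that they should be chosen from the properties of the actual documents and the scanner rather than fixed.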

First, scanning unit 1 scans the document line-sequentially and extracts the image information on the document pixel by pixel as a binary digital signal, and one page of this signal is stored in first storage unit 2 in scanning order.

Next, black run detection unit 3 examines the image signals in first storage unit 2 one main-scanning line at a time, checking for black runs of length 2 to 3, which correspond to the thickness of a vertical line, and for row elements of length 10 or more, which form a horizontal line. In the document example of Fig. 2, the black run of length 2 from a₂₈ to a₂₉ on line b₁ is detected as the first vertical line candidate, and its address information is passed to line extraction unit 4.
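The per-line check can be sketched as a run-length scan. This is a minimal illustration, not the circuit of the embodiment: it assumes a scan line is a Python list with 1 for black and 0 for white, uses 0-based indices (so a₂₈ to a₂₉ become 27 to 28), and the function names `black_runs` and `candidates` are hypothetical.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # (start, end), inclusive, 0-based pixel addresses


def black_runs(line: List[int]) -> List[Run]:
    """Return every maximal run of black (1) pixels in one scan line."""
    runs: List[Run] = []
    start = None
    for i, px in enumerate(line):
        if px and start is None:
            start = i                    # a run begins
        elif not px and start is not None:
            runs.append((start, i - 1))  # the run ended on the previous pixel
            start = None
    if start is not None:                # run reaches the end of the line
        runs.append((start, len(line) - 1))
    return runs


def candidates(line: List[int], thickness=(2, 3), min_element=10):
    """Split runs into vertical-line and horizontal-line candidates."""
    vertical, horizontal = [], []
    for start, end in black_runs(line):
        length = end - start + 1
        if thickness[0] <= length <= thickness[1]:
            vertical.append((start, end))    # as thick as a vertical line
        elif length >= min_element:
            horizontal.append((start, end))  # long enough for a horizontal line
    return vertical, horizontal
```

For a 35-pixel line that is black only at indices 27 and 28, `candidates` reports `(27, 28)` as a vertical candidate, matching the first candidate on line b₁ in the text.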

Line extraction unit 4 traces the continuity of each vertical or horizontal line candidate detected by black run detection unit 3 into its surroundings, working successively through the lines before and after the line on which the black run was detected. It judges whether the candidate falls within the restrictions on vertical and horizontal lines given above, and extracts those that do as vertical or horizontal lines.

In the example of Fig. 2, black run detection unit 3 has detected the run a₂₈ to a₂₉ on line b₁ as the first vertical line candidate, so the pixel columns at main-scanning addresses a₂₈ and a₂₉ are checked in order starting from sub-scanning address b₂. In pixel column a₂₉, black continues from b₁ to b₂₄, which satisfies the condition of a pixel column of length 10 or more forming a vertical line. In pixel column a₂₈, black continues from b₁ to b₁₅, which also satisfies the condition. Because the black in column a₂₈ is interrupted at b₁₅, pixel column a₃₀, on the opposite side of column a₂₉ from a₂₈, is checked; black continues there from b₁₃ to b₂₄, again satisfying the length condition. It is thus established that between b₁ and b₂₄ the thickness stays in the range 2 to 3 and the length in the sub-scanning direction meets the lower limit of 20. The first vertical line candidate is therefore judged to be a vertical line, and its start address (a₂₈, b₁) and end address (a₃₀, b₂₄) are stored in third storage unit 8. Second storage unit 5 is used during this processing; once the candidate is judged to be a vertical line, only the start and end points of the three pixel columns forming it are retained: a₂₈ with b₁ and b₁₅, a₂₉ with b₁ and b₂₄, and a₃₀ with b₁₃ and b₂₄.
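The column tracing for a vertical candidate can be sketched as follows. This is a simplified illustration under stated assumptions, not the embodiment's actual procedure: the image is a list of rows with 1 for black, indices are 0-based, only one neighbor column on each side of the candidate is examined, and the names `column_runs` and `extract_vertical` are hypothetical.

```python
def column_runs(img, col):
    """Maximal black runs (top, bottom), inclusive, in one pixel column."""
    runs, start = [], None
    for r in range(len(img)):
        if img[r][col] and start is None:
            start = r
        elif not img[r][col] and start is not None:
            runs.append((start, r - 1))
            start = None
    if start is not None:
        runs.append((start, len(img) - 1))
    return runs


def extract_vertical(img, row, start, end,
                     thickness=(2, 3), min_element=10, min_length=20):
    """Return {column: (top, bottom)} if the candidate is a vertical line, else None."""
    width = len(img[0])
    elements = {}
    # Columns of the candidate run: keep those that continue for >= min_element.
    for col in range(start, end + 1):
        for top, bottom in column_runs(img, col):
            if top <= row <= bottom and bottom - top + 1 >= min_element:
                elements[col] = (top, bottom)
    if not elements:
        return None
    top = min(t for t, _ in elements.values())
    bottom = max(b for _, b in elements.values())
    # One neighbor column on each side may take over where a column breaks off.
    for col in (start - 1, end + 1):
        if 0 <= col < width:
            for t, b in column_runs(img, col):
                if b - t + 1 >= min_element and t <= bottom and b >= top:
                    elements[col] = (t, b)
    top = min(t for t, _ in elements.values())
    bottom = max(b for _, b in elements.values())
    if bottom - top + 1 < min_length:
        return None  # shorter than the lower limit for a line
    # Thickness must stay within range on every row of the line.
    for r in range(top, bottom + 1):
        n = sum(1 for t, b in elements.values() if t <= r <= b)
        if not thickness[0] <= n <= thickness[1]:
            return None
    return elements
```

With columns 27, 28, and 29 black over rows 0 to 14, 0 to 23, and 12 to 23 respectively (the 0-based counterpart of the Fig. 2 line), the function returns those three column spans, which correspond to the start and end points retained in second storage unit 5.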

Memory correction unit 6 reads from second storage unit 5 the information on the element pixel columns of the vertical line stored in third storage unit 8, and erases the corresponding image signals in first storage unit 2. It then checks the image signals at the addresses immediately to the left and right of the erased element pixel columns on the document; where the line is judged to have intersected the remaining image, the image signals of the intersecting portion are set to black. In the example of Fig. 2, the portions adjacent to the vertical line on the left and right are pixel columns a₂₇ and a₃₀ between b₁ and b₁₂, a₂₇ and a₃₁ between b₁₃ and b₁₅, and a₂₈ and a₃₁ between b₁₆ and b₂₄. If there is black in these portions on the left side of the vertical line, the unit checks whether there is black on the right side of the line in three pixel rows: that pixel row and the rows before and after it. If such a black pair exists, the part of the former vertical line sandwiched between the pair is corrected to black.

In the example of Fig. 2, scanning pixel column a₂₇ between b₁ and b₁₂ finds a black signal at pixel row b₈. Checking the intersections of pixel row b₈ and its neighbors b₇ and b₉ with pixel column a₃₀ on the opposite side of the line, namely (a₃₀, b₇), (a₃₀, b₈), and (a₃₀, b₉), shows that (a₃₀, b₈) is black. The points (a₂₈, b₈) and (a₂₉, b₈), sandwiched between the black pair (a₂₇, b₈) and (a₃₀, b₈), are corrected to black. The black signal at the intersection of the vertical line and the remaining image, which had been erased as part of the vertical line, is thereby restored. This completes the processing of the vertical line that begins with the black run a₂₈ to a₂₉ on line b₁; processing then returns to black run detection unit 3.
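The erase-then-restore step for a vertical line can be sketched as below. This is a rough illustration under assumptions of my own: the image is a mutable list of rows with 1 for black, indices are 0-based, the line is given as `{column: (top, bottom)}`, and only pixels that were actually erased are turned black again. The function name `erase_and_restore_vertical` is hypothetical.

```python
def erase_and_restore_vertical(img, elements):
    """Erase a vertical line in place, then restore pixels where a stroke crossed it."""
    height, width = len(img), len(img[0])
    erased = set()
    # Step 1: erase every pixel of every element column.
    for col, (top, bottom) in elements.items():
        for r in range(top, bottom + 1):
            img[r][col] = 0
            erased.add((r, col))
    # Group the erased columns by row to find the line's extent on each row.
    by_row = {}
    for r, c in erased:
        by_row.setdefault(r, []).append(c)
    # Step 2: look for a black pixel just left of the line and a matching one
    # just right of it, on the same row or on the row above or below.
    for r in sorted(by_row):
        left = min(by_row[r]) - 1
        right = max(by_row[r]) + 1
        if left < 0 or right >= width or not img[r][left]:
            continue
        for r2 in (r - 1, r, r + 1):
            if not 0 <= r2 < height:
                continue
            # Ignore line pixels themselves when looking for the partner.
            if img[r2][right] and (r2, right) not in erased:
                lo, hi = min(r, r2), max(r, r2)
                for rr in range(lo, hi + 1):
                    for c in range(left + 1, right):
                        if (rr, c) in erased:
                            img[rr][c] = 1  # restore the crossing
                break
```

Applied to the 0-based counterpart of Fig. 2, with a stroke on row 7 crossing the line, the pixels at columns 27 and 28 on row 7 are turned black again while the rest of the line stays erased.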

Black run detection unit 3 next detects a black run of length 2 between a₂₁ and a₂₂ on line b₅, as a black run in the 2-to-3 range corresponding to the thickness of a vertical line, and passes it to line extraction unit 4 as the second vertical line candidate.

Line extraction unit 4 traces the pixel columns at main-scanning addresses a₂₁ and a₂₂ of this black run. Pixel column a₂₁ is found to start at b₅ and end at b₁₀, with length 4, and pixel column a₂₂ to start at b₅ and end at b₁₁, with length 5; neither qualifies as an element pixel column of a vertical line, which requires a length of 10 or more. The same holds for pixel columns a₂₀, a₁₉, and a₁₈, which appear to the left of a₂₁, and for a₂₃, a₂₄, a₂₅, and a₂₆, which appear to the right.

Next, the extraction of horizontal lines is described.

On line b₈, black run detection unit 3 detects a black pixel row of length 14 starting at a₁₈ and ending at a₃₁. Since this satisfies the condition for an element of a horizontal line, a pixel row of length 10 or more, it is passed to line extraction unit 4 as the first horizontal line candidate.

Line extraction unit 4 checks whether the lines before and after line b₈, which contains the first horizontal line candidate, also contain black runs of length 10 or more. Neither line b₇ nor line b₉ does, so the first horizontal line candidate is judged not to be part of a horizontal line.
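The rejection test for a horizontal candidate can be sketched very roughly as a check on the two adjacent scan lines. This is a simplification of my own: it asks only whether either neighboring line contains some black run of length 10 or more and does not verify that the run overlaps the candidate horizontally. The image is assumed to be a list of rows with 1 for black and 0-based indices; `longest_black_run` and `has_supporting_neighbor` are hypothetical names.

```python
def longest_black_run(line):
    """Length of the longest run of black (1) pixels in one scan line."""
    best = current = 0
    for px in line:
        current = current + 1 if px else 0
        best = max(best, current)
    return best


def has_supporting_neighbor(img, row, min_element=10):
    """True if the line above or below `row` also holds a long black run."""
    neighbors = [r for r in (row - 1, row + 1) if 0 <= r < len(img)]
    return any(longest_black_run(img[r]) >= min_element for r in neighbors)
```

A single long stroke of a character, one pixel thick, fails this test because its neighbors hold no long run, which is how the first horizontal candidate on line b₈ is rejected in the text.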

Next, black run detection unit 3 detects on line b₃₁ a black run of length 17 starting at a₁₂ and ending at a₂₈, and passes it to line extraction unit 4 as the second horizontal line candidate.

Line extraction unit 4 checks lines b₃₀ and b₃₂, before and after line b₃₁ containing the second horizontal line candidate, for black runs of length 10 or more. It detects a black run of length 28 on line b₃₂, starting at a₁ and ending at a₂₈, and a further black run on the next line b₃₃, starting at a₁ and ending at a₁₃. The pixel rows on lines b₃₁, b₃₂, and b₃₃ thus each have a length of 10 or more, the thickness is 2 to 3 between main-scanning addresses a₁ and a₂₈, and the overall length of 28 meets the lower limit of 20 for a horizontal line. The candidate is therefore judged to be a horizontal line, and its start point (a₁, b₃₃) and end point (a₂₈, b₃₁) are stored in third storage unit 8. As with vertical lines, once the second horizontal line candidate has been judged to be a horizontal line, second storage unit 5 retains the start and end points of the three pixel rows forming it: b₃₁ with a₁₂ and a₂₈, b₃₂ with a₁ and a₂₈, and b₃₃ with a₁ and a₁₃.

Memory correction unit 6 reads the information on the element pixel rows of the extracted horizontal line from second storage unit 5 and erases the corresponding image signals in first storage unit 2. Then, as with vertical lines, it checks the image signals of the portions adjacent to the erased horizontal line above and below. In Fig. 2 these are pixel rows b₃₁ and b₃₄ for main-scanning addresses a₁ to a₁₁, pixel rows b₃₀ and b₃₄ for a₁₂ to a₁₃, and pixel rows b₃₀ and b₃₃ for a₁₄ to a₂₈. First, the point (a₁₆, b₃₀) is black, so the points adjacent to the horizontal line on the opposite side, at main-scanning address a₁₆ and its neighbors a₁₅ and a₁₇, are checked, but none is black. Next, the point (a₁₇, b₃₀) is black, so the three pixels (a₁₆, b₃₃), (a₁₇, b₃₃), and (a₁₈, b₃₃) are checked in the same way, and (a₁₈, b₃₃) is found to be black. The four pixels sandwiched between (a₁₇, b₃₀) and (a₁₈, b₃₃), namely (a₁₇, b₃₁), (a₁₈, b₃₁), (a₁₇, b₃₂), and (a₁₈, b₃₂), are therefore corrected to black. The black signal at the intersection of the horizontal line and the remaining image, which had been erased as part of the horizontal line, is thereby restored.

The vertical and horizontal lines on the document are thus extracted and their address information is stored in second storage unit 5; the image signals corresponding to them are erased from first storage unit 2; and wherever a vertical or horizontal line is judged to have intersected another image, the image signals of that portion are replenished. Character extraction unit 7 then reads the resulting image signals from first storage unit 2, extracts the characters, and stores the information in third storage unit 8.

With the image extraction method according to the present invention, as the embodiment above shows, vertical and horizontal lines and characters can be extracted separately even when line figures and characters intersect on the document, by setting restrictions on the thickness and length of the vertical and horizontal lines and making use of those restrictions. The restrictions can be set appropriately from the properties of the images on the documents to be handled and the performance of the scanning unit, so the invention provides a method that is highly effective for practical image extraction processing.

[Brief Description of the Drawings]

Fig. 1 is a block diagram showing one embodiment for realizing the image extraction method according to the present invention, and Fig. 2 is a diagram showing an example of image information on a document, used to explain an example of the processing. 1: scanning unit, 2: first storage unit, 3: black run detection unit, 4: line extraction unit, 5: second storage unit, 6: memory correction unit, 7: character extraction unit, 8: third storage unit.

Claims

[Claims]

1. An image extraction method characterized in that: the image information of a document on which line figures and characters are drawn is stored in a memory as binary digital signals in pixel units; conditions on the thickness and length of line figures are set, and images satisfying the conditions are extracted as line figures; the binary signals corresponding to the image information of the line figures are erased from the memory; binary signals corresponding to the image information of the portions where a line figure and another image intersected are added to the memory; and characters are then extracted from the stored image information.