JPS5932077A - Character segmenting device - Google Patents

Character segmenting device

Info

Publication number
JPS5932077A
Authority
JP
Japan
Prior art keywords
black
change position
characters
character
detected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP57142364A
Other languages
Japanese (ja)
Inventor
Akira Sakurai
彰 桜井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP57142364A
Publication of JPS5932077A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)

Abstract

PURPOSE: To segment characters accurately by segmenting a read pattern row by row, storing it, counting the number of black picture elements and the number of groups of continuous black picture elements in each vertical column, and obtaining the differences between count values and the way those differences change. CONSTITUTION: A pattern read from a slip or the like is segmented row by row (1), supplied to a character segmenting device (2), and stored in a picture memory (11). The number of black picture elements in each vertical column is counted (12), and the number of groups of continuous black picture elements (black runs) is counted (13). From the output of the counter (12), a projection block is detected (14) and positions where the count value changes suddenly are detected (15). From the output of the counter (13), area change positions are detected (16). The segmenting position is decided (17) from these detected positions, and the pattern in the memory (11) is segmented at a segmenting part (18) and sent to a picture memory (3). For instance, in the 11th and 12th columns shown in the figure the black run count is constant and the number of black picture elements is approximately equal to the line width, so these columns are taken as a segmenting position candidate. Characters can thus be segmented even when there is no space between them.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field]

The present invention relates to a character segmenting device that cuts a read pattern, such as characters read from a form or the like, into individual characters.

[Prior Art]

Conventional character segmenting devices use the segmentation methods shown in FIG. 1.

① A method that detects the presence or absence of black pixels on each scanning line and cuts out each portion in which black pixels continue as one character.

② A method that counts the number of black pixels on each scanning line to obtain a projection density distribution and cuts out characters based on that distribution.

③ A method that, for characters with a constant character pitch, cuts out characters using that pitch.

However, as shown in FIG. 2, when there is no gap between characters, as with connected characters, methods ① and ② above cannot separate them, and when the character pitch is variable, method ③ above cannot perform the desired segmentation.
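The limitation of the projection-based approach can be illustrated with a small sketch. The function name `projection_cut` and the two toy images are assumptions made for illustration and do not come from the patent; the code only shows that cutting at empty columns returns one segment when two characters touch.

```python
def projection_cut(image):
    """Split a binary line image (list of rows, 1 = black) at empty columns.

    Returns a list of (first_column, last_column) pairs, one per segment.
    """
    width = len(image[0])
    # Projection profile: number of black pixels in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]
    segments, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                        # a segment begins
        elif count == 0 and start is not None:
            segments.append((start, x - 1))  # a segment ends at a gap
            start = None
    if start is not None:
        segments.append((start, width - 1))
    return segments


# Two shapes separated by an empty column.
separated = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 0, 1],
]
# The same shapes joined by a thin stroke in the bottom row.
touching = [
    [1, 1, 0, 1, 1],
    [1, 1, 1, 1, 1],
]

print(projection_cut(separated))  # two segments
print(projection_cut(touching))   # one segment: the gap has disappeared
```

In the second image no column is empty, so a gap-based cut has nothing to work with; this is the situation the device described here is meant to handle.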

[Purpose]

The present invention has been made in view of the above circumstances, and its object is to provide a character segmenting device that can perform the desired segmentation even for characters whose pitch varies and between which there is no gap.

[Embodiment]

An embodiment of the present invention is shown in FIG. 3. Characters written on a form or the like are read by a reading unit (not shown), and the line segmenting unit 1 cuts one line out of the read pattern. The line-segmented read pattern is stored in the image memory 11 of the character segmenting device 2, where it is separated into individual characters. The black pixel counter 12 counts the number of black pixels on each vertical scanning line of the read pattern. The black run counter 13 counts, for each scanning line, the number of groups of consecutive black pixels (called "black runs"). The black pixel counter 12 and the black run counter 13 may be implemented as a single counter. Using the black pixel count for each scanning line obtained by the black pixel counter 12, the projection block detection unit 14 detects "projection blocks" and the sudden change position detection unit 15 detects "sudden change positions". The area change position detection unit 16 detects "area change positions" using the black run count for each scanning line obtained by the black run counter 13. "Projection block", "sudden change position", and "area change position" are explained later. Using the detection results of the detection units 14, 15, and 16, the segmentation determination unit 17 determines the segmentation position (details are given later), and the segmenting unit 18 cuts the pattern at the determined position. The read pattern of each segmented character is stored in the character memory 3 and subjected to processing such as character recognition.
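The two counters can be sketched as follows. This is a minimal illustration, assuming the line image is held as a list of rows with 1 for black and 0 for white; the function name `count_columns` is hypothetical. Both counts are produced in one pass over each column, in the spirit of the remark that the two counters may be implemented as a single counter.

```python
def count_columns(image):
    """Return (black_pixels, black_runs), each a list with one value per
    vertical scanning line (column) of a binary image (1 = black)."""
    height, width = len(image), len(image[0])
    black_pixels, black_runs = [], []
    for x in range(width):
        pixels = runs = 0
        previous = 0
        for y in range(height):
            value = image[y][x]
            pixels += value
            # A new black run starts at every white-to-black transition.
            if value == 1 and previous == 0:
                runs += 1
            previous = value
        black_pixels.append(pixels)
        black_runs.append(runs)
    return black_pixels, black_runs


image = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 0, 1],
    [1, 1, 1],
]
pixels, runs = count_columns(image)
print(pixels)  # black pixels per column
print(runs)    # black runs per column
```

For this toy image the first column holds three black pixels in two runs, the second one pixel in one run, and the third three pixels in two runs.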

Next, the operation of this embodiment is described in more detail with reference to FIGS. 4 and 5.

(1) The black pixel counter 12 counts the number of black pixels on each scanning line.

(2) The black run counter 13 counts the number of black runs on each scanning line.

(3) From the black pixel counts obtained in (1), the projection block detection unit 14 detects each range of consecutive scanning lines whose black pixel count is not zero, that is, each "projection block".

(4) The sudden change position detection unit 15 detects, as a "sudden change position", each position where the black pixel count obtained in (1) changes by about twice the character line width ("7" in this embodiment) or more.

(5) The area change position detection unit 16 detects, as an "area change position", each position where the black run count obtained in (2) changes.

(6) If no area change position lies within a projection block, that projection block is cut out as one character.

(7) If the projection block obtained in (3) is narrower than a predetermined number ("18" in this embodiment), it is cut out as one character as it is.

(8) If neither condition (6) nor condition (7) is satisfied, the following processing is performed on the "processing range", which is the projection block excluding its left and right quarters.

(i) In a region where the black run count obtained in (2) is 1:

(a) If there is a sudden change position, the region is divided at that position, and, in the part with the smaller black pixel count obtained in (1), any position where the black pixel count is about the line width ("3" in this embodiment) or less is taken as a segmentation position candidate. A in FIG. 4 is such a candidate.

(b) If there is no sudden change position, then in every region where the black run count is 1, any position where the black pixel count is about the line width ("3" in this embodiment) or less is taken as a segmentation position candidate.

(ii) In a region where the black run count obtained in (2) is 2: if both ends of the region are sudden change positions and the black pixel counts on both sides are larger, any position within the region where the black pixel count is about twice the line width ("5" in this embodiment) or less is taken as a segmentation position candidate. B and C in FIG. 5 are such candidates.

(iii) Segmentation position candidates that are 5 or more away from the center O of the processing range are rejected.

(iv) When there are two or more segmentation position candidates, the one nearest the center O of the processing range is taken as the segmentation position. In FIG. 5, of the candidates B and C, C is nearer the center O and becomes the segmentation position.

(9) Among the read patterns cut out as single characters in (7), if the gap between two adjacent read patterns is n or less and their combined width is m or less, the two are cut out together as one character. In this embodiment, n = 5 and m = 18.
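Steps (3) to (7) can be sketched from the per-column counts alone. The code below is an illustrative reading rather than the patented circuit: the function names are hypothetical, a "position" is taken to be the index of the later of two adjacent columns, and the defaults 7 and 18 are the values quoted for this embodiment.

```python
def projection_blocks(black_pixels):
    """Step (3): ranges of consecutive columns with a non-zero pixel count."""
    blocks, start = [], None
    for x, count in enumerate(black_pixels):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            blocks.append((start, x - 1))
            start = None
    if start is not None:
        blocks.append((start, len(black_pixels) - 1))
    return blocks


def sudden_changes(black_pixels, threshold=7):
    """Step (4): columns whose pixel count differs from the previous
    column by at least `threshold` (assumed adjacent-column comparison)."""
    return [x for x in range(1, len(black_pixels))
            if abs(black_pixels[x] - black_pixels[x - 1]) >= threshold]


def area_changes(black_runs):
    """Step (5): columns whose run count differs from the previous column."""
    return [x for x in range(1, len(black_runs))
            if black_runs[x] != black_runs[x - 1]]


def is_single_character(block, area_change_positions, max_width=18):
    """Steps (6) and (7): accept the block as one character when no area
    change lies strictly inside it, or when it is narrower than max_width."""
    start, end = block
    no_change_inside = not any(start < x <= end for x in area_change_positions)
    narrow = (end - start + 1) < max_width
    return no_change_inside or narrow


black_pixels = [0, 2, 9, 9, 2, 0, 3, 3, 0]
black_runs = [0, 1, 1, 1, 1, 0, 1, 2, 0]
blocks = projection_blocks(black_pixels)
changes = area_changes(black_runs)
print(blocks, sudden_changes(black_pixels), changes)
print([is_single_character(b, changes) for b in blocks])
```

The change of run count at the first column of a block (from 0 to a positive value) is treated as the block boundary, not as a change inside the block; that choice is an assumption of this sketch.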

As a result of the above processing, character segmentation is performed at position A for the read pattern shown in FIG. 4 and at position C for the read pattern shown in FIG. 5, so that joined lowercase English letters (printed in the source as "r L, 8 J p r t rj") can also be segmented correctly.
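How a single position is chosen inside a wide projection block can be sketched as below. This covers only part of step (8): the processing range, the candidates of case (i)(b) (run count 1, no sudden change position), the rejection rule (iii), and the nearest-to-center rule (iv). The function names and the sample counts are assumptions for illustration; cases (i)(a) and (ii) would supply further candidates to the same final selection.

```python
def processing_range(block):
    """Step (8): the block without its left and right quarters."""
    start, end = block
    quarter = (end - start + 1) // 4
    return start + quarter, end - quarter


def thin_single_run_candidates(black_pixels, black_runs, lo, hi, line_width=3):
    """Case (i)(b): columns in [lo, hi] with one black run and a pixel
    count no larger than the line width."""
    return [x for x in range(lo, hi + 1)
            if black_runs[x] == 1 and black_pixels[x] <= line_width]


def choose_cut(candidates, lo, hi, max_offset=5):
    """Rules (iii) and (iv): drop candidates max_offset or more from the
    center of the processing range, then take the one nearest the center."""
    center = (lo + hi) / 2
    kept = [x for x in candidates if abs(x - center) < max_offset]
    if not kept:
        return None
    return min(kept, key=lambda x: abs(x - center))


# A 20-column block with thin single-run columns at 9 and 12.
black_pixels = [5] * 9 + [2] + [6] * 2 + [3] + [6] * 7
black_runs = [2] * 9 + [1] + [2] * 2 + [1] + [2] * 7
lo, hi = processing_range((0, 19))
candidates = thin_single_run_candidates(black_pixels, black_runs, lo, hi)
print((lo, hi), candidates, choose_cut(candidates, lo, hi))
```

With these sample counts the processing range is columns 5 to 14, both thin columns survive rule (iii), and the one at column 9 is chosen because it lies nearer the center.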

The numerical values in each of the judgment conditions are determined empirically according to the characters to be segmented, the resolution of the sensor, and so on, and are not limited to the values given above.
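Because these values are empirical, one way to keep them adjustable is to collect them in a single parameter object. The sketch below is an assumption about how that might look: the class name, field names, and the `merge_fragments` helper are hypothetical, the defaults are the values quoted for this embodiment (with the merge limits of step (9) taken as n = 5 and m = 18), and "combined width" is interpreted as the span from the start of the first pattern to the end of the second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegmentationParams:
    sudden_change: int = 7      # step (4): about twice the line width
    single_char_width: int = 18 # step (7): narrower blocks are one character
    line_width: int = 3         # step (8)(i)
    double_line_width: int = 5  # step (8)(ii)
    center_offset: int = 5      # step (8)(iii)
    merge_gap: int = 5          # step (9): n
    merge_width: int = 18       # step (9): m


def merge_fragments(blocks, params):
    """Step (9): join adjacent narrow patterns whose gap and combined
    span are both within the limits. `blocks` is a sorted list of
    (first_column, last_column) pairs already cut out under step (7)."""
    merged = []
    for start, end in blocks:
        if merged:
            prev_start, prev_end = merged[-1]
            gap = start - prev_end - 1
            combined = end - prev_start + 1
            if gap <= params.merge_gap and combined <= params.merge_width:
                merged[-1] = (prev_start, end)
                continue
        merged.append((start, end))
    return merged


params = SegmentationParams()
print(merge_fragments([(0, 5), (8, 12), (30, 40)], params))
```

For a different scanner resolution, a caller would construct `SegmentationParams` with other values instead of editing the detection code.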

[Effect]

As described above, according to the present invention, the desired character segmentation can be performed even for characters whose pitch varies and between which there is no gap.

It is particularly effective for printed characters with variable character pitch, and can be used in OCR, document editing devices, and the like.

[Brief Description of the Drawings]

FIG. 1 is a pattern diagram showing the character segmentation methods of a conventional character segmenting device. FIGS. 2(a), (b), (c), and (d) are pattern diagrams each showing a read pattern that is difficult to segment with the conventional methods. FIG. 3 is a block diagram of a character segmenting device according to an embodiment of the present invention. FIGS. 4 and 5 are explanatory diagrams each showing the character segmentation operation of the device.

1: line segmenting unit; 2: character segmenting device; 3: character memory; 11: image memory; 12: black pixel counter; 13: black run counter; 14: projection block detection unit; 15: sudden change position detection unit; 16: area change position detection unit; 17: segmentation determination unit; 18: segmenting unit.

Applicant's agent: Kiyoshi Inomata

[FIG. 2: panels (a), (b), (c), (d)]
[FIG. 3: block diagram]
[FIG. 4: rows (1) black pixel count, (2) black run count, (3) projection block, (4) sudden change position, (5) area change position, (8)(i) segmentation position candidate A]
[FIG. 5: rows (1) black pixel count, (2) black run count, (3) projection block]

Claims (1)

[Claims]

A character segmenting device that cuts a read pattern, such as characters read from a form or the like, into individual characters, comprising:

a counter that counts the number of black pixels and the number of black runs on each scanning line of the read pattern;

a projection block detection unit that detects projection blocks from the presence or absence of black pixels on each scanning line as counted by the counter;

a sudden change position detection unit that detects sudden change positions, which are scanning lines at which the black pixel count per scanning line counted by the counter changes by a predetermined number or more;

an area change position detection unit that detects area change positions, which are scanning lines at which the black run count per scanning line counted by the counter changes; and

a segmentation determination unit that determines a segmentation position according to predetermined judgment conditions, based on the projection blocks detected by the projection block detection unit, the sudden change positions detected by the sudden change position detection unit, and the area change positions detected by the area change position detection unit,

wherein the read pattern is segmented at the segmentation position determined by the segmentation determination unit.
JP57142364A 1982-08-17 1982-08-17 Character segmenting device Pending JPS5932077A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP57142364A JPS5932077A (en) 1982-08-17 1982-08-17 Character segmenting device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP57142364A JPS5932077A (en) 1982-08-17 1982-08-17 Character segmenting device

Publications (1)

Publication Number Publication Date
JPS5932077A true JPS5932077A (en) 1984-02-21

Family

ID=15313661

Family Applications (1)

Application Number Title Priority Date Filing Date
JP57142364A Pending JPS5932077A (en) 1982-08-17 1982-08-17 Character segmenting device

Country Status (1)

Country Link
JP (1) JPS5932077A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS63163684A (en) * 1986-12-26 1988-07-07 Toshiba Corp Character pattern segmentation device

Similar Documents

Publication Publication Date Title
JPS6077279A (en) Initiation of character image
EP0248262B1 (en) Apparatus and method for detecting character components on a printed document
EP0750415B1 (en) Image processing method and apparatus
JPH04195692A (en) Document reader
EP0375352A1 (en) Method of searching a matrix of binary data
EP0062665A1 (en) Segmentation system and method for optical character scanning
JPS5932077A (en) Character segmenting device
JPH0430070B2 (en)
JPH0410087A (en) Base line extracting method
JPS6325391B2 (en)
JPH0373916B2 (en)
JPH02273884A (en) Detecting and correcting method for distortion of document image
JPH0564396B2 (en)
JPS6227887A (en) Character type separating system
JPH04267494A (en) Character segmenting method and character recognizing device
JPH09106438A (en) Method and apparatus for detection of width in equiwidth font
JPH04339471A (en) Device for identifying image area
JPS63101983A (en) Character string extracting system
JPH07120392B2 (en) Character pattern cutting device
JP2851102B2 (en) Character extraction method
JPH03240184A (en) Attribute decision device
JP2508195B2 (en) Character line extraction device
JPH04343192A (en) Character segmenting method of character recognizing device
JPH0459670B2 (en)
Ragupathi A fast and robust approach for document segmentation and classification