JPS62192887A - Feature quantity generating method in character recognizing device - Google Patents

Feature quantity generating method in character recognizing device

Info

Publication number
JPS62192887A
Authority
JP
Japan
Prior art keywords
histogram
input pattern
blocks
instruction data
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP61036056A
Other languages
Japanese (ja)
Inventor
Masahiro Nakamura
昌弘 中村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ricoh Co Ltd
Original Assignee
Ricoh Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ricoh Co Ltd
Priority to JP61036056A
Publication of JPS62192887A
Current legal status: Pending

Abstract

PURPOSE: To reduce the number of feature quantities and improve both the recognition rate and the recognition speed by unifying some of the divided blocks of an input pattern when generating histograms, and by allowing the blocks to be unified to be designated. CONSTITUTION: An operator inputs unifying instruction data to a host computer 51 using a keyboard 54. The host computer 51 writes the unifying instruction data into a memory 53. An OCR processor 52 reads an original 55, performs direction coding of the outline and area division on the input pattern, and then prepares the histograms. When preparing the histograms, the OCR processor 52 reads the unifying instruction data from the memory 53, determines the blocks to be unified, and unifies the histograms of the instructed blocks. The OCR processor 52 then compares the obtained histograms with the dictionary and determines candidate characters.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field]

The present invention relates to a method for generating feature quantities in a character recognition device such as an OCR device.

[Prior Art]

One character recognition method used in OCR and similar systems attaches direction codes to the contour of an input pattern, divides the pattern into a plurality of blocks, takes a histogram of the direction codes for each block, and uses these histograms as feature quantities for character recognition. For example, if the input pattern is divided into 4 × 4 blocks and eight direction codes are used, a feature of 4 × 4 × 8 = 128 dimensions is extracted.
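
As a rough illustration of the 4 × 4 × 8 = 128-dimensional feature described above, the following Python sketch counts direction codes per block. The grid encoding (0 for non-contour pixels, 1 to 8 for direction codes), the fixed block edges, and the function names `block_histograms` and `flatten` are assumptions made for this example and are not taken from the patent.

```python
def block_histograms(codes, x_edges, y_edges, n_dirs=8):
    """Count direction codes per block.

    codes: 2-D list; 0 = non-contour pixel, 1..n_dirs = direction code
           (assumed encoding, not specified in the patent).
    x_edges, y_edges: block boundaries, e.g. [0, 4, 8, 12, 16].
    Returns hist[row][col][direction - 1].
    """
    rows, cols = len(y_edges) - 1, len(x_edges) - 1
    hist = [[[0] * n_dirs for _ in range(cols)] for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for y in range(y_edges[r], y_edges[r + 1]):
                for x in range(x_edges[c], x_edges[c + 1]):
                    code = codes[y][x]
                    if 1 <= code <= n_dirs:
                        hist[r][c][code - 1] += 1
    return hist


def flatten(hist):
    """Concatenate all block histograms into one feature vector."""
    return [v for row in hist for block in row for v in block]


# Toy 16 x 16 pattern with three contour pixels.
codes = [[0] * 16 for _ in range(16)]
codes[2][3] = 1
codes[2][4] = 1
codes[10][15] = 8
edges = [0, 4, 8, 12, 16]
features = flatten(block_histograms(codes, edges, edges))
print(len(features))  # 4 * 4 * 8 = 128
```

The uniform edges are used here only to keep the example short.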

Some of these feature quantities discriminate characters well, while others discriminate poorly. Conventionally, however, all of them have been treated alike in the distance calculation, which has been one cause of reduced recognition rate and recognition speed.

[Purpose]

An object of the present invention is to improve the recognition rate and recognition speed of a character recognition device that divides an input pattern into a plurality of blocks, takes a histogram of the direction codes for each block, and performs character recognition using these histograms as feature quantities.

[Constitution]

The present invention generates histograms by integrating some of the divided blocks, and lets the operator specify which blocks are integrated. This reduces the number of feature quantities and thereby improves the recognition rate and recognition speed. An embodiment of the present invention is described below with reference to the drawings.

First, region division of the input pattern is described with reference to FIG. 2. Direction codes are first attached to the contour of the input pattern (step 21).

Next, the direction codes attached to the contour of the input pattern are counted to obtain their total number (step 22). The division coordinates in the X and Y directions are then determined from this total. For example, suppose the region is to be divided into n × m blocks, and let the total number of direction codes be [...]. The input pattern is then scanned in the X direction, and the X coordinates at which the number of direction codes reaches each division point are obtained (step 25). Similarly, the division points in the Y direction are [...] (step 27). The input pattern is then scanned in the Y direction, and the Y coordinates at which the number of direction codes reaches each division point are obtained (step 28).
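
The scan for division coordinates can be sketched in Python as follows. The sketch assumes that the division points are the multiples k·N/n of the total count N, so that each strip holds roughly the same number of direction codes. This expression is an assumption for illustration, not taken from the text, and the function name `division_coordinates` and the uniform fallback for an empty pattern are likewise hypothetical.

```python
def division_coordinates(counts_per_line, n_parts):
    """Return the coordinates that split a pattern into n_parts strips.

    counts_per_line[i] is the number of direction codes found in column i
    (for the X direction) or row i (for the Y direction).
    Assumption: the k-th division point is reached when the running count
    first reaches k * total / n_parts.
    """
    total = sum(counts_per_line)
    if total == 0:
        # No contour at all: fall back to a uniform split (assumed behaviour).
        size = len(counts_per_line)
        return [size * k // n_parts for k in range(1, n_parts)]

    boundaries = []
    cumulative = 0
    k = 1
    for coord, count in enumerate(counts_per_line):
        cumulative += count
        # One dense line may cross several division points at once.
        while k < n_parts and cumulative * n_parts >= k * total:
            boundaries.append(coord + 1)  # the strip ends after this line
            k += 1
    return boundaries


# 16 direction codes spread over 8 columns, split into 4 strips.
print(division_coordinates([0, 4, 4, 0, 0, 4, 4, 0], 4))
```

Running the same function on per-row counts gives the Y-direction coordinates.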

FIG. 1 is a flowchart for explaining feature quantity generation according to the present invention.

First, a histogram of the direction codes is created for each divided block (step 11). Next, the integration instruction data from the host processor is read (step 12), and the histograms of the corresponding blocks are integrated with reference to this data (step 13). FIG. 3 shows an example in which the region is divided into 4 × 4 blocks, and FIG. 4 shows an example of the blocks to be integrated and the integration instruction data for that case.
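
Steps 11 to 13 can be sketched in Python as below. The actual format of the integration instruction data is given only in FIG. 4, which is not reproduced here, so the sketch assumes a hypothetical format: a list of groups, each listing the indices of the blocks whose histograms are summed. The function name `integrate_histograms` is also an assumption.

```python
def integrate_histograms(block_hists, groups):
    """Sum the histograms of the blocks listed in each group.

    block_hists: one histogram (list of counts per direction code) per block.
    groups: hypothetical integration instruction data; each entry is a list
            of block indices to merge. A single-element group leaves that
            block unchanged.
    """
    merged = []
    for group in groups:
        summed = [0] * len(block_hists[group[0]])
        for idx in group:
            for d, value in enumerate(block_hists[idx]):
                summed[d] += value
        merged.append(summed)
    return merged


# 4 x 4 = 16 blocks, 8 direction codes each (toy values).
block_hists = [[i] * 8 for i in range(16)]
# Merge the four top-left blocks (indices 0, 1, 4, 5); keep the rest as is.
groups = [[0, 1, 4, 5]] + [[i] for i in range(16) if i not in (0, 1, 4, 5)]
merged = integrate_histograms(block_hists, groups)
print(len(merged) * 8)  # 13 groups * 8 codes = 104 dimensions instead of 128
```

In this toy case, merging four blocks into one removes 24 of the 128 dimensions, which is the kind of reduction the method relies on.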

FIG. 5 is a schematic block diagram of a hardware configuration that implements the method of the present invention. In FIG. 5, a host computer 51 and an OCR processor 52 share a memory 53. The operator uses a keyboard 54 to input integration instruction data in the format shown in FIG. 4 to the host computer 51, and the host computer 51 writes this data into the memory 53. Meanwhile, the OCR processor 52 reads an original document 55, attaches direction codes to the contour of the input pattern and divides it into regions according to the flow of FIG. 2, and then creates histograms according to the flow of FIG. 1. When creating the histograms, the OCR processor 52 reads the integration instruction data from the memory 53, determines the blocks to be integrated, and integrates the histograms of the specified blocks. The OCR processor 52 then compares the resulting histograms, as the feature quantities of the input pattern, with a dictionary prepared in advance, and determines candidate characters from the resulting distances.
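
The final dictionary comparison can be sketched in Python as follows. The text does not say which distance is used, so the city-block (L1) distance here is an assumption, as are the dictionary layout (character mapped to a reference feature vector) and the names `city_block` and `rank_candidates`.

```python
def city_block(a, b):
    """City-block (L1) distance; the metric itself is an assumption."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_candidates(features, dictionary, top_k=3):
    """Return the top_k dictionary characters closest to the input features.

    dictionary: hypothetical mapping {character: reference feature vector},
    built with the same block integration as the input features.
    """
    scored = sorted((city_block(features, ref), ch)
                    for ch, ref in dictionary.items())
    return [ch for _, ch in scored[:top_k]]


# Toy 4-dimensional features instead of the full integrated histogram.
dictionary = {"A": [3, 0, 1, 2], "B": [0, 3, 2, 1], "C": [1, 1, 1, 1]}
print(rank_candidates([3, 0, 1, 1], dictionary))
```

Because the cost of each comparison grows with the vector length, shorter integrated features reduce the work per dictionary entry.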

The blocks to be integrated may be decided when the dictionary is created, by using a technique such as multivariate analysis to evaluate the discriminating ability of each feature. The operator can also vary the blocks to be integrated for each character type or font.
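
The text names only "multivariate analysis or the like" and does not specify a procedure. As one possible illustration, the Python sketch below scores each feature by the ratio of between-class variance to within-class variance, so that low-scoring features become candidates for integration. This particular criterion, the training data layout, and the function name `discrimination_scores` are all assumptions.

```python
from statistics import mean, pvariance


def discrimination_scores(samples_by_class):
    """Score each feature by between-class / within-class variance.

    samples_by_class: hypothetical training data,
                      {character: [feature_vector, ...]}.
    A low score suggests the feature separates the classes poorly.
    """
    class_samples = list(samples_by_class.values())
    n_features = len(class_samples[0][0])
    scores = []
    for f in range(n_features):
        class_means = [mean(v[f] for v in vecs) for vecs in class_samples]
        within = mean(pvariance([v[f] for v in vecs]) for vecs in class_samples)
        between = pvariance(class_means)
        if within > 0:
            scores.append(between / within)
        else:
            # No spread inside any class: perfect or useless separation.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


samples = {
    "A": [[5, 2], [7, 3]],
    "B": [[1, 2], [3, 3]],
}
print(discrimination_scores(samples))  # feature 0 separates A and B; feature 1 does not
```

Blocks whose features all score low would be natural candidates for the operator to list in the integration instruction data.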

[Effects]

According to the present invention, when an input pattern with direction codes attached to its contour is divided into a plurality of blocks, a histogram of the direction codes is taken for each block, and character recognition is performed using these histograms as feature quantities, some of the blocks are integrated when the feature quantities are created. This reduces the number of feature quantities, so an improvement in recognition speed can be expected. In addition, because the feature quantities that are integrated are those with low discriminating ability in the dictionary, an improvement in the recognition rate can be expected, and the dictionary capacity can be reduced.

[Brief Description of the Drawings]

FIG. 1 is a flowchart for explaining the method of the present invention, FIG. 2 is a flowchart for explaining region division of an input pattern, FIG. 3 is a diagram showing an example of region division, FIG. 4 is a diagram showing an example of integration instruction data and integrated blocks, and FIG. 5 is a block diagram of a hardware configuration for realizing the present invention.

51: host computer, 52: OCR processor, 53: memory, 54: keyboard, 55: original document.

Claims (2)

[Claims]

(1) A method for generating feature quantities in a character recognition device that divides an input pattern having direction codes attached to its contour into a plurality of blocks, takes a histogram of the direction codes for each block, and performs character recognition using the histograms as feature quantities, the method being characterized in that histograms are generated by integrating some of the divided blocks.

(2) The method for generating feature quantities in a character recognition device according to claim 1, characterized in that the operator arbitrarily specifies the blocks to be integrated.
JP61036056A 1986-02-20 1986-02-20 Feature quantity generating method in character recognizing device Pending JPS62192887A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP61036056A JPS62192887A (en) 1986-02-20 1986-02-20 Feature quantity generating method in character recognizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP61036056A JPS62192887A (en) 1986-02-20 1986-02-20 Feature quantity generating method in character recognizing device

Publications (1)

Publication Number Publication Date
JPS62192887A true JPS62192887A (en) 1987-08-24

Family

ID=12459058

Family Applications (1)

Application Number Title Priority Date Filing Date
JP61036056A Pending JPS62192887A (en) 1986-02-20 1986-02-20 Feature quantity generating method in character recognizing device

Country Status (1)

Country Link
JP (1) JPS62192887A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5467411A (en) * 1991-09-26 1995-11-14 Mitsubishi Denki Kabushiki Kaisha System with approximation mechanism for recognizing graphical elements in a drawing

Similar Documents

Publication Publication Date Title
US6903751B2 (en) System and method for editing electronic images
GB2190778A (en) Character recognition with variable subdivision of a character region
JPH08161421A (en) Device and method for extracting character string area
JP3437347B2 (en) Character recognition apparatus and method and computer
JP2006227824A (en) Drawing recognition method and device
JPS62192887A (en) Feature quantity generating method in character recognizing device
JPH0812668B2 (en) Handwriting proofreading method
JPS62192886A (en) Feature quantity generating method in character recognizing device
JPH08320914A (en) Table recognition method and device
JP2582611B2 (en) How to create a multi-font dictionary
JP2650903B2 (en) Standard pattern storage method and device in character recognition device
JPS6327991A (en) Formation of histogram for input information recognizing device
JPH10222612A (en) Document recognizing device
JP3138546B2 (en) How to create user characters
JPH06348909A (en) Device and method for recognizing handwriting input
JP2840281B2 (en) Character processing apparatus and method
JP2001266070A (en) Device and method for recognizing character and storage medium
JP2954218B2 (en) Image processing method and apparatus
RU2166209C2 (en) Method for building dynamic raster standards of character-expressed computer codes in recognition of respective subpictures
JP2978548B2 (en) Character reader
JP2663550B2 (en) Feature extraction method
JPS63121991A (en) Dictionary forming method for character recognizing device
JPH01126766A (en) Collective erasing system for character or ruled line
JPH09251291A (en) Formation of vector font
JPH03269689A (en) Document reading device