JPS6337490A - Character recognizing device - Google Patents

Character recognizing device

Info

Publication number
JPS6337490A
Authority
JP
Japan
Prior art keywords
character
section
recognition
area
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
JP61182163A
Other languages
Japanese (ja)
Inventor
Masahiro Shimizu
正博 清水
Mariko Takenouchi
磨理子 竹之内
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panasonic Holdings Corp
Original Assignee
Matsushita Electric Industrial Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Matsushita Electric Industrial Co Ltd
Priority to JP61182163A
Publication of JPS6337490A
Legal status: Pending

Landscapes

  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

PURPOSE: To recognize characters correctly even on documents that include characters of similar shape, by specifying the areas of an input picture to be recognized and designating the type of character in each area. CONSTITUTION: An input picture is first binarized and stored in a picture memory part 2. A recognized area specification part 3 specifies each area of the input picture to be recognized as a rectangle, shown by oblique lines. A character type specification part 4 designates the type of character in each area specified by the part 3 as follows: A, alphanumeric; B, kanji (Chinese characters)/kana (Japanese syllabary)/alphanumeric; C to E, numeric; F, kanji/kana/alphanumeric; G, numeric. A character segment part 5 reads the picture data for each specified area out of the memory part 2 and segments it one character at a time as a character pattern in a rectangle R. A feature extraction part 6 sets direction codes for the character pattern, checks the connectivity of picture elements for each code, extracts strokes, and then extracts feature amounts fj such as the number, position, and length of the strokes. A classification part 7 obtains the distance between fj and the standard feature amounts of each character in a recognition dictionary 8, takes the character that has the minimum distance among characters of the specified type as the first candidate character, and displays it.

Description

DETAILED DESCRIPTION OF THE INVENTION

Field of Industrial Application
The present invention relates to a character recognition device that recognizes printed and handwritten characters in newspapers, magazines, and the like, and converts them into information such as JIS codes.

Prior Art
A conventional character recognition device calculates the distance between the feature quantities obtained from a character to be recognized and the feature quantities stored in advance for each character in a recognition dictionary, and takes the dictionary character with the smallest distance as the recognition candidate character (see, for example, Takenouchi et al., "Handwritten character recognition using contour direction density and background density," Institute of Electronics and Communication Engineers of Japan, National Convention, fiscal 1985 (Showa 60), 157).

Problems to Be Solved by the Invention
In practice, however, a given area of a standard document such as a form may contain only numerals, for example. Even so, the conventional device computes the distance between the feature quantities of the target character and those of every character stored in the recognition dictionary, and takes the character with the shortest distance as the recognition candidate character. Consequently, when the character in the recognition area looks like "○", the device cannot distinguish the similarly shaped letter "O" from the numeral "0", and a recognition error results. The present invention is intended to solve this problem, and its object is to provide a character recognition device that recognizes characters more accurately even in documents containing characters of similar shape.

Means for Solving the Problems
To solve the above problem, the present invention specifies a recognition target area in the image to be recognized and specifies the character type of that area. When a recognition candidate character is determined from the distance between the feature quantities in the dictionary and the feature quantities of the character pattern to be recognized, only characters belonging to the character type of the recognition target area are accepted as recognition candidate characters.

Operation
By the above technical means, the present invention can accurately recognize even similar characters such as the letter "O" and the numeral "0".

Embodiment
An embodiment of the present invention is described below with reference to the drawings.

FIG. 1 is a block diagram of one embodiment of a character recognition device according to the present invention. Reference numeral 1 denotes an image input section, which scans an image containing the character patterns to be recognized, inputs the image as a binary signal, and stores it in an image memory section 2. Reference numeral 3 denotes a recognition area specifying section, which specifies the areas of the input image that are to be recognized. Reference numeral 4 denotes a character type specifying section, which specifies the character type of each recognition area specified by the recognition area specifying section 3. Reference numeral 5 denotes a character segmentation section, which cuts characters out of the recognition area as rectangles. Reference numeral 6 denotes a feature extraction section, which obtains feature quantities, such as strokes, of the characters cut out by the character segmentation section 5. Reference numeral 7 denotes a classification section, which obtains the distance between the feature quantities of the character pattern obtained by the feature extraction section 6 and the feature quantities of each character in a recognition dictionary 8, and takes as the recognition candidate character the character that has the smallest distance and belongs to the character type specified by the character type specifying section 4. Reference numeral 9 denotes a display section, which displays the recognition result obtained by the classification section 7.

The operation of the character recognition device configured as described above is explained using the input image shown in FIG. 2 as an example.

An image such as that shown in FIG. 2, input from the image input section 1, is binarized and stored in the image memory section 2. The recognition area specifying section 3 specifies the areas of the stored input image that are to be recognized, as rectangles such as those shown in FIG. 3. The character type specifying section 4 specifies the character type of each recognition target area specified by the recognition area specifying section 3, as shown in FIG. 4. The character segmentation section 5 reads the image data corresponding to each specified recognition target area from the image memory section 2 and cuts it out one character at a time, as a character pattern to be recognized, using a rectangle R such as that shown in FIG. 5(a).
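
The binarization, area and character-type designation, and rectangular segmentation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Region` record, the fixed threshold of 128, and the blank-column splitting rule in `segment_characters` are all hypothetical choices, since the text says only that the image is binarized and that each character is cut out with a rectangle R.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical record: one rectangular recognition area and its character types."""
    name: str
    top: int
    left: int
    bottom: int            # exclusive
    right: int             # exclusive
    char_types: frozenset  # e.g. frozenset({"numeric"})


def binarize(gray, threshold=128):
    """Map grayscale rows (0-255) to 1 = ink, 0 = background (threshold is assumed)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def crop(binary, region):
    """Read the image data of one recognition area out of the stored image."""
    return [row[region.left:region.right] for row in binary[region.top:region.bottom]]


def segment_characters(area):
    """Split an area into per-character rectangles (top, left, bottom, right).

    Assumption: characters are separated by at least one blank column.
    """
    if not area:
        return []
    width = len(area[0])
    has_ink = [any(row[x] for row in area) for x in range(width)]
    boxes, start = [], None
    for x, ink in enumerate(has_ink + [False]):  # sentinel closes the last run
        if ink and start is None:
            start = x
        elif not ink and start is not None:
            rows = [y for y, row in enumerate(area) if any(row[start:x])]
            boxes.append((rows[0], start, rows[-1] + 1, x))
            start = None
    return boxes


gray = [
    [255, 0, 255, 255, 0, 0, 255],
    [255, 0, 255, 255, 0, 0, 255],
    [255, 255, 255, 255, 255, 255, 255],
]
binary = binarize(gray)
region_c = Region("C", 0, 0, 3, 7, frozenset({"numeric"}))
boxes = segment_characters(crop(binary, region_c))
```

Keeping the designated character types on the area record means every character cut from that area inherits the restriction that the classification step later applies.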

For a character pattern P to be recognized, such as that shown in FIG. 6, obtained by the character segmentation section, the feature extraction section 6 checks whether M or more pixels, including the pixel of interest, are connected in each direction indicated by the arrows in FIG. 6, and sets a direction code accordingly. It then examines the connectivity of the pixels for each direction code to extract strokes, and extracts feature quantities fj such as the number, position, and length of the strokes. FIG. 6 shows the result of stroke extraction for the character pattern "O".
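
One way to read the direction-code and stroke-extraction step is sketched below in Python. The details are assumptions made for illustration: the four directions (horizontal, vertical, and two diagonals), the 8-connected grouping of pixels that share a direction code, and the use of per-direction stroke counts as the feature vector are not stated in the text, which refers only to the directions drawn in the figure and to features such as the number, position, and length of strokes.

```python
# Assumed direction set: horizontal, vertical, and the two diagonals.
DIRECTIONS = {"H": (0, 1), "V": (1, 0), "D1": (1, 1), "D2": (1, -1)}


def run_length(img, y, x, dy, dx):
    """Length of the unbroken ink run through (y, x) along the line +/-(dy, dx)."""
    h, w = len(img), len(img[0])
    n = 1
    for sign in (1, -1):
        cy, cx = y + sign * dy, x + sign * dx
        while 0 <= cy < h and 0 <= cx < w and img[cy][cx]:
            n += 1
            cy += sign * dy
            cx += sign * dx
    return n


def direction_codes(img, m):
    """For each ink pixel, the set of directions with a run of at least m pixels."""
    codes = {}
    for y, row in enumerate(img):
        for x, px in enumerate(row):
            if px:
                codes[(y, x)] = {
                    d for d, (dy, dx) in DIRECTIONS.items()
                    if run_length(img, y, x, dy, dx) >= m
                }
    return codes


def extract_strokes(codes, direction):
    """Group pixels carrying `direction` into 8-connected components (strokes)."""
    pending = {p for p, ds in codes.items() if direction in ds}
    strokes = []
    while pending:
        seed = pending.pop()
        stack, stroke = [seed], [seed]
        while stack:
            y, x = stack.pop()
            for ny in (y - 1, y, y + 1):
                for nx in (x - 1, x, x + 1):
                    if (ny, nx) in pending:
                        pending.remove((ny, nx))
                        stack.append((ny, nx))
                        stroke.append((ny, nx))
        strokes.append(sorted(stroke))
    return sorted(strokes)


def stroke_counts(img, m):
    """Illustrative feature vector: number of strokes per direction (H, V, D1, D2)."""
    codes = direction_codes(img, m)
    return [len(extract_strokes(codes, d)) for d in DIRECTIONS]


box = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
codes = direction_codes(box, 3)
features = stroke_counts(box, 3)
```

For the square outline above with M = 3, the sketch finds two horizontal and two vertical strokes and no diagonal ones. A fuller feature vector would also encode each stroke's position and length.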

The classification section 7 obtains the distance Dk between the feature quantities fj of the character pattern to be recognized, obtained by the feature extraction section 6, and the standard feature quantities qkj of each character k in the recognition dictionary 8, according to

Dk = Σj |fj − qkj|

The character that has the smallest Dk and also belongs to the character type specified by the character type specifying section 4 is taken as the first candidate character.
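
The type-restricted minimum-distance rule can be sketched in Python as follows. The dictionary entries, feature values, and type labels are hypothetical and chosen only to make the example small. The sketch filters by character type before ranking, which yields the same first candidate as computing every distance and then discarding characters of other types.

```python
def city_block(f, q):
    """Dk = sum over j of |fj - qkj|."""
    return sum(abs(fj - qj) for fj, qj in zip(f, q))


def classify(features, dictionary, allowed_types):
    """Rank dictionary characters of the allowed types by ascending distance.

    dictionary: list of (character, character_type, standard_features).
    Returns a list of (character, distance); the first entry is the first candidate.
    """
    ranked = sorted(
        (
            (city_block(features, q), ch)
            for ch, ctype, q in dictionary
            if ctype in allowed_types
        ),
        key=lambda item: item[0],
    )
    return [(ch, d) for d, ch in ranked]


# Hypothetical dictionary: the feature values are invented for illustration.
DICTIONARY = [
    ("O", "alpha", [2, 2, 0, 0]),
    ("0", "numeric", [2, 2, 1, 1]),
    ("1", "numeric", [0, 1, 0, 0]),
]

observed = [2, 2, 0, 0]
unrestricted = classify(observed, DICTIONARY, {"alpha", "numeric"})
numeric_only = classify(observed, DICTIONARY, {"numeric"})
```

With these invented values the unrestricted ranking puts the letter "O" first, while restricting the area to numerals makes the digit "0" the first candidate.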

The display section 9 displays the candidate characters obtained by the classification section 7.

For example, suppose that the character type of the recognition target area is designated as numeric, as in areas C, D, E, or G of FIG. 4, and that for the character pattern "0" of FIG. 5 the first candidate of the recognition result is the letter "O" and the second candidate is the numeral "0". The designated character type then readily settles the recognition result as the numeral "0".

Effects of the Invention
According to the present invention, when a character pattern is recognized, even similar characters such as the letter "O" and the numeral "0" can be recognized with high probability, which improves the accuracy of character recognition.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a character recognition device according to one embodiment of the present invention; FIG. 2 is an explanatory diagram showing an example of an input image; FIG. 3 is an explanatory diagram showing recognition target areas specified on the input image; FIG. 4 is an explanatory diagram showing the character types specified for the recognition target areas; FIG. 5 is an explanatory diagram showing an extracted character pattern to be recognized; and FIG. 6 is an explanatory diagram showing extracted strokes.

1: image input section; 2: image memory section; 3: recognition area specifying section; 4: character type specifying section; 5: character segmentation section; 6: feature extraction section; 7: classification section; 8: recognition dictionary section; 9: display section.

Name of agent: Patent attorney Toshio Nakao and one other

[Drawings not reproduced. Legible labels: areas A to G with character types A, alphanumeric; B, kanji/kana/alphanumeric; C, D, E, numeric; F, kanji/kana/alphanumeric; G, numeric. FIG. 5(a): rectangle R.]

Claims (1)

A character recognition device comprising: an image input section that inputs an image containing characters to be recognized; a recognition area specifying section that specifies, in the image input by the image input section, an area to be recognized; a character type specifying section that specifies the character type in the recognition area; a character segmentation section that cuts out, as rectangles, the character patterns in the area specified by the recognition area specifying section; a feature extraction section that obtains the character features of the character pattern to be recognized obtained by the character segmentation section; a recognition dictionary section that stores the feature quantities of each character in advance; and a classification section that obtains the distance between the feature quantities of the character pattern obtained by the feature extraction section and the feature quantities of each character stored in advance in the recognition dictionary, and takes as the recognition candidate character the character with the smallest distance among the characters of the type specified by the character type specifying section.
JP61182163A 1986-08-01 1986-08-01 Character recognizing device Pending JPS6337490A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP61182163A JPS6337490A (en) 1986-08-01 1986-08-01 Character recognizing device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP61182163A JPS6337490A (en) 1986-08-01 1986-08-01 Character recognizing device

Publications (1)

Publication Number Publication Date
JPS6337490A true JPS6337490A (en) 1988-02-18

Family

ID=16113452

Family Applications (1)

Application Number Title Priority Date Filing Date
JP61182163A Pending JPS6337490A (en) 1986-08-01 1986-08-01 Character recognizing device

Country Status (1)

Country Link
JP (1) JPS6337490A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0373084A (en) * 1989-04-28 1991-03-28 Hitachi Ltd Character recognizing device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58115422U (en) * 1982-01-30 1983-08-06 いすゞ自動車株式会社 Full plate for vehicle fuel tank
JPS61226335A (en) * 1985-03-30 1986-10-08 Nissan Shatai Co Ltd Liquid container for vehicle

Similar Documents

Publication Publication Date Title
JP2713622B2 (en) Tabular document reader
US7437001B2 (en) Method and device for recognition of a handwritten pattern
JP3452774B2 (en) Character recognition method
JPH0420226B2 (en)
JPH1011531A (en) Slip reader
JPS63182793A (en) Character segmenting system
JP2000315247A (en) Character recognizing device
JPH09319824A (en) Document recognizing method
JPS6337490A (en) Character recognizing device
JPH05225399A (en) Document processor
JPS6330991A (en) Character recognizing device
JP2537973B2 (en) Character recognition device
JP2993533B2 (en) Information processing device and character recognition device
JPS63137384A (en) Character recognizing device
JP3151866B2 (en) English character recognition method
JPH07107700B2 (en) Character recognition device
JP2962911B2 (en) Character recognition device
JP2918363B2 (en) Character classification method and character recognition device
JPS62251888A (en) Character recognizing device
JP2931485B2 (en) Character extraction device and method
JPH11134439A (en) Method for recognizing word
JPH0584552B2 (en)
JPH04280393A (en) Character/graphic recognizing device
JPS63221495A (en) Character recognizing device
JPS63239569A (en) Character recognition device