JPS603071A - Character recognition system - Google Patents
Character recognition system
Info
- Publication number
- JPS603071A
- Authority
- JP
- Japan
- Prior art keywords
- character
- pattern
- subpattern
- scanning
- extracted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Character Discrimination (AREA)
Abstract
Description
DETAILED DESCRIPTION OF THE INVENTION
(Technical Field)
The present invention relates to a fast and accurate feature extraction method.
(Background Art)
Many conventional character and figure recognition devices extract strokes from the character/figure pattern and recognize the pattern from the position and length of the extracted strokes, the mutual relationships between the strokes, and similar properties.
In devices of this kind, strokes are extracted in one of two ways: (1) the curvature is computed along the sequence of contour points obtained by tracing the contour of the character figure, the contour sequence is split at points of large curvature, and the resulting segments are combined into strokes; or (2) the character/figure pattern is thinned into a skeleton, the skeleton pattern and its connectivity are traced, and strokes are extracted by detecting points of abrupt angular change. Geometric features are then extracted from the strokes and used to recognize the character figure. However, with method (1) the amount of processing grows as the character/figure pattern becomes larger or more complex, which lowers the processing speed. Method (2) requires the pattern to be thinned, and the thinning introduces pattern distortion, whiskers, and similar problems, which makes the subsequent processing complicated.
(Object of the Invention)
The object of the present invention is to remedy these drawbacks. The invention extracts sub-patterns that represent the stroke components of a character/figure pattern in desired directions, scans each sub-pattern from points on the sides of the character circumscribing frame to detect the positions of all character lines on each scanning line, normalizes the distance between the scan start point and each character line by the size of the circumscribing frame in the scanning direction, and computes the sum of the N-th powers of these values. The resulting per-side arrays of N-th-power sums are then divided to extract a group of feature vectors. The aim is to provide a fast and stable character recognition device.
(Configuration and Operation of the Invention)
Fig. 1 shows an embodiment of the character recognition device of the present invention, Fig. 2 shows examples of sub-patterns, and Fig. 3 shows an example of feature extraction.
In Fig. 1, 1 is a photoelectric conversion unit, 2 is a pattern register, 3 is a line width calculation unit, 4 is a sub-pattern extraction unit, 5 is a character frame detection unit, 6 is a feature extraction unit, 7 is a feature matrix extraction unit, 8 is an identification unit, and 9 is the character name output.
This embodiment operates as follows. Characters on a form set in the reading mechanism are converted by the photoelectric conversion unit 1 into a binary quantized digital electrical signal and stored in the pattern register 2. At the same time, the line width calculation unit 3 calculates the line width (W) of the input pattern. The sub-pattern extraction unit 4 scans the whole pattern register vertically and extracts the vertical sub-pattern (VSP) from the relationship between the run length of consecutive black bits and the line width calculated by the line width calculation unit 3. Similarly, it extracts the horizontal sub-pattern (HSP) by horizontal scanning, the right-diagonal sub-pattern (RSP) by scanning at 45° diagonally to the right, and the left-diagonal sub-pattern (LSP) by scanning at 45° diagonally to the left.
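The sub-pattern extraction step can be illustrated with a minimal Python sketch. It assumes the pattern is a list of rows of 0/1 values and that a run counts as "sufficiently long" when it exceeds `factor` times the line width. The function names and the value `factor=2.0` are hypothetical, since the text does not state the exact threshold, and the two 45° scans are omitted for brevity.

```python
def vertical_subpattern(pattern, line_width, factor=2.0):
    """Keep only black pixels that lie in vertical runs longer than
    factor * line_width (factor is an assumed threshold, not from the patent)."""
    rows, cols = len(pattern), len(pattern[0])
    out = [[0] * cols for _ in range(rows)]
    for x in range(cols):
        y = 0
        while y < rows:
            if pattern[y][x] == 0:
                y += 1
                continue
            start = y
            # Measure the run of consecutive black pixels in this column.
            while y < rows and pattern[y][x] == 1:
                y += 1
            if y - start > factor * line_width:
                for yy in range(start, y):
                    out[yy][x] = 1
    return out


def horizontal_subpattern(pattern, line_width, factor=2.0):
    """Same rule applied to horizontal runs, implemented by transposing."""
    transposed = [list(col) for col in zip(*pattern)]
    result = vertical_subpattern(transposed, line_width, factor)
    return [list(col) for col in zip(*result)]
```

For a plus-shaped pattern with one-pixel strokes, the vertical sub-pattern keeps only the vertical bar and the horizontal sub-pattern keeps only the horizontal bar.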
Fig. 2 shows an example of an original pattern and its sub-patterns: (a) is the original pattern, (b) is the vertical sub-pattern (VSP), (c) is the horizontal sub-pattern (HSP), (d) is the right-diagonal sub-pattern (RSP), and (e) is the left-diagonal sub-pattern (LSP). The character frame detection unit 5 detects the rectangular frame that circumscribes the character/figure pattern in the pattern register (hereinafter called the character frame) and sends the position coordinates that define this frame, in the two-dimensional plane defined by the pattern register, to the feature extraction unit 6.
The description below uses a coordinate system whose origin is the lower-left corner of the character frame, with the X axis horizontal and the Y axis vertical.
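As an illustration of character frame detection, the following Python sketch finds the bounding box of the black pixels and crops the pattern to it. The function names are hypothetical. It assumes the input rows are stored top to bottom, so the cropped rows are reversed to put row 0 at the bottom side of the frame, matching the coordinate system described here.

```python
def character_frame(pattern):
    """Return (x_min, y_min, x_max, y_max) of the black pixels in array
    indices, or None when the pattern contains no black pixel."""
    xs = [x for row in pattern for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(pattern) if any(row)]
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def crop_to_frame(pattern):
    """Crop to the character frame and flip vertically so that index 0 is
    the bottom side of the frame (input rows are assumed top to bottom)."""
    frame = character_frame(pattern)
    if frame is None:
        return []
    x0, y0, x1, y1 = frame
    cropped = [row[x0:x1 + 1] for row in pattern[y0:y1 + 1]]
    return cropped[::-1]
```

After cropping, every scan described later can start at index 0 or at the last index of a row or column, which corresponds to a point on one side of the frame.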
The feature extraction unit 6 first processes the vertical sub-pattern. Of the four sides of the character frame, it takes the left side, which is vertical, starts a horizontal scan from a point P(0, y) on that side, and detects every transition from a white point to a black point. For each detected transition it takes the distance to the point P at which the scan started, that is, the difference in X coordinate, normalizes it by the length of the character frame in the X direction, and raises the normalized value to the N-th power (N is a constant; N = 2 in this embodiment). The sum over all detected transitions is stored in the array V_l(y). Here a white point belongs to the character background and a black point belongs to a character line. Equation (1) expresses V_l(y), where ΔX_k is the distance between the k-th transition and the scan start point on the side of the character frame, k = 1, ..., K, and K is the number of detected transitions. ΔX in equation (1) is the horizontal length of the character frame, and C is an integerizing constant (its value in this embodiment is printed as "5(+" in the source, presumably C = 50).
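Equation (1) itself did not survive in this text. A form consistent with the definitions given for it, offered as an assumption rather than as the patent's exact expression, would be:

```latex
% Assumed reconstruction of equation (1); in particular, treating C as a
% multiplicative factor on the sum is a guess based on its description
% as an integerizing constant.
V_l(y) = C \sum_{k=1}^{K} \left( \frac{\Delta X_k}{\Delta X} \right)^{N}
```

Under this assumed form, with N = 2, C = 50, a frame width of ΔX = 20, and transitions at distances 5 and 15, the value would be 50 × (0.25² + 0.75²) = 50 × 0.625 = 31.25.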
This processing is carried out with every point on the two vertical sides of the character frame as a start point. For the vertical sub-pattern this yields the array V_l(i), built from horizontal scans starting at points on the left side of the character frame, and the array V_r(i), built from horizontal scans starting at points on the right side, where i = 0, ..., Y_T and Y_T is the Y coordinate of the top side of the character frame. The horizontal, right-diagonal, and left-diagonal sub-patterns are processed in the same way, but with vertical scans from every point on the two horizontal sides of the character frame. This gives the arrays H_b(j) and H_t(j) for the horizontal sub-pattern, R_b(j) and R_t(j) for the right-diagonal sub-pattern, and L_b(j) and L_t(j) for the left-diagonal sub-pattern, where j = 0, ..., X_R and X_R is the X coordinate of the right side of the character frame. For these arrays, the subscript b denotes scans that start from points on the horizontal bottom side of the character frame, and the subscript t denotes scans that start from points on the horizontal top side.
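The construction of the side arrays can be sketched in Python as follows. This is an illustration under stated assumptions, not the patent's implementation: the function names are hypothetical, the sub-pattern is assumed to be already cropped to the character frame with row 0 at the bottom, distances are measured in pixels from the start of the scan line, the frame size is taken as the number of pixels along the scan, and the constant C is assumed to multiply the sum.

```python
def scan_sum(line, n=2, c=50):
    """c * sum of (distance / length) ** n over all white-to-black
    transitions met while scanning `line` from its first element.
    The role of c as a multiplier is an assumption."""
    length = len(line)
    total = 0.0
    prev = 0  # the scan starts outside the character, i.e. on background
    for dist, pixel in enumerate(line):
        if pixel and not prev:
            total += (dist / length) ** n
        prev = pixel
    return c * total


def side_arrays(sub, horizontal=True, n=2, c=50):
    """sub: cropped sub-pattern, row 0 at the bottom of the frame.
    horizontal=True  returns (scans from left side, scans from right side),
                     one value per row, as for the vertical sub-pattern.
    horizontal=False returns (scans from bottom side, scans from top side),
                     one value per column, as for the other sub-patterns."""
    lines = sub if horizontal else [list(col) for col in zip(*sub)]
    forward = [scan_sum(line, n, c) for line in lines]
    backward = [scan_sum(line[::-1], n, c) for line in lines]
    return forward, backward
```

Calling `side_arrays` once on the vertical sub-pattern with `horizontal=True` and once on each of the other three sub-patterns with `horizontal=False` gives the eight arrays described in the text.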
When extracting H_b(j), H_t(j), R_b(j), R_t(j), L_b(j), and L_t(j), the vertical length ΔY of the character frame is used as the normalization constant for the distance between the scan start point and each transition. The feature matrix extraction unit 7 takes the eight arrays extracted by the feature extraction unit 6, divides each array into M parts (M is a constant; M = 7 in this embodiment), and computes the average of the array values within each part, which yields an M × 8 feature matrix P(m, n), where m = 1, ..., M and n = 1, ..., 8. The identification unit 8 computes the distance D given by equation (2) between the feature matrix extracted by the feature matrix extraction unit 7 and each standard character mask f(m, n) described in the same format, and outputs the category name of the standard character mask with the smallest distance to the character name output 9.
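The feature matrix and the matching step can be sketched in Python as below. The function names are hypothetical and each array is assumed to be non-empty. Because equation (2) is not legible in this text, the sketch uses a city-block distance (sum of absolute differences) purely as a stand-in; the patent's actual distance may differ.

```python
def split_means(values, m=7):
    """Split a non-empty list into m contiguous parts and return the mean
    of each part (parts differ in size by at most one element)."""
    size = len(values)
    means = []
    for k in range(m):
        lo = k * size // m
        hi = max((k + 1) * size // m, lo + 1)  # never take an empty part
        part = values[lo:hi]
        means.append(sum(part) / len(part))
    return means


def feature_matrix(arrays, m=7):
    """arrays: the eight side arrays. Returns an m x 8 matrix P, where
    P[i][j] is the mean of part i of array j."""
    columns = [split_means(a, m) for a in arrays]
    return [[col[i] for col in columns] for i in range(m)]


def classify(p, templates):
    """templates: dict mapping a category name to a matrix shaped like p.
    Uses a city-block distance, which is an assumption (see lead-in)."""
    def distance(f):
        return sum(abs(a - b)
                   for row_p, row_f in zip(p, f)
                   for a, b in zip(row_p, row_f))
    return min(templates, key=lambda name: distance(templates[name]))
```

Averaging within each of the M parts makes the feature length independent of the frame size in pixels, which is why arrays of different lengths can be compared against masks of a fixed M × 8 shape.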
D = [equation (2) is not legible in the source text] (2)
(Effects of the Invention)
As described above, the feature matrix extracted by the feature matrix extraction unit of this embodiment represents the position, length, direction, and so on of the strokes of the character/figure pattern, and so expresses properties inherent to the character. Moreover, as can be seen in Fig. 3, which shows character patterns with similar outlines together with a graphical representation of the arrays extracted by the feature extraction unit, local differences between characters are well reflected in the arrays, so recognition accuracy can be improved. In addition, when the feature extraction unit builds each array, the distance between the scan start point on the side of the character frame and the character line is normalized by the size of the character frame in the scanning direction. This absorbs the variation in character size between writers that is typical of handwritten characters, so accurate recognition is possible. Finally, because features are extracted from the character/figure pattern by simple scanning alone, high-speed recognition is possible and the device can also be made smaller.
The present invention scans, vertically or horizontally, sub-patterns in which the stroke components of each direction have been extracted from the character/figure pattern, and uses as features the sums of the N-th powers of the distances between the sides of the character frame and the character lines, normalized by the size of the character frame in the scanning direction. It therefore requires no complicated processing, and because it extracts features stably while following the deformations of handwritten characters, it can be used in fast and accurate character recognition devices.
Fig. 1 shows an embodiment of the character recognition device of the present invention, Figs. 2(a) to 2(e) show examples of sub-patterns, and Fig. 3 shows an example of feature extraction.
1: photoelectric conversion unit; 2: pattern register; 3: line width calculation unit; 4: sub-pattern extraction unit; 5: character frame detection unit; 6: feature extraction unit; 7: feature matrix extraction unit; 8: identification unit; 9: character name output
Patent applicant: Oki Electric Industry Co., Ltd.
Patent application agent: Patent attorney Keiichi Yamamoto
[Fig. 1]
[Fig. 2, panels (a) to (e)]
Claims (1)
A character recognition method characterized in that: sub-patterns are extracted in a plurality of directions, each sub-pattern being extracted by scanning a character/figure pattern in a desired direction, detecting the cross sections of character lines in the scanning direction, and extracting those cross sections whose length is sufficiently greater than the character line width of the character/figure pattern; for each extracted sub-pattern, a scan is made in a predetermined direction from a point on a side of the character circumscribing frame of the character/figure pattern as the start point, the positions of all character lines on the scanning line are detected, and the sum of the N-th powers (N is a constant) of the values obtained by normalizing the distances between the start point on the side and the detected character lines by the size of the character circumscribing frame in the predetermined direction is extracted, this processing being performed scanning line by scanning line, for each sub-pattern, with all points on at least two of the four sides of the character circumscribing frame as start points, so that for each sub-pattern an array of the per-scanning-line N-th-power sums is extracted for each side of the character circumscribing frame; each extracted array of N-th-power sums is divided into M parts (M is a constant) and the average of the N-th-power sums within each part is computed, so that an M-dimensional feature vector is extracted from each array and a group of feature vectors is obtained for the character/figure pattern; and the character is recognized by matching the extracted group of feature vectors against a dictionary expressed in the same format.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP58109188A JPS603071A (en) | 1983-06-20 | 1983-06-20 | Character recognition system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP58109188A JPS603071A (en) | 1983-06-20 | 1983-06-20 | Character recognition system |
Publications (2)
Publication Number | Publication Date |
---|---|
JPS603071A true JPS603071A (en) | 1985-01-09 |
JPH0545990B2 JPH0545990B2 (en) | 1993-07-12 |
Family
ID=14503864
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP58109188A Granted JPS603071A (en) | 1983-06-20 | 1983-06-20 | Character recognition system |
Country Status (1)
Country | Link |
---|---|
JP (1) | JPS603071A (en) |
-
1983
- 1983-06-20 JP JP58109188A patent/JPS603071A/en active Granted
Also Published As
Publication number | Publication date |
---|---|
JPH0545990B2 (en) | 1993-07-12 |