JPS62125485A

JPS62125485A - Character recognizing system

Info

Publication number: JPS62125485A
Application number: JP60263672A
Authority: JP
Inventors: Koji Ito; 伊東　晃治; Yoshiyuki Yamashita; 山下　義征; Toshiyuki Ariga; 有賀　寿之
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1985-11-26
Filing date: 1985-11-26
Publication date: 1987-06-06
Anticipated expiration: 2009-08-22
Also published as: JPH0664629B2

Abstract

PURPOSE: To obtain the same feature matrix for both the Mincho (Ming) and Gothic typefaces, and to achieve character recognition that is stable against character deformation and changes in stroke line width, by dividing the character frame area into (N×M) regions for each sub-pattern extracted from an original pattern and forming the feature matrix from the black points in each region and the line width of the respective sub-pattern. CONSTITUTION: A line width calculating part 4 calculates the line width of an input pattern. A vertical sub-pattern extracting part 6 extracts the vertical sub-pattern from the relation between the run length of consecutive black bits and the line width. A vertical sub-pattern line width calculating part 10 calculates the line width of the vertical sub-pattern; the horizontal and the right and left diagonal sub-patterns are handled in the same way. A feature matrix extracting part 14 forms the feature matrix from these and normalizes it by the size of the character obtained from a character frame detecting part 5. The normalized feature matrix is compared with standard feature matrices at an identifying part 15, and the category name of the standard matrix giving the minimum distance is output as a character name output 16.

Description

Detailed Description of the Invention

(Industrial Field of Application) The present invention relates to a character recognition method for recognizing character figures. More specifically, it relates to a method in which a character figure pattern is divided into a plurality of sub-patterns, a feature matrix is obtained for each sub-pattern, and the feature matrices are compared with standard character masks in a dictionary to identify the character.

(Prior Art) Conventional character figure recognition devices have commonly extracted strokes from the character figure pattern and performed recognition using the positions and lengths of the extracted strokes and the relationships between them.

In these methods, strokes are extracted either (1) by tracing the contour of the character figure, calculating the curvature along the detected sequence of contour points, splitting the contour sequence at points of large curvature, and combining the resulting segments, or (2) by thinning the character figure pattern to a skeleton, tracing the skeleton pattern and its connectivity, and detecting points where the angle changes abruptly. Geometric features of the strokes extracted by (1) or (2) are then used for identification. However, with method (1), the amount of processing grows as the character figure pattern becomes larger or more complex, which lowers the processing speed. Method (2) requires thinning the character figure pattern, and the thinning causes problems such as pattern distortion and spurious whisker-like branches, which complicate the subsequent processing.

To solve these problems, the present applicant has proposed the character recognition method disclosed in Japanese Patent Laid-Open Publication No. 57-231-85. That method is configured as follows:
(a) an original pattern, a digital signal represented by black bits and white bits, is created by photoelectrically converting and quantizing the character figure;
(b) the line width of the character figure as a whole (for example, its average value) is calculated from the original pattern;
(c) the original pattern is scanned in a plurality of directions, the number of consecutive black bits is detected in each scanning row, and a plurality of sub-patterns, one for each direction, are extracted based on the number of consecutive black bits and the line width;
(d) the region within the character frame of the original pattern is divided, for each sub-pattern, into (N×M) regions (N and M are constants), and a feature amount representing the character line length in each region is calculated in units of cells using the line width of the original pattern;
(e) the feature amount is normalized by the character size to create a feature matrix; and
(f) the feature matrix is compared with standard character masks of character figure patterns prepared in advance to recognize the character figure.

(Problems to be Solved by the Invention) However, the conventional character recognition method described above has the following problems.

Consider, for example, recognizing printed characters in the Gothic and Mincho typefaces. In Gothic, the vertical and horizontal line widths are nearly constant, so the line width of the character as a whole is equal to the vertical or horizontal line width, and the features remain stable when they are normalized, in units of cells, by the line width of the whole character. In Mincho, on the other hand, the vertical and horizontal line widths differ greatly. When the features are normalized by the line width of the whole character, a character line of the same length as in Gothic yields smaller values in the horizontal feature matrix and larger values in the vertical feature matrix than Gothic does, so the features are unstable.

FIG. 2 shows examples of sub-patterns for the Gothic and Mincho typefaces.

The same phenomenon applies to handwritten characters: depending on the condition of the writing instrument and the pressure the writer applies to it, character lines of the same length produce different feature amounts, and the features are unstable.

Coping with this problem requires enlarging the dictionary used for identification, which in turn slows down processing.

Accordingly, the object of the present invention is to solve these problems and to provide a fast and stable character recognition method.

(Means for Solving the Problems) The present invention comprises the following components (a) to (f).

(a) An original pattern, a digital signal represented by black bits and white bits, is created by photoelectrically converting and quantizing the character figure.

(b) Next, the line width of the original pattern is calculated.

(c) Next, the original pattern is scanned in a plurality of directions, the number of consecutive black bits is detected in each scanning row, and a plurality of sub-patterns, one for each scanning direction, are extracted based on the number of consecutive black bits and the line width.

(d) Next, the region within the character frame of the original pattern is divided, for each sub-pattern, into (N×M) regions (N and M are constants), and a feature amount is calculated from the result of counting black points, in units of cells, within each divided region and from the line width of each sub-pattern.

(e) Next, the feature amount is normalized by the character size to create a feature matrix.

(f) Finally, the feature matrix is compared with standard character masks of character figure patterns prepared in advance to recognize the character figure.

(Function) Component (a) converts the character figure to be identified into a binarized digital signal to create the original pattern.

Component (b) calculates the line width (average value) of the original pattern.

Component (c) extracts a plurality of sub-patterns from the original pattern.

Component (d) creates a feature matrix for each sub-pattern. In particular, calculating the feature amount from the number of black points and the line width of each sub-pattern absorbs variations in character line width and the like.

Component (e) normalizes the feature matrix.

Component (f) identifies the input character figure by comparing the obtained feature matrix with standard character masks of character figure patterns prepared in advance.

(Embodiment) The present invention is described in detail below on the basis of one embodiment, with reference to the drawings.

FIG. 1 is a block diagram showing one embodiment of the present invention.

In the figure, 1 is an optical signal input, 2 is a photoelectric conversion unit, 3 is a pattern register, 4 is a line width calculation unit, 5 is a character frame detection unit, 6 is a vertical sub-pattern extraction unit, 7 is a horizontal sub-pattern extraction unit, 8 is a right diagonal sub-pattern extraction unit, 9 is a left diagonal sub-pattern extraction unit, 10 is a vertical sub-pattern line width calculation unit, 11 is a horizontal sub-pattern line width calculation unit, 12 is a right diagonal sub-pattern line width calculation unit, 13 is a left diagonal sub-pattern line width calculation unit, 14 is a feature matrix extraction unit, 15 is an identification unit, and 16 is a character name output.

Next, the configuration of each unit is described. The photoelectric conversion unit 2 converts the optical signal input of the original pattern into a binary quantized electrical signal. The pattern register 3 stores this electrical signal. When it is stored, the character is divided into, for example, 100×100 cells, and the binary code of each cell is held in the pattern register 3.

The line width calculation unit 4 calculates the line width of the input pattern (original pattern). The vertical sub-pattern extraction unit 6 scans the whole of the pattern register 3 vertically and extracts the vertical sub-pattern (VSP) from the relation between the run length of consecutive black bits and the line width calculated by the line width calculation unit 4. Similarly, the horizontal sub-pattern extraction unit 7 extracts the horizontal sub-pattern (HSP) by horizontal scanning, the right diagonal sub-pattern extraction unit 8 extracts the right diagonal sub-pattern (RSP) by right diagonal (45°) scanning, and the left diagonal sub-pattern extraction unit 9 extracts the left diagonal sub-pattern (LSP) by left diagonal (45°) scanning.
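The run-length extraction described above can be sketched as follows. The source does not state the exact relation between run length and line width that decides whether a run belongs to a sub-pattern, so this sketch assumes, purely for illustration, that a run is kept when it is at least `k` times the line width. The function names and the factor `k` are hypothetical, and only the vertical and horizontal directions are shown.

```python
def vertical_runs(pattern):
    """Yield (col, start_row, length) for every vertical run of black cells."""
    rows, cols = len(pattern), len(pattern[0])
    for c in range(cols):
        r = 0
        while r < rows:
            if pattern[r][c]:
                start = r
                while r < rows and pattern[r][c]:
                    r += 1
                yield c, start, r - start
            else:
                r += 1


def extract_vsp(pattern, line_width, k=2.0):
    """Keep vertical runs whose length is at least k * line_width.

    The threshold rule is an assumption of this sketch, not the patent's
    stated criterion.
    """
    rows, cols = len(pattern), len(pattern[0])
    vsp = [[0] * cols for _ in range(rows)]
    for c, start, length in vertical_runs(pattern):
        if length >= k * line_width:
            for r in range(start, start + length):
                vsp[r][c] = 1
    return vsp


def transpose(pattern):
    return [list(row) for row in zip(*pattern)]


def extract_hsp(pattern, line_width, k=2.0):
    """Horizontal sub-pattern: the vertical extraction applied to the transpose."""
    return transpose(extract_vsp(transpose(pattern), line_width, k))


# A 7x7 cross: one vertical and one horizontal stroke, each one cell wide.
cross = [[1 if (r == 3 or c == 3) else 0 for c in range(7)] for r in range(7)]
vsp = extract_vsp(cross, line_width=1)
hsp = extract_hsp(cross, line_width=1)
```

With the cross pattern, the vertical sub-pattern retains only the vertical stroke and the horizontal sub-pattern only the horizontal stroke, because the one-cell runs that cross each stroke fall below the assumed threshold.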

FIG. 3 shows an example of an original pattern and its sub-patterns: (a) is the original pattern, (b) the VSP, (c) the HSP, (d) the RSP, and (e) the LSP. The character frame detection unit 5 detects the character frame circumscribing the character pattern in the pattern register and sends the result to the feature matrix extraction unit 14. For each sub-pattern register, the feature matrix extraction unit 14 divides the area corresponding to the character frame of the original pattern into (N×M) regions (N = M = 5 in this embodiment). When the character is divided into 100×100 cells and N = M = 5, each region contains 20×20 cells.
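As a minimal sketch of the frame detection and region division just described, the following code finds the circumscribing character frame of a binary pattern and counts the black cells of a sub-pattern in each of the N×M regions. The function names are hypothetical, and the integer arithmetic used to assign cells to regions when the frame size is not a multiple of N or M is an assumption of this sketch.

```python
def character_frame(pattern):
    """Return (top, bottom, left, right) of the box circumscribing the black cells."""
    rows = [r for r, row in enumerate(pattern) if any(row)]
    cols = [c for c in range(len(pattern[0])) if any(row[c] for row in pattern)]
    if not rows:
        raise ValueError("pattern contains no black cells")
    return rows[0], rows[-1], cols[0], cols[-1]


def region_black_counts(sub_pattern, frame, n=5, m=5):
    """Count black cells of a sub-pattern in each of the n x m frame regions."""
    top, bottom, left, right = frame
    height = bottom - top + 1
    width = right - left + 1
    counts = [[0] * m for _ in range(n)]
    for r in range(top, bottom + 1):
        i = (r - top) * n // height          # region row index
        for c in range(left, right + 1):
            if sub_pattern[r][c]:
                j = (c - left) * m // width  # region column index
                counts[i][j] += 1
    return counts


# 10x10 example: black columns at the left and right edges.
pattern = [[1 if c in (0, 9) else 0 for c in range(10)] for r in range(10)]
frame = character_frame(pattern)
counts = region_black_counts(pattern, frame)
```

Here the frame spans the whole 10×10 grid, each region is 2×2 cells, and the two edge columns contribute two black cells to each region in the first and last region columns.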

Taking the VSP as an example, the feature matrix is extracted as follows. The vertical sub-pattern line width calculation unit 10 calculates the line width (W_V) of the vertical sub-pattern extracted by the vertical sub-pattern extraction unit 6, using the same processing as the line width calculation unit 4. The feature matrix extraction unit 14 counts the number of black points (B_ij) in each of the divided regions in units of cells. Based on equation (1) below, it then uses the line width (W_V) of the vertical sub-pattern in equation (2) below to calculate, in units of cells, a feature amount representing the stroke length of the sub-pattern within each region, producing an (N×M)-dimensional feature matrix.

$$\text{number of black points} = \text{line length} \times \text{line width} \tag{1}$$

$$t_{ij} = B_{ij} / W_V \tag{2}$$

The same processing is applied to the HSP using the line width (W_H) of the horizontal sub-pattern calculated by the horizontal sub-pattern line width calculation unit 11, to the RSP using the line width (W_R) of the right diagonal sub-pattern calculated by the right diagonal sub-pattern line width calculation unit 12, and to the LSP using the line width (W_L) of the left diagonal sub-pattern calculated by the left diagonal sub-pattern line width calculation unit 13, to create the corresponding feature matrices.
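A short numerical sketch may help show why dividing by the sub-pattern's own line width stabilizes the feature. The stroke sizes below are made-up illustrative values, and `stroke_length_features` is a hypothetical helper name: a horizontal stroke 20 cells long is drawn 4 cells thick in a Gothic-like face and 2 cells thick in a Mincho-like face.

```python
def stroke_length_features(black_counts, sub_pattern_width):
    """Apply t_ij = B_ij / W to every region of one sub-pattern."""
    if sub_pattern_width <= 0:
        raise ValueError("line width must be positive")
    return [[b / sub_pattern_width for b in row] for row in black_counts]


# One region containing a horizontal stroke 20 cells long (illustrative values).
gothic_counts = [[20 * 4]]   # stroke 4 cells thick -> 80 black cells
mincho_counts = [[20 * 2]]   # stroke 2 cells thick -> 40 black cells

# Dividing by each sub-pattern's own line width recovers the same stroke length.
t_gothic = stroke_length_features(gothic_counts, 4)
t_mincho = stroke_length_features(mincho_counts, 2)

# Dividing the Mincho count by a whole-character width instead (assumed here
# to be 3 cells, pulled up by thicker vertical strokes) underestimates it.
t_mincho_global = stroke_length_features(mincho_counts, 3)
```

Both `t_gothic` and `t_mincho` come out as 20.0, the true stroke length in cells, whereas the whole-character width gives about 13.3 for the Mincho-like stroke. This is the instability described in the problem statement.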

Next, the feature matrix extraction unit 14 normalizes the extracted feature matrices by the character size to create normalized feature matrices. Let t_ij be an element of a feature matrix before normalization, L_ij the corresponding element after normalization, ΔX the horizontal length of the character frame, and ΔY its vertical length. The following processing is then performed.

(1) For the vertical sub-pattern (VSP) matrix:

$$L_{ij} = t_{ij} / \Delta Y \tag{3}$$

(2) For the horizontal sub-pattern (HSP) matrix:

$$L_{ij} = t_{ij} / \Delta X \tag{4}$$

(3) For the diagonal sub-pattern (RSP, LSP) matrices:

$$L_{ij} = t_{ij} / \sqrt{(\Delta X)^2 + (\Delta Y)^2} \tag{5}$$

Through this processing, the feature matrix extraction unit 14 finally creates a normalized feature matrix of ((N×M)×4) dimensions that represents the original pattern.
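The size normalization can be sketched as below. The sketch assumes that the diagonal sub-patterns are normalized by the diagonal length of the character frame, that is, the square root of (ΔX)² + (ΔY)², which is how the garbled equation (5) is read here. The function name and the string labels for the sub-pattern kinds are hypothetical.

```python
import math


def normalize_features(features, kind, dx, dy):
    """Normalize one sub-pattern's feature matrix by the character frame size.

    kind: "VSP", "HSP", "RSP" or "LSP" (labels chosen for this sketch).
    dx, dy: horizontal and vertical lengths of the character frame in cells.
    """
    if kind == "VSP":
        denom = dy
    elif kind == "HSP":
        denom = dx
    elif kind in ("RSP", "LSP"):
        denom = math.hypot(dx, dy)  # assumed reading of equation (5)
    else:
        raise ValueError(f"unknown sub-pattern kind: {kind}")
    return [[t / denom for t in row] for row in features]


# A stroke length of 20 cells in a frame 30 cells wide and 40 cells tall.
l_v = normalize_features([[20.0]], "VSP", dx=30, dy=40)
l_h = normalize_features([[20.0]], "HSP", dx=30, dy=40)
l_r = normalize_features([[20.0]], "RSP", dx=30, dy=40)
```

Each normalized value is the stroke length expressed as a fraction of the frame extent in the stroke's own direction, so the same character printed at different sizes yields comparable features.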

The identification unit 15 computes the conventionally used distance (D) of equation (6) below between each standard character mask (f_m) and the feature matrix (f_l) extracted by the feature matrix extraction unit 14, that is, the length of the difference vector between the two vectors in the ((N×M)×4)-dimensional feature space. It outputs the category name of the standard character mask that gives the minimum value as the character name output 16.
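The minimum-distance identification can be sketched as a nearest-template search. The sketch takes the distance to be the Euclidean length of the difference vector, as the text describes, and represents the dictionary as a mapping from category name to a flattened standard mask. The names `distance` and `identify` and the toy three-element vectors are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean length of the difference between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(feature, dictionary):
    """Return the category name whose standard mask is closest to the feature."""
    return min(dictionary, key=lambda name: distance(feature, dictionary[name]))


# Toy dictionary with 3-dimensional masks; a real one would have (N*M)*4 elements.
masks = {"A": [0.0, 0.0, 1.0], "B": [1.0, 1.0, 0.0]}
result = identify([0.9, 0.8, 0.1], masks)
```

A linear scan over the dictionary is the simplest choice; it makes recognition time proportional to the dictionary size, which is why the text stresses keeping the dictionary small.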

$$D = \lVert f_m - f_l \rVert \tag{6}$$

Next, the operation is described. The optical signal input 1 is photoelectrically converted by the photoelectric conversion unit 2 and supplied to the pattern register 3 and the line width calculation unit 4. The line width calculation unit 4 calculates the line width of the original pattern and outputs it to the sub-pattern extraction units 6 to 9. The vertical sub-pattern extraction unit 6 extracts the VSP by the method described above, based on the output of the pattern register 3 and the output of the line width calculation unit 4. Similarly, the horizontal sub-pattern extraction unit 7, the right diagonal sub-pattern extraction unit 8, and the left diagonal sub-pattern extraction unit 9 extract the HSP, RSP, and LSP, respectively. The vertical sub-pattern line width calculation unit 10 calculates the line width (W_V) of the VSP. Similarly, the horizontal sub-pattern line width calculation unit 11, the right diagonal sub-pattern line width calculation unit 12, and the left diagonal sub-pattern line width calculation unit 13 calculate the line width of the HSP (W_H), of the RSP (W_R), and of the LSP (W_L), respectively. The feature matrix extraction unit 14 then creates the feature matrices according to equations (1) and (2) and normalizes them by the character size output from the character frame detection unit 5. The feature matrix created in this way is compared with the standard character masks in the identification unit 15, and the category name of the standard character mask giving the minimum distance is output as the character name output 16.

(Effect of the Invention) As described above, for each sub-pattern extracted from the original pattern, the character frame area is divided into (N×M) regions, a feature matrix is created from the black points in each region and the line width of the respective sub-pattern, and this matrix is then normalized by the character size. As a result, the same feature matrix is obtained for both the Mincho and Gothic typefaces, and character recognition that is stable against character deformation and variations in character line width becomes possible. Furthermore, because the feature matrices are extracted by simple scanning of the sub-patterns extracted from the original pattern, high-speed character recognition is possible.


[Brief explanation of drawings]

FIG. 1 is a block diagram showing one embodiment of the present invention, FIG. 2 is a diagram showing examples of sub-patterns for the Gothic and Mincho typefaces, and FIG. 3 is a diagram showing an example of an original pattern and the sub-patterns extracted from it.

1: optical signal input; 2: photoelectric conversion unit; 3: pattern register; 4: line width calculation unit; 5: character frame detection unit; 6: vertical sub-pattern extraction unit; 7: horizontal sub-pattern extraction unit; 8: right diagonal sub-pattern extraction unit; 9: left diagonal sub-pattern extraction unit; 10: vertical sub-pattern line width calculation unit; 11: horizontal sub-pattern line width calculation unit; 12: right diagonal sub-pattern line width calculation unit; 14: feature matrix extraction unit; 15: identification unit; 16: character name output.

Patent applicant: Oki Electric Industry Co., Ltd.
Patent application agent: Patent attorney Yamamoto (山本恵−)

[FIG. 2: Examples of sub-patterns for the Gothic and Mincho typefaces]

Claims

[Claims] A character recognition method characterized by: (a) creating an original pattern of a digital signal represented by black bits and white bits by photoelectrically converting and quantizing a character figure; (b) calculating the line width of the original pattern; (c) scanning the original pattern in a plurality of directions to detect the number of consecutive black bits in each scanning row, and extracting a plurality of sub-patterns, one for each scanning direction, based on the number of consecutive black bits and the line width of the original pattern; (d) dividing the region within the character frame of the original pattern into (N×M) regions (N and M are constants) for each sub-pattern, and calculating a feature amount based on the result of counting black points, in units of cells, within each divided region and on the line width of each sub-pattern; (e) normalizing the feature amount by the character size to create a feature matrix; and (f) recognizing the character figure by comparing the feature matrix with standard character masks of character figure patterns prepared in advance.