JPH02100783A

JPH02100783A - Character recognizing method

Info

Publication number: JPH02100783A
Application number: JP63254108A
Authority: JP
Inventors: Atsushi Shimoyama; 霜山　篤
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 1988-10-07
Filing date: 1988-10-07
Publication date: 1990-04-12
Anticipated expiration: 2013-11-25
Also published as: JP2827227B2

Abstract

PURPOSE: To improve character recognition efficiency by applying a position correction to the features extracted from a single character, according to the area of the square that the character leaves blank, and by collating the position-corrected features with features read from a dictionary to select candidate characters. CONSTITUTION: A processor 1 corrects the extracted features with a position-correcting arithmetic operation according to the area of the square left blank by the character. The processor 1 then collates the features in a dictionary 8 with the position-corrected features of the character to select candidate characters. When features are extracted by scanning a character, the area corresponding to the blank area therefore no longer needs the conventional horizontal, vertical, and oblique scans. The scanning area is reduced and the feature extraction time is shortened.

Description

[Detailed description of the invention]

[Table of contents]

Summary; Industrial application field; Conventional technology; Problem to be solved by the invention; Means to solve the problem; Operation; Example; Effect of the invention

[Summary]

The invention relates to a character recognition device that shortens the processing time needed to extract the features of a cut-out character, collate them with a dictionary, and select candidate characters with a small dissimilarity. Its object is to raise character recognition processing efficiency.

The device cuts out one character from an unknown character string, scans the area of the cut-out character to extract features, collates them with features read from a dictionary that stores the features of characters inscribed in a square of a predetermined size, and selects and sends out the dictionary characters with a small dissimilarity as candidate characters. In this processing, the vertical and horizontal sizes of the cut-out character are compared with the size of the square. If the vertical and/or horizontal size of the character is larger than the square, the reduction ratio is found at which the larger of the two dimensions is inscribed in the square, the character is reduced by that ratio, the area of the reduced character is scanned to extract features, and a position correction is applied to those features according to the area of the square left blank by the reduced character. If the vertical and horizontal sizes of the cut-out character are smaller than the square, the area of the character is scanned to extract features, and a position correction is applied to those features according to the area of the square left blank by the character. The position-corrected features are then collated with the features read from the dictionary to select candidate characters.

[Industrial application field]

The present invention relates to a character recognition device that extracts the features of cut-out characters, collates them with a dictionary, and selects candidate characters with a small dissimilarity, and in particular to a character recognition device that can shorten the processing time for feature extraction.

With the development of an information-oriented society, it has become common to read documents and have computer systems process them.

Character recognition devices are used for this purpose. Such a device cuts out characters one by one from the document it has read, normalizes each cut-out character by enlarging or reducing it so that it is inscribed in a frame of a predetermined size, scans the normalized character to extract features, collates them with a dictionary, and sends the characters with a small dissimilarity to the computer system as the recognition result.

The scans that extract features from a normalized character are repeated many times in the vertical, horizontal, and diagonal directions, so the larger the scanning area, the more time is needed. Characters come in many sizes, however, and it is desirable to shorten the feature extraction time for small characters and for characters that are long in the vertical or horizontal direction.

[Conventional technology]

FIG. 7 is a block diagram explaining the conventional technique, and FIG. 8 is a diagram explaining the operation of FIG. 7.

The processor 1 operates by reading a program from the program memory 2. For example, it controls the scanner 10 so that the character string read from a document is stored in the image memory 5, and controls the character cutout unit 3 so that one character is cut out from the character string in the image memory 5 and stored in the image memory 5 as a character image.

Under the control of the processor 1, the normalization unit 4 normalizes the character image stored in the image memory 5 by enlarging or reducing it so that it is inscribed in a frame of a predetermined size, and stores the result in the image memory 5.

That is, the normalization unit 4 normalizes a character such as the one shown in FIG. 8(b) or (c) into a character inscribed in a square frame of, for example, 48 x 48 dots, as shown in FIG. 8(a).

Under the control of the processor 1, the feature extraction unit 6 scans the character stored in the image memory 5 (shown hatched in FIG. 8(a)) horizontally, vertically, and in the two diagonal directions, as the arrows in the figure indicate. It extracts features of, for example, 432 dimensions by a known method and stores them in the image memory 5.

Under the control of the processor 1, the collation unit 7 collates the features of the character stored in the image memory 5 with the character features read from the dictionary 8, takes the dictionary character whose features have the smallest dissimilarity as a candidate character, and sends it through the interface unit 9 to, for example, a host device. In general, several candidate characters are selected and sent in ascending order of dissimilarity, starting with the smallest.
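
The collation step can be pictured with a minimal sketch. The patent does not state which dissimilarity measure the collation unit uses, so the city-block distance below, the function name `select_candidates`, and the parameter `k` are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def select_candidates(
    features: Sequence[float],
    dictionary: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return the k dictionary characters with the smallest dissimilarity."""
    scored = []
    for char, reference in dictionary.items():
        # Assumption: dissimilarity is the city-block distance between the
        # feature vectors; the source does not specify the measure.
        dissimilarity = sum(abs(a - b) for a, b in zip(features, reference))
        scored.append((char, dissimilarity))
    # Ascending order: the most similar dictionary character comes first.
    scored.sort(key=lambda item: item[1])
    return scored[:k]
```

Calling `select_candidates(vec, dictionary, k=1)` returns only the best match, which corresponds to sending a single candidate.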

[Problem to be solved by the invention]

As described above, when the normalization unit 4 normalizes a character in the conventional technique, a character larger than the frame is reduced, as in FIG. 8(b), and a character smaller than the frame is enlarged, as in FIG. 8(c), so that the character is inscribed in the frame.

That is, the character is made to touch all four sides of the square frame.

The feature extraction unit 6 must therefore scan the entire interior of the frame to extract the 432-dimensional features.

FIG. 9 is a diagram explaining the time required by the normalization unit and the feature extraction unit.

With time on the vertical axis and scanning area on the horizontal axis, the time required for normalization is smallest when the character cut out by the character cutout unit 3 is already of a size inscribed in the frame of FIG. 8(a): it only has to be moved into the frame in the image memory 5, as the part of the figure labelled "move" shows. If the character is larger than the frame, scanning and reducing it takes longer the larger the character is, so the time grows in proportion to the character size, as the hatched part labelled "reduce" shows.

If the character is smaller than the frame, scanning and enlarging it takes less time the smaller the character is, so the time falls in proportion to the character size, as the hatched part of the figure labelled "enlarge" shows.

The time required for feature extraction is constant regardless of the size of the character cut out by the character cutout unit 3, because the entire interior of the frame has to be scanned, as described above.

After feature extraction, the features are corrected so that the same character always yields the same features. The time for this correction is also constant.

As the dotted line in the figure shows, the total time therefore increases in proportion to the size of the character cut out by the character cutout unit 3.

FIG. 10 is a diagram explaining the distribution of character sizes.

With character size on the horizontal axis and frequency on the vertical axis, the figure shows that, among the characters a character recognition device handles, those smaller than 48 x 48 dots occur more often than those larger than 48 x 48 dots.

In other words, characters smaller than the frame are handled most often, and among the characters larger than the frame some are large only horizontally and others only vertically. Conventionally, however, every character is enlarged or reduced so that it touches all four sides of the frame. The feature extraction time is therefore constant, and the normalization time for characters that must be enlarged increases in proportion to the character size.

In view of these problems, the present invention moves a character smaller than the frame so that it touches two sides of the frame without enlarging it, and does not scan the part of the frame that is left blank. This removes the scanning time needed for enlargement during normalization and also shortens the scanning time needed for feature extraction. For a character larger than the frame, the larger of its horizontal and vertical dimensions is made to touch the two corresponding sides of the frame, so that the blank part again need not be scanned. The object is to shorten the feature extraction time and thereby raise character recognition processing efficiency.

[Means to solve the problem]

FIG. 1 is a process flowchart showing the configuration of the present invention.

FIG. 1 shows the processing flow of the processor that controls the character recognition device; 11 to 19 are processing steps.

[Operation]

The character recognition device cuts out one character from an unknown character string, scans the area of the cut-out character to extract features, collates them with features read from a dictionary that stores features obtained from characters inscribed in a square of a predetermined size, selects the dictionary characters with a small dissimilarity as candidate characters, and sends out the selected candidate characters as the recognition result. In processing step 11, the processor of this device checks whether the vertical and horizontal sizes of the cut-out character are larger than the square.

If the vertical and/or horizontal size of the character is larger than the square, step 12 calculates the reduction ratio at which the larger of the character's horizontal and vertical dimensions is inscribed in two sides of the square.

In step 13, the character is reduced by the calculated reduction ratio and normalized. In step 14, the area of the reduced character is scanned to extract features.

FIG. 2 is a diagram explaining an example of feature extraction.

For example, a character such as the one shown in FIG. 8(b) or (c) is reduced and normalized into a character whose width, for example, is inscribed in a square frame of, for example, 48 x 48 dots, as shown in FIG. 2(a).

The processor controls the feature extraction unit so that the hatched character in FIG. 2(a) is scanned in the same way as before. This time, however, the character is moved inside the frame so that, for example, its top touches the frame. This leaves a blank area in the frame, and the blank area is not scanned. The features are then extracted by a known method.

In step 15, the processor corrects the extracted features with a position-correcting operation according to the area of the square left blank by the reduced character, that is, the blank area.

That is, because the character was not reduced so as to touch all four sides of the frame when it was normalized, the features extracted by the feature extraction unit are based on a character image that looks as if the part corresponding to the blank area of the frame had been shrunk. The dictionary, however, stores features extracted from character images that touch all four sides of the frame, so the positions are corrected to make the features correspond.

Next, in step 18, the processor has the features in the dictionary collated with the position-corrected features of the character, and in step 19 it has the candidate characters selected.

If step 11 finds that the vertical and horizontal sizes of the character are smaller than the square, the processor controls the feature extraction unit in step 16 so that the area of the character is scanned and the features are extracted.

The feature extraction unit scans the hatched character in FIG. 2(b) in the same way as before. This time, however, the character is moved inside the frame so that, for example, its top and left side touch the frame. This leaves a blank area in the frame, and the blank area is not scanned. The features are then extracted by a known method.

In step 17, the processor corrects the extracted features with a position-correcting operation according to the area of the square left blank by the character, that is, the blank area.

That is, because the character was not enlarged so as to touch all four sides of the frame, the features extracted by the feature extraction unit are based on a character image that looks as if the part corresponding to the blank area of the frame had been shrunk. The dictionary, however, stores features extracted from character images that touch all four sides of the frame, so the positions are corrected to make the features correspond.
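
The two branches of FIG. 1 described above can be summarized in a short sketch. The function name `processing_steps` and the default frame size of 48 dots (taken from the example in the text) are illustrative assumptions; the step numbers are those of the flowchart.

```python
def processing_steps(w: int, h: int, frame: int = 48) -> list:
    """Return the FIG. 1 step numbers executed for a character of w x h dots."""
    if w > frame or h > frame:
        # Larger than the square in at least one direction:
        # compute ratio (12), reduce (13), scan (14), position-correct (15).
        return [11, 12, 13, 14, 15, 18, 19]
    # Otherwise the character is only moved and scanned as-is (16),
    # then position-corrected (17).
    return [11, 16, 17, 18, 19]
```

Both paths end with collation (18) and candidate selection (19); they differ only in whether a reduction precedes the scan.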

As a result, when a character is scanned to extract features, the area corresponding to the blank area no longer has to be scanned horizontally, vertically, and diagonally to extract as many as 432 dimensions of features, as was conventionally done in FIG. 8(a). The scanning area is reduced, so the feature extraction time can be shortened.

The processor corrects the positions of the features extracted by the feature extraction unit on the basis of the ratio between the size of the cut-out character and the size of the frame. This calculation is short, so the reduction in the feature extraction time of the feature extraction unit dominates.

In addition, when the cut-out character is smaller than the frame, no enlargement is performed during normalization, so the normalization time can also be shortened.

The processing efficiency of the character recognition device can therefore be raised.

[Example]

FIG. 3 is a block diagram of a circuit showing one embodiment of the present invention, FIG. 4 is a flowchart explaining the operation of FIG. 3, and FIG. 5 is a diagram explaining the operation of FIG. 3.

The same reference numerals as in FIG. 7 denote parts with the same functions. The processor 1 operates by reading the program stored in the program memory 12. As shown in FIG. 4, when character recognition is requested by the host device through the interface unit 9, the processor controls the scanner 10 to read one page of character strings from the original and store it in the image area of the image memory 5. It then detects the character positions in the image area of the image memory 5, as shown in FIG. 5(a).

That is, it finds the coordinate values x and y of FIG. 5(a) and the height h and width w of the character, and controls the character cutout unit 3 so that one character is cut out and stored in the image memory 5 as a character image.

The size of the cut-out character is then compared with 48 dots.

That is, as shown in FIG. 4, the processor first checks whether w > 48 and then whether h > 48.

If both w and h are smaller than 48 dots, the cut-out character is simply moved to the normalized image area of the image memory 5, as shown in FIG. 4. If w and/or h is larger than 48 dots, the normalization reduction ratio is found as follows. As shown in FIG. 4, the processor checks whether w ≥ h. If w ≥ h it sets k = w, otherwise it sets k = h, and it obtains the reduction ratio p from p = 48 / k.
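
The size test and the ratio p = 48 / k can be written out as a minimal sketch. The function name `reduction_ratio` and the use of `None` to signal "move only, no scaling" are illustrative choices that do not appear in the patent.

```python
FRAME = 48  # side of the square frame in dots, as in the embodiment


def reduction_ratio(w: int, h: int, frame: int = FRAME):
    """Return p = frame / k, or None when the character needs no reduction."""
    if not (w > frame or h > frame):
        # Neither dimension exceeds the frame: the character is only moved
        # into the normalized image area, without enlargement.
        return None
    # k is the larger of the two dimensions (k = w when w >= h, else k = h).
    k = w if w >= h else h
    return frame / k
```

For a character 96 dots wide and 48 dots high, k = 96 and p = 0.5, so the reduced character is 48 dots wide and 24 dots high.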

The processor 1 controls the normalization unit 4 so that the character area stored in the image memory 5 is scanned horizontally, as shown in FIG. 5(b), and a table is created, as shown in FIG. 5(c).

That is, for example, it finds the start and end positions a and b of the black pixels on one of the scanning lines marked in FIG. 5(b), and the start and end positions c and d, e and f, and g and j of the black pixels on another marked scanning line, and enters them in a table as shown in FIG. 5(c).

Next, as shown in FIG. 4, the processor 1 multiplies each pixel position in the table of FIG. 5(c) by p to create a conversion table. That is, it performs calculations such as a' = a x p and b' = b x p, and creates a conversion table as shown in FIG. 5(d).
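
The table of black-pixel start and end positions and its scaling by p can be sketched as follows. The bitmap representation (rows of 0/1 values), the function names, and the truncation of scaled positions to whole dots are assumptions; the patent does not say how fractional positions are rounded.

```python
def build_run_table(bitmap):
    """For each scan line, list the (start, end) columns of every black run."""
    table = []
    for row in bitmap:
        runs, start = [], None
        for x, pixel in enumerate(row):
            if pixel and start is None:
                start = x                      # a black run begins here
            elif not pixel and start is not None:
                runs.append((start, x - 1))    # the run ended one dot earlier
                start = None
        if start is not None:
            runs.append((start, len(row) - 1))  # run reaches the line end
        table.append(runs)
    return table


def scale_run_table(table, p):
    """Multiply every start/end position by p (a' = a * p, b' = b * p)."""
    # Assumption: positions are truncated to whole dots.
    return [[(int(a * p), int(b * p)) for a, b in runs] for runs in table]
```

Working on run endpoints rather than individual pixels means the reduction touches only a few numbers per scan line.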

Next, as shown in FIG. 4, the processor 1 performs the conversion processing for identical data. When a character with many line segments is reduced, adjacent line segments come to overlap. To avoid this overlap, the conversion table of FIG. 5(d) is processed so that an overlapping line segment is deleted or shifted by one dot.
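
One possible reading of this overlap handling, for the "delete" variant only, is sketched below. It assumes the overlap is detected between consecutive runs on the same scan line of the conversion table, which the patent does not state explicitly; `drop_overlapping_runs` is a hypothetical name.

```python
def drop_overlapping_runs(runs):
    """Delete any run that overlaps the previously kept run on the same line."""
    kept = []
    for start, end in runs:
        if kept and start <= kept[-1][1]:
            # After reduction this run collides with the previous one.
            # This sketch deletes it; the text also allows shifting it
            # by one dot instead.
            continue
        kept.append((start, end))
    return kept
```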

Next, the processor 1 uses this conversion table to create a character. That is, as shown in FIG. 5(e), if w ≥ h for example, a character 48 dots wide and h dots high is created and stored in the image memory 5.

This character is then scanned vertically to create a table of the start and end positions of the black pixels, and each pixel position in this table is multiplied by p. The result is a conversion table from which a character 48 dots wide and H = h x p dots high is created, as shown in FIG. 5(f).

The conversion processing for identical data is then performed as before. From this conversion table, the normalization unit 4 is controlled to create a normalized image as shown in FIG. 5(f), which is stored in the normalized image area of the image memory 5.

Next, as shown in FIG. 4, the processor 1 obtains the smaller of the character's height and width from H = p x h or W = p x w. It then controls the feature extraction unit 6 so that feature extraction is performed within a range of 48 x H or 48 x W, as shown in FIG. 4. That is, for example, the character range 48 dots wide and H dots high is scanned and its features extracted. Alternatively, for a character that was only moved to the normalized image area in FIG. 4, the character range w dots wide and h dots high is scanned and its features extracted.
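
To make the saving concrete, the scanned region and its share of the full 48 x 48 frame can be computed as below. The function names and the rounding of the reduced dimension to whole dots are assumptions for illustration.

```python
def scan_region(w: int, h: int, frame: int = 48):
    """Return (width, height) in dots of the region that is actually scanned."""
    if w <= frame and h <= frame:
        return (w, h)                   # moved only: scan the character as-is
    p = frame / max(w, h)               # reduction ratio p = frame / k
    if w >= h:
        return (frame, round(p * h))    # frame x H with H = p * h
    return (round(p * w), frame)        # W x frame with W = p * w


def scanned_fraction(w: int, h: int, frame: int = 48) -> float:
    """Share of the full frame area that still has to be scanned."""
    rw, rh = scan_region(w, h, frame)
    return (rw * rh) / (frame * frame)
```

A character 24 dots wide and 48 dots high, for instance, needs only half of the frame to be scanned.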

In this case, therefore, the blank areas shown in FIG. 2(a) and (b) are not scanned.

The processor 1 stores the features extracted by the feature extraction unit 6 in the image memory 5. The dictionary 8 stores features extracted from characters inscribed in a 48 x 48-dot square frame, so to allow collation with it the processor performs a position correction, such as multiplying by 48/H or 48/W, as shown in FIG. 4. It also corrects the features so that the same character yields the same features.
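
The text gives only "multiplying by 48/H or 48/W" for the position correction, so the following is a sketch under strong assumptions: each feature is modelled as a hypothetical `(x, y, value)` tuple whose coordinates lie in the scanned region, and the correction stretches those coordinates back onto the full frame. The real 432-dimensional feature layout is not described in the source.

```python
def correct_positions(features, scanned_w, scanned_h, frame=48):
    """Rescale feature positions from the scanned region onto the full frame."""
    sx = frame / scanned_w   # 48 / W; equals 1.0 when the width fills the frame
    sy = frame / scanned_h   # 48 / H; equals 1.0 when the height fills the frame
    # Only the positions change; the feature values are left untouched.
    return [(x * sx, y * sy, value) for x, y, value in features]
```

Because this is a pair of multiplications per feature, it costs far less than scanning the blank area would.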

The processor then controls the collation unit 7 to collate these features with the features in the dictionary 8, select the candidate characters, and store them in the image memory 5. It checks whether all the characters on the page have been processed. If not, it returns to the routine that cuts out the next character. If so, it sends the result to the host device through the interface unit 9.

It then checks whether there is a next page. If there is, it returns to the routine that controls the scanner 10 to read one page of character strings from the original. If there is none, the operation ends.

FIG. 6 is a diagram explaining the invention in detail.

Compared with FIG. 9, the time required for normalization is shorter, because the hatched part that corresponded to enlargement now requires only a move. The time required for feature extraction now falls as the character gets smaller, because the scanning range needed for feature extraction becomes smaller.

The feature correction and position correction take longer than in FIG. 9 by the time needed for position correction. For the total time, the hatched range in the figure is therefore the gain.

When a cut-out character is reduced, the blank part shown in FIG. 2 need not be scanned, so the time required for feature extraction is also reduced. This saving cannot be determined accurately, however, so it is not shown in FIG. 6.

[Effect of the invention]

As explained above, the present invention can shorten the feature extraction time and can therefore raise the character recognition processing efficiency of the character recognition device.

[Brief explanation of the drawing]

FIG. 1 is a process flowchart showing the configuration of the present invention. FIG. 2 is a diagram explaining an example of feature extraction. FIG. 3 is a block diagram of a circuit showing one embodiment of the present invention. FIG. 4 is a flowchart explaining the operation of FIG. 3. FIG. 5 is a diagram explaining the operation of FIG. 3. FIG. 6 is a diagram explaining the invention in detail. FIG. 7 is a block diagram explaining the conventional technique. FIG. 8 is a diagram explaining the operation of FIG. 7. FIG. 9 is a diagram explaining the time required by the normalization unit and the feature extraction unit. FIG. 10 is a diagram explaining the distribution of character sizes.

In the figures, 1 is the processor, 2 and 12 are program memories, 3 is the character cutout unit, 4 is the normalization unit, 5 is the image memory, 6 is the feature extraction unit, 7 is the collation unit, 8 is the dictionary, 9 is the interface unit, 10 is the scanner, and 11 to 19 are processing steps.

Claims

A character recognition method for the processing of a character recognition device that cuts out one character from an unknown character string, scans the area of the cut-out character to extract features, collates them with features read from a dictionary that stores features obtained from characters inscribed in a square of a predetermined size, selects the characters in the dictionary with a small dissimilarity as candidate characters, and sends out the selected candidate characters as the recognition result, the method being characterized by: comparing the vertical and horizontal sizes of the cut-out character with the size of the square (11); if the vertical and/or horizontal size of the character is larger than the square, finding the reduction ratio at which the larger of the character's vertical and horizontal dimensions is inscribed in the square (12), reducing the character by that reduction ratio (13), scanning the area of the reduced character to extract features (14), and applying a position correction to the features extracted from the reduced character according to the area of the square left blank by the reduced character (15); if the vertical and horizontal sizes of the cut-out character are smaller than the square, scanning the area of the character to extract features (16), and applying a position correction to the features extracted from the character according to the area of the square left blank by the character (17); and collating the position-corrected features with the features read from the dictionary (18) to select candidate characters (19).