JPH07282194A

JPH07282194A - Character recognizing device

Info

Publication number: JPH07282194A
Application number: JP6077429A
Authority: JP
Inventors: Makoto Kushima; 真久島; Koichi Higuchi; 浩一樋口
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 1994-04-15
Filing date: 1994-04-15
Publication date: 1995-10-27

Abstract

PURPOSE: To provide a character recognition device that, even when a character pattern is faint or crushed, locally shapes only the part showing a feature effective for distinguishing it from other characters, by a method suited to the character type, and thereby performs highly accurate identification in a short processing time. CONSTITUTION: For a character pattern judged to need shaping among the candidate characters output from a first identification unit 32, a local shaping unit 332 shapes the pattern on the basis of the corresponding shaping information in a shaping information table 333; a second feature extraction unit 335 then extracts features again, and a second identification unit 34 performs detailed identification. Highly accurate identification is therefore possible even for faint or crushed character patterns. In addition, because an input unit 334 is provided through which the shaping information in the shaping information table can be specified, flexible and more accurate identification becomes possible.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character recognition device that recognizes characters on the basis of image data of a document or the like written on an object to be read.

[0002]

[Prior Art] Conventionally, as described in references a: "Japanese Patent Publication No. 60-38756" and b: "Japanese Patent Publication No. 62-62393", character recognition devices such as OCR devices optically read a document or the like written on an object to be read, binarize the character portion, and store it in a pattern register. A feature quantity representing character line length, obtained for example from the stroke components in each direction, is normalized by the character size to create a feature matrix, which is matched against a dictionary created in advance to perform a first identification. When two or more similar characters are obtained as candidates in this matching, a second identification, different from the first, is performed using feature quantities that clearly express the differences between those similar characters, and the appropriate character is selected from the candidates and output.

[0003]

[Problems to Be Solved by the Invention] However, in such a character recognition device, when the character pattern is faint or crushed for some reason, accurate detailed features of the character pattern cannot be obtained in the second identification, which lowers the recognition rate. For example, even if one tries to trace the center line or contour line of some part of a character pattern to obtain its curvature or the like, it is difficult to obtain a value for that part if the center line or contour line is missing partway. Likewise, even if one tries to count the loops in a character pattern, the true number of loops cannot be obtained when faintness has created a loop or crushing has eliminated one.
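To make the loop-count problem concrete, the following minimal Python sketch counts the loops in a small binary pattern by flood-filling the background. The function name `count_loops`, the `#`/`.` pixel notation, and the three toy patterns are illustrative assumptions, not part of the disclosure; they only show how a single filled-in or dropped-out pixel changes the count.

```python
from collections import deque


def count_loops(pattern):
    """Count enclosed background regions (loops) in a binary pattern.

    pattern is a list of equal-length strings: '#' is a stroke pixel and
    '.' is background. Background regions are flood-filled with
    4-connectivity; a region that never touches the border is a loop.
    """
    height, width = len(pattern), len(pattern[0])
    seen = [[False] * width for _ in range(height)]
    loops = 0
    for start_y in range(height):
        for start_x in range(width):
            if pattern[start_y][start_x] != "." or seen[start_y][start_x]:
                continue
            touches_border = False
            queue = deque([(start_x, start_y)])
            seen[start_y][start_x] = True
            while queue:
                x, y = queue.popleft()
                if x in (0, width - 1) or y in (0, height - 1):
                    touches_border = True
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nx, ny = x + dx, y + dy
                    if (0 <= nx < width and 0 <= ny < height
                            and pattern[ny][nx] == "." and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if not touches_border:
                loops += 1
    return loops


# Hypothetical toy patterns: one true loop, the loop crushed shut,
# and a dropout in the thick stroke that creates a spurious second loop.
INTACT = [".......", ".#####.", ".###.#.", ".#####.", "......."]
CRUSHED = [".......", ".#####.", ".#####.", ".#####.", "......."]
FAINT = [".......", ".#####.", ".#.#.#.", ".#####.", "......."]

print(count_loops(INTACT), count_loops(CRUSHED), count_loops(FAINT))
```

In this toy case the same character yields three different loop counts, which is why a second identification that relies on such topological features becomes unreliable on degraded patterns.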

[0004] Conventionally, various countermeasures have been taken for such cases, including uniform noise removal applied to the image data obtained by the reading unit, uniform shaping applied to the character pattern before feature extraction, and algorithms for obtaining detailed features that are designed in advance with the effects of faintness and crushing in mind. However, uniform noise removal and shaping of the image data require a great deal of processing time. Moreover, identification algorithms that anticipate faintness and crushing are not only highly redundant but also cannot respond flexibly to faintness and crushing that vary irregularly with the photoelectric conversion characteristics, the paper quality, and the state of the printing or handwriting.

[0005] Accordingly, an object of the present invention is to provide a character recognition device that, even when a character pattern is faint or crushed, locally shapes only the part that shows features effective for distinguishing the character from other characters, by a method suited to the character type, and can thereby perform highly accurate identification in a short processing time.

[0006]

[Means for Solving the Problems] To solve the above problems, the present invention provides a character recognition device comprising a reading unit for obtaining image data of an object to be read and an identification unit that extracts features of a character pattern from the image data and identifies the character, wherein the identification unit comprises: a first feature extraction unit that extracts features effective for identification from the character pattern; a first identification unit that outputs one or more candidate characters on the basis of those features; a shaping unit that includes a local shaping unit that shapes a specific part of the character pattern, a shaping information table that holds, for each character code, shaping information consisting of flag information indicating whether shaping is needed, information on the part to be shaped, and information on the shaping method, an input unit through which the shaping information can be specified, and a second feature extraction unit that extracts features from the shaped character pattern; and a second identification unit that selects and outputs the most suitable character from among the candidate characters or the candidate characters after pattern shaping.

[0007]

[Operation] With the above configuration, according to the present invention, a character pattern for which shaping is judged necessary among the candidate characters output from the first identification unit is shaped by the local shaping unit on the basis of the corresponding shaping information in the shaping information table; features are then extracted again by the second feature extraction unit, and detailed identification is performed by the second identification unit. Faint or crushed character patterns can therefore also be identified with high accuracy. In addition, because the shaping unit includes an input unit through which the shaping information in the shaping information table can be specified, the flag information indicating whether shaping is needed, the part to be shaped, and the shaping method can be specified freely for each character according to the photoelectric conversion characteristics, the paper quality, and the state of the printing or handwriting, which makes flexible and more accurate identification possible. The above problems can thus be solved.

[0008]

[Embodiment] An embodiment of the character recognition device of the present invention is described below with reference to the drawings. The drawings are shown only schematically, to the extent needed for the invention to be understood, and the shape, arrangement, and connections of the components are therefore not limited to the illustrated examples. FIG. 1 is a block diagram showing the configuration of the character recognition device of the present invention. This character recognition device 1 is, for example, an OCR device that reads and recognizes characters written on an object to be read so that input to a computer or the like can be carried out quickly.

[0009] The character recognition device 1 consists mainly of a reading unit 2 for obtaining image data from the object to be read and an identification unit 3 that extracts character features from the obtained image data and identifies the character. The reading unit 2 consists of a photoelectric conversion unit 21, which captures light reflected from the characters and converts it into an electric signal to obtain, for example, binarized image data, and an image memory 22, which stores the binarized image data. The identification unit 3 consists of a first feature extraction unit 31, which extracts the features needed for identification from the image data; a first identification unit 32, which obtains, for example, a similarity from the extracted features and outputs candidate characters; a shaping unit 33, which shapes the pattern of a specific part according to the type of candidate character; and a second identification unit 34, which examines, for example, the local shape and topological features of the center line or contour line of the pattern, selects the appropriate character from the candidates, and outputs the result. The shaping unit 33 consists of a local shaping unit 332, which shapes (for example, expands or contracts) the pattern of the part that is effective for identification according to the type of candidate character; a shaping information table 333, which holds, for each character type, information on the part to be shaped and the shaping method; an input unit 334, through which the part and method are specified; a second feature extraction unit 335, which extracts features from the shaped pattern; and a control unit 331, which judges from the candidate characters and the shaping information table 333 whether the pattern needs to be shaped.

[0010] To recognize characters written on an object to be read using this character recognition device 1, the photoelectric conversion unit 21 converts light reflected from the characters written on the object into an electric signal, and the binarized image data is stored in the image memory 22. The first feature extraction unit 31 then extracts the features of the character pattern from this image data, and the first identification unit 32 obtains a similarity or the like by, for example, pattern matching or stroke analysis and determines one or more candidate characters.
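As a rough illustration of the first identification step, the sketch below ranks dictionary entries by similarity to an extracted feature vector and returns the top candidates. The use of cosine similarity, the four-element feature vectors, and the names `first_identification` and `DICTIONARY` are assumptions made for this example; the text only says that a similarity is obtained by, for example, pattern matching or stroke analysis.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def first_identification(features, dictionary, max_candidates=3):
    """Return up to max_candidates (character, similarity) pairs,
    ordered from the highest similarity downward."""
    scored = [(ch, cosine_similarity(features, ref))
              for ch, ref in dictionary.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:max_candidates]


# Hypothetical dictionary of reference feature vectors.
DICTIONARY = {
    "c": [4.0, 1.0, 3.0, 0.0],
    "e": [4.0, 3.0, 3.0, 1.0],
    "l": [0.0, 6.0, 0.0, 0.0],
}

candidates = first_identification([4.0, 2.0, 3.0, 0.0], DICTIONARY,
                                  max_candidates=2)
print([ch for ch, _ in candidates])
```

With these made-up vectors, "c" and "e" score almost equally and both survive as candidates, which is the situation the later shaping and second identification stages are meant to resolve.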

[0011] FIG. 2 illustrates how a pattern is shaped according to the candidate characters obtained by the first identification unit 32, and shows the candidate characters and the format of the shaping information table 333. The shaping information table 333 consists of shaping information records 51, one per character code. Each record consists of a character code 52, a shaping flag 53 indicating whether to shape, part information 54 indicating the part to be shaped, and method information 55 indicating the shaping method, for example expansion or contraction. The shaping flag 53 is set, for example, to 1 if shaping is needed and to 0 if it is not. The candidate characters 4 are output from the first identification unit 32 arranged, for example, in descending order of similarity. The control unit 331 searches the shaping information records 51 in the shaping information table 333 using, for example, the character code of each candidate character 4 as the key, and checks the value of the shaping flag 53 in the matching record. If the shaping flag 53 in the shaping information table 333 is 0 for every character searched in this way, that is, if no pattern needs to be shaped, processing moves on to the second identification unit 34.
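The table lookup performed by the control unit 331 can be sketched as follows. This is a minimal illustration under stated assumptions: the `ShapingRecord` class, its field names, the dictionary keyed by character, and the sample values are all hypothetical stand-ins for the record 51 and its fields 52 to 55.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ShapingRecord:
    char_code: str                                     # character code 52
    shaping_flag: int                                  # shaping flag 53 (1 = shape, 0 = do not)
    part: Optional[Tuple[int, int, int, int]] = None   # part information 54
    method: Tuple[Tuple[str, int], ...] = ()           # method information 55


def records_needing_shaping(candidates, table):
    """Look up each candidate by character code and keep the records
    whose shaping flag is 1. An empty result means processing can go
    straight to the second identification."""
    selected = []
    for char_code in candidates:
        record = table.get(char_code)
        if record is not None and record.shaping_flag == 1:
            selected.append(record)
    return selected


# Hypothetical table contents.
TABLE = {
    "c": ShapingRecord("c", 0),
    "e": ShapingRecord("e", 1, part=(2, 5, 9, 8),
                       method=(("expand", 1), ("contract", 1))),
}

to_shape = records_needing_shaping(["c", "e"], TABLE)
print([record.char_code for record in to_shape])
```

Keying the table by character code keeps the lookup constant-time per candidate, so the check adds little cost when no candidate needs shaping.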

[0012] If, on the other hand, any shaping flag 53 is 1, that is, if a pattern needs to be shaped, the local shaping unit 332 shapes the specific part indicated by the part information 54 by the method indicated by the method information 55, such as expansion or contraction. The local shaping unit 332 may hold an image memory for pattern shaping. FIG. 3(a) shows an example of a character pattern "e" that requires shaping; it consists of the character pattern 61 before shaping, the part to be shaped 62, the upper-left point (x1, y1) 63 and lower-right point (x2, y2) 64 of that part, where x and y are the horizontal and vertical coordinates, and the pattern 65 after shaping. FIG. 3(b) is an example of a shaping information record for pattern shaping such as that of FIG. 3(a); it consists of a character code 71, a shaping flag 72, the coordinates 73 of the upper-left point of the part to be shaped, the coordinates 74 of the lower-right point, a code 75 indicating expansion processing, the number of expansion passes 76, a code 77 indicating contraction processing, and the number of contraction passes 78.
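The record layout of FIG. 3(b) can be illustrated by parsing a flat, whitespace-separated line into named fields. The text serialization, the one-letter operation codes "D" (expansion) and "E" (contraction), the coordinate values, and the names `Fig3bRecord` and `parse_record` are assumptions for this sketch; the disclosure does not define a concrete encoding.

```python
from typing import NamedTuple, Tuple


class Fig3bRecord(NamedTuple):
    char_code: str                       # character code 71
    shaping_flag: int                    # shaping flag 72
    upper_left: Tuple[int, int]          # coordinates 73
    lower_right: Tuple[int, int]         # coordinates 74
    steps: Tuple[Tuple[str, int], ...]   # (code, passes): 75/76 and 77/78


def parse_record(line):
    """Parse 'code flag x1 y1 x2 y2 op passes [op passes ...]'."""
    parts = line.split()
    if len(parts) < 6:
        raise ValueError("record needs a code, a flag and two points")
    char_code, flag = parts[0], int(parts[1])
    x1, y1, x2, y2 = (int(p) for p in parts[2:6])
    if x1 > x2 or y1 > y2:
        raise ValueError("upper-left point must not exceed lower-right point")
    rest = parts[6:]
    if len(rest) % 2:
        raise ValueError("each operation code needs a pass count")
    steps = tuple((rest[i], int(rest[i + 1])) for i in range(0, len(rest), 2))
    return Fig3bRecord(char_code, flag, (x1, y1), (x2, y2), steps)


# Hypothetical record for "e": one expansion pass, then one contraction pass.
record = parse_record("e 1 2 5 9 8 D 1 E 1")
print(record)
```

Storing the operations as an ordered sequence of (code, passes) pairs leaves room for other step orders or additional shaping methods without changing the record shape.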

[0013] For example, if for some reason faintness such as that in pattern 61 of FIG. 3(a) is present, the input unit 334 is used to specify, in the shaping information record that stores the information for "e" in the shaping information table, the part to be shaped (for example, the coordinates 73 of its upper-left point and the coordinates 74 of its lower-right point), a code 75 indicating expansion processing and its count 76, and a code 77 indicating contraction processing and its count 78. The local shaping unit 332 processes the pattern according to this shaping information record and generates the pattern 65. Here, expansion processing is, for example, processing in which the character portion is represented by 1 and the background by 0, the target region is scanned in order from the upper left, and, whenever the pixel of interest is 1, all of its 8 neighboring points are set to 1. Contraction processing is processing in which, whenever the pixel of interest is 0, all of its 8 neighboring points are set to 0. The number of expansion or contraction passes is the number of times the target region is scanned while the respective processing is applied.
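The expansion and contraction passes described above correspond to what image-processing texts call dilation and erosion with a 3x3 neighborhood. The sketch below applies them inside a rectangular region only. Two points are assumptions of this example rather than statements from the text: each pass reads from a snapshot of the pattern taken at the start of the pass (so a pixel set during a pass does not trigger further changes in the same pass), and both the pixels examined and the pixels written are confined to the region. The names `expand`, `contract`, and the toy pattern are hypothetical.

```python
def _scan(grid, region, value):
    """One pass over region (x1, y1, x2, y2), inclusive: wherever the
    snapshot holds `value`, write `value` to the 8 neighbors that lie
    inside the region."""
    x1, y1, x2, y2 = region
    source = [row[:] for row in grid]   # snapshot read during this pass
    result = [row[:] for row in grid]   # output; the input is not modified
    for y in range(y1, y2 + 1):
        for x in range(x1, x2 + 1):
            if source[y][x] != value:
                continue
            for ny in range(max(y - 1, y1), min(y + 1, y2) + 1):
                for nx in range(max(x - 1, x1), min(x + 1, x2) + 1):
                    result[ny][nx] = value
    return result


def expand(grid, region, passes=1):
    """Expansion: stroke pixels (1) spread to their 8 neighbors."""
    for _ in range(passes):
        grid = _scan(grid, region, 1)
    return grid


def contract(grid, region, passes=1):
    """Contraction: background pixels (0) spread to their 8 neighbors."""
    for _ in range(passes):
        grid = _scan(grid, region, 0)
    return grid


# Hypothetical 9x7 pattern: a horizontal stroke with a one-pixel break.
pattern = [[0] * 9 for _ in range(7)]
pattern[3] = [1, 1, 1, 1, 0, 1, 1, 1, 1]
region = (1, 1, 7, 5)

repaired = contract(expand(pattern, region, passes=1), region, passes=1)
for row in repaired:
    print("".join("#" if v else "." for v in row))
```

One expansion pass bridges the break and one contraction pass restores the original stroke width, so the break is closed without thickening the rest of the stroke. Because only the region is scanned, the cost depends on the size of the part to be shaped rather than on the whole pattern.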

[0014] The search of the shaping information table 333 described above may be performed for all candidate characters or for only some of them. For example, one or more characters whose similarity exceeds a predetermined threshold may be selected and looked up in the shaping information table, or a predetermined number of characters may be looked up starting from the highest similarity. Alternatively, the shaping information table may be searched only when the candidate characters include a pair of characters that may be mistaken for each other, such as "c" and "e", and only for those characters. The part to be shaped may also be specified by a method other than the coordinates of its upper-left and lower-right points. The local shaping unit 332 may apply shaping such as expansion or contraction only in a specific direction within the pattern; in that case the method information 55 includes information on the processing direction. The local shaping unit 332 may also perform pattern shaping other than the methods described above.
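The three ways of narrowing the table search can be sketched as small selection functions. The function names, the `CONFUSABLE_PAIRS` set, and the similarity values are hypothetical; the sketch assumes candidates arrive as (character, similarity) pairs.

```python
# Hypothetical set of character pairs that are easily mistaken for each other.
CONFUSABLE_PAIRS = {frozenset(("c", "e"))}


def select_by_threshold(candidates, threshold):
    """Keep characters whose similarity exceeds the threshold."""
    return [ch for ch, similarity in candidates if similarity > threshold]


def select_top_n(candidates, n):
    """Keep the n characters with the highest similarity."""
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [ch for ch, _ in ranked[:n]]


def select_confusable(candidates, pairs=CONFUSABLE_PAIRS):
    """Keep only characters that belong to a confusable pair whose
    members are both present among the candidates."""
    present = {ch for ch, _ in candidates}
    return [ch for ch, _ in candidates
            if any(ch in pair and pair <= present for pair in pairs)]


candidates = [("c", 0.98), ("e", 0.97), ("o", 0.80)]
print(select_by_threshold(candidates, 0.9))
print(select_top_n(candidates, 1))
print(select_confusable(candidates))
```

Each function returns the characters to look up in the shaping information table, so any of them could be placed in front of the flag check without changing the rest of the flow.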

[0015] When shaping of the patterns for the one or more candidate characters is complete, the second feature extraction unit 335 extracts features such as the shape of the center line or contour line of the pattern, the line width, or the topology. The second identification unit 34 uses a dictionary prepared in advance, for example for standard character shapes, in the same format as the features obtained by the first feature extraction unit 31 or the second feature extraction unit 335, and matches those features against the dictionary for the candidate characters output from the first identification unit 32.
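A minimal sketch of the second identification follows, assuming that each candidate has its own extracted feature set (taken from the pattern as shaped for that candidate, or from the unshaped pattern if no shaping was needed) and that the dictionary stores the same features for a standard shape. The feature names `loops` and `end_points`, their values, the sum-of-absolute-differences distance, and the function name are all illustrative assumptions.

```python
# Hypothetical dictionary of features for standard character shapes.
STANDARD_FEATURES = {
    "c": {"loops": 0, "end_points": 2},
    "e": {"loops": 1, "end_points": 1},
}


def second_identification(candidates, features_by_candidate, dictionary):
    """Return the candidate whose extracted features are closest to its
    dictionary entry. Ties go to the earlier candidate, that is, the one
    ranked higher by the first identification."""
    best, best_distance = None, None
    for ch in candidates:
        reference = dictionary.get(ch)
        if reference is None:
            continue
        observed = features_by_candidate[ch]
        distance = sum(abs(observed[name] - reference[name])
                       for name in reference)
        if best_distance is None or distance < best_distance:
            best, best_distance = ch, distance
    return best


# Features of a faint "e": measured as-is for "c" (no shaping), and
# after local shaping for "e", which restores its loop.
features = {
    "c": {"loops": 0, "end_points": 3},
    "e": {"loops": 1, "end_points": 1},
}
print(second_identification(["c", "e"], features, STANDARD_FEATURES))
```

In this made-up case the first identification ranked "c" above "e", but after shaping the features match the "e" entry exactly, so the second identification selects "e".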

[0016] The input unit 334 can specify the contents of the shaping flag 53, the part information 54, and the method information 55 in the shaping information table 333 according to the characteristics of the photoelectric conversion unit 21, the paper quality, the state of the printing or handwriting, and so on. For example, the number of expansion passes 76 or the number of contraction passes 78 may be increased or decreased, or the part to be shaped may be changed, according to the degree of faintness or crushing of the character pattern.
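How an operator-facing input step might adjust the table can be sketched as below. The dictionary-based table, the field names, and the function `specify_shaping_info` are hypothetical; the sketch returns an updated copy so that the previous settings remain available for comparison.

```python
import copy

# Hypothetical field names for one shaping information record.
ALLOWED_FIELDS = {"flag", "part", "expansion_passes", "contraction_passes"}


def specify_shaping_info(table, char_code, **changes):
    """Return a copy of the table with the given fields of one record
    changed. A record is created with shaping disabled if none exists."""
    unknown = set(changes) - ALLOWED_FIELDS
    if unknown:
        raise KeyError(f"unknown shaping fields: {sorted(unknown)}")
    updated = copy.deepcopy(table)
    record = updated.setdefault(char_code, {
        "flag": 0, "part": None,
        "expansion_passes": 0, "contraction_passes": 0,
    })
    record.update(changes)
    return updated


table = {"e": {"flag": 1, "part": (2, 5, 9, 8),
               "expansion_passes": 1, "contraction_passes": 1}}

# Heavier faintness: use two passes of each operation for "e".
heavier = specify_shaping_info(table, "e",
                               expansion_passes=2, contraction_passes=2)
print(heavier["e"])
```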

[0017]

[Effects of the Invention] As described above, the character recognition device of the present invention comprises a reading unit for obtaining image data of an object to be read and an identification unit that extracts features of a character pattern from the image data and identifies the character, wherein the identification unit comprises: a first feature extraction unit that extracts features effective for identification from the character pattern; a first identification unit that outputs one or more candidate characters on the basis of those features; a shaping unit that includes a local shaping unit that shapes a specific part of the character pattern, a shaping information table that holds, for each character code, shaping information consisting of flag information indicating whether shaping is needed, information on the part to be shaped, and information on the shaping method, an input unit through which the shaping information can be specified, and a second feature extraction unit that extracts features from the shaped character pattern; and a second identification unit that selects and outputs the most suitable character from among the candidate characters or the candidate characters after pattern shaping. This configuration produces the following effects. Even if the character pattern to be identified is faint or crushed, the pattern can be locally shaped as needed according to the candidate characters and then identified by examining in detail the shape and position of the center line or contour line, topological features, and the like. Furthermore, the part to be shaped, the method, and so on can be specified freely for each character type according to the photoelectric conversion characteristics, the paper quality, the state of the printing or handwriting, and the like. Recognition accuracy can therefore be raised, adapting flexibly to the quality of the character pattern, without greatly increasing processing time.

[Brief description of drawings]

FIG. 1 is a block diagram showing the configuration of the character recognition device of the present invention.

FIG. 2 is an explanatory diagram of the shaping information table.

FIG. 3 is an explanatory diagram of local shaping of a character pattern.

[Explanation of symbols]

1 character recognition device
2 reading unit
3 identification unit
21 photoelectric conversion unit
22 image memory
31 first feature extraction unit
32 first identification unit
33 shaping unit
34 second identification unit
331 control unit
332 local shaping unit
333 shaping information table
334 input unit
335 second feature extraction unit

Claims

[Claims]

1. A character recognition device comprising a reading unit for obtaining image data of an object to be read and an identification unit that extracts features of a character pattern from the image data and identifies the character, wherein the identification unit comprises: a first feature extraction unit that extracts features effective for identification from the character pattern; a first identification unit that outputs one or more candidate characters on the basis of those features; a shaping unit that includes a local shaping unit that shapes a specific part of the character pattern, a shaping information table that holds, for each character code, shaping information consisting of flag information indicating whether shaping is needed, information on the part to be shaped, and information on the shaping method, an input unit through which the shaping information can be specified, and a second feature extraction unit that extracts features from the shaped character pattern; and a second identification unit that selects and outputs the most suitable character from among the candidate characters or the candidate characters after pattern shaping.