JPH0830717A

JPH0830717A - Character recognition method and device therefor

Info

Publication number: JPH0830717A
Application number: JP6167754A
Authority: JP
Inventors: Kazuhiro Matsubayashi; 一弘松林; Shinichi Sunakawa; 伸一砂川
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1994-07-20
Filing date: 1994-07-20
Publication date: 1996-02-02

Abstract

PURPOSE: To provide an inexpensive character recognition method and device with a high character recognition rate. CONSTITUTION: An image input part 1 reads characters and stores them in an image storage part 4. A character recognition part 8 extracts features from the input image data, compares them with the features stored in a character recognition dictionary part 7, outputs the matching character code, and stores it in a character code storage part 9. The character font stored in a character font storage part 6 for that character code is read out, and the character is displayed on a display part 3. When an operator corrects a displayed character, the coordinates of the character written on the screen are successively input by a coordinate input part 2 and stored in a coordinate storage part 5. The character recognition part 8 then extracts features from both the input image data and the coordinate data, compares them with the features stored in the character recognition dictionary part 7, and outputs the matching character code. The character font stored in the character font storage part 6 for that character code is displayed on the display part 3.

Description

Detailed Description of the Invention

[0001]

[Industrial field of application] The present invention relates to a method and an apparatus for recognizing characters in an image and converting them into character codes.

[0002]

[Prior art] A character recognition apparatus typically comprises an image input unit that inputs an image, an OCR character recognition unit that outputs character codes based on the image information, a character output unit that outputs the recognized characters as fonts, and a character correction unit with which the operator selects an erroneously recognized character and inputs the correct one.

[0003] Some character correction units let the operator write a character with a stylus pen on a coordinate input surface and obtain its character code by online handwritten character recognition.

[0004]

[Problems to be solved by the invention] However, in online character recognition used for correction, the corrected character is itself likely to be misrecognized, for example when the operator's handwriting is highly idiosyncratic.

[0005] Moreover, providing several different character recognition units to address this problem greatly increases the dictionary capacity and therefore the cost.

[0006] The present invention has been made in view of the above conventional example, and its object is to provide an inexpensive character recognition method and apparatus with a high character recognition rate.

[Means for solving the problems] To achieve the above object, the image recognition method and apparatus of the present invention have the following configuration. That is, the apparatus comprises: first image input means for inputting image information; first character recognition means for generating character codes based on feature data of the characters contained in the image information; selection means for selecting, from the character codes generated by the first character recognition means, a character code to be corrected; second image input means for inputting a corrected character stroke corresponding to the character code selected by the selection means; and second character recognition means for generating a character code based on the feature data corresponding to the selected character code and the corrected character stroke input from the second image input means.

[0007] Another aspect of the invention comprises: a first image input step of inputting image information from first image input means; a first character recognition step of generating character codes based on feature data of the characters contained in the image information; a selection step of selecting, from the character codes generated in the first character recognition step, a character code to be corrected; a second image input step of inputting, from second image input means, a corrected character stroke corresponding to the character code selected in the selection step; and a second character recognition step of generating a character code based on the feature data corresponding to the selected character code and the corrected character stroke input in the second image input step.

[0008]

[Operation] In the above configuration, the first image input means inputs image information; the first character recognition means generates character codes based on feature data of the characters contained in the image information; the selection means selects, from the character codes generated by the first character recognition means, a character code to be corrected; the second image input means inputs a corrected character stroke corresponding to the character code selected by the selection means; and the second character recognition means generates a character code based on the feature data corresponding to the selected character code and the corrected character stroke input from the second image input means.

[0009] In another aspect of the invention, image information is input from first image input means; character codes are generated based on feature data of the characters contained in the image information; a character code to be corrected is selected from the generated character codes; a corrected character stroke corresponding to the selected character code is input from second image input means; and a character code is generated based on the feature data corresponding to the selected character code and the input corrected character stroke.

[Embodiments] First, the basic configuration of the character recognition apparatus of this embodiment is summarized below.

[0010] The character recognition apparatus of this embodiment comprises an image input unit that inputs image information, a first character recognition unit that outputs character codes based on the input image information, a character output unit that outputs the characters as fonts, and a character correction unit that handles selecting an erroneously recognized character and inputting the correct one. It further comprises a coordinate input unit that inputs coordinate information from a pointing device, and a second character recognition unit that outputs a character code based on the image information and the coordinate information.

The second character recognition unit comprises an OCR character recognition unit that outputs a character code based on the image information, and a dictionary selection unit that selects, based on the coordinate information, the part of the dictionary used by the OCR character recognition unit. The OCR character recognition unit has a dictionary classified by stroke count, and the dictionary selection unit has a stroke count calculation unit that derives the stroke count of the character from the coordinate information.

The second character recognition unit further comprises an OCR feature extraction unit that extracts, from the image information, first line information making up the character; an online feature extraction unit that extracts, from the coordinate information, directed second line information making up the character; and a stroke matching unit that gives direction information to the first line information by associating the first and second line information.

The second character recognition unit further comprises an OCR character recognition unit that performs recognition using the character shape information obtained from the image information and outputs one or more character codes together with evaluation information for each character code; an online character recognition unit that performs recognition using the character shape information obtained from the coordinate information and outputs one or more character codes together with evaluation information for each character code; and a determination unit that outputs a character code based on the outputs of the OCR character recognition unit and the online character recognition unit. The character correction unit performs character recognition based on the image information input from the image input unit and the coordinate information input with the pen.

[Embodiment 1] FIG. 2 shows the external configuration of the character recognition apparatus of the embodiment. The image scanner 51 optically reads a document placed on the document table 52 and sends the image data to the main body 54 via the cable 53. The main body 54 performs the first character recognition on the image data and displays the character fonts on the screen 55. If the operator finds that a character displayed on the screen 55 differs from the character in the document, the operator can correct it by writing the character on the screen 55 with the pen 56. The screen 55 consists of a transparent coordinate input device overlaid on a display device; the second character recognition is performed based on the coordinate data of the character written with the pen 56, and the corrected character is displayed on the screen 55 as a character font.

[0011] This embodiment concerns the second character recognition.

[0012] FIG. 1 is a block diagram of the character recognition apparatus of this embodiment.

[0013] The image input unit 1 optically reads an image containing characters and stores it in the image storage unit 4. The character recognition unit 8 extracts feature information from the input image data, compares it with the feature information stored in advance in the character recognition dictionary unit 7, outputs the best-matching character code, and stores it in the character code storage unit 9. The character font stored in advance in the character font storage unit 6 for that character code is then read out, and the character is displayed on the display unit 3.

[0014] When the operator corrects a displayed character, the coordinates of the character written on the screen are input in sequence by the coordinate input unit 2 and stored in the coordinate storage unit 5. The character recognition unit 8 extracts features from both the input image data and the coordinate data, compares them with the feature information stored in advance in the character recognition dictionary unit 7, and outputs the best-matching character as code information. The character font stored in advance in the character font storage unit 6 for that character code is then read out, and the character is displayed on the display unit 3.

[0015] The specific processing of the character recognition unit 8 in the first embodiment is described in more detail with reference to the block diagram of FIG. 3.

[0016] First, the preprocessing unit 11 applies noise removal, binarization, thinning, character segmentation, and similar processing to the image data stored in the image storage unit 4. Next, the OCR feature extraction unit 12 extracts, as the shape information of the character, feature point information such as end points and branch points, and line information connecting those feature points.

[0017] In the first character recognition, this shape information is compared with the feature information stored in the character recognition dictionary unit 7, and the best-matching character is output as code information.

[0018] In the second character recognition, by contrast, the coordinate data stored in the coordinate storage unit 5 is used together with the input image data. As coordinate data, the coordinate value at which the pen first touches the screen is stored, and each time the pen moves a new coordinate value is stored after it. When the pen leaves the screen, a control code called a pen-up code is stored.
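The coordinate data described here can be pictured as a flat sequence of points interrupted by pen-up codes. The following is a minimal Python sketch of splitting such a sequence into strokes. The `PEN_UP` sentinel, the tuple representation of points, and the name `split_strokes` are assumptions made for illustration; the patent does not specify an encoding.

```python
# Hypothetical sentinel standing in for the pen-up control code.
PEN_UP = "PEN_UP"


def split_strokes(stream):
    """Split a stored coordinate stream into strokes at each pen-up code."""
    strokes, current = [], []
    for item in stream:
        if item == PEN_UP:
            if current:
                strokes.append(current)
            current = []
        else:
            current.append(item)
    if current:  # stream ended without a trailing pen-up code
        strokes.append(current)
    return strokes


# Two strokes: one horizontal, one vertical (illustrative values).
stream = [(0, 0), (1, 0), PEN_UP, (0, 1), (0, 2), PEN_UP]
strokes = split_strokes(stream)
```

Each inner list holds the points of one stroke in the order they were written, so the first and last points give the start and end of the stroke.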

[0019] The user writes the strokes of the corrected character with the pen 56 on top of the character to be corrected in the character string displayed on the screen 55.

[0020] Based on the coordinate data of these strokes, the character selection unit 13 first determines the character to be corrected.

[0021] The coordinate system of the screen 55 that displays the characters and that of the coordinate input device placed over it need not coincide. The position of the character to be corrected and the position of the corrected character strokes may differ, and the character to be corrected may instead be selected by a separate application such as an editor.

[0022] When a character is selected, the preprocessing unit 11 reads the image data corresponding to the selected character from the image storage unit 4. The OCR feature extraction unit 12 then compares the shape information of that image data with the feature information in the character recognition dictionary unit 7. That feature information is classified by stroke count, and the comparison is performed only against the feature information for characters with the specified stroke count. The stroke count calculation unit 14 obtains the stroke count of the character by counting the pen-up codes per character in the coordinate data, and specifies the stroke count to the character recognition dictionary unit 7.
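The stroke count filtering of this paragraph can be sketched as follows. This is an illustrative Python sketch, not the patented implementation: the `PEN_UP` sentinel, the dictionary layout (character mapped to stroke count), and the function names are assumptions, and a real dictionary would hold feature data rather than bare characters.

```python
from collections import defaultdict

# Hypothetical sentinel standing in for the pen-up control code.
PEN_UP = "PEN_UP"


def count_strokes(samples):
    """Stroke count = number of pen-up codes in one character's samples."""
    return sum(1 for s in samples if s == PEN_UP)


def build_stroke_index(dictionary):
    """Group dictionary characters by their stroke count."""
    index = defaultdict(list)
    for char, stroke_count in dictionary.items():
        index[stroke_count].append(char)
    return index


def candidates_for(samples, index):
    """Keep only characters whose stroke count matches the pen input."""
    return index.get(count_strokes(samples), [])


# Toy dictionary using the stroke counts given in the text.
DICTIONARY = {"官": 8, "宮": 10, "千": 3, "干": 3}
INDEX = build_stroke_index(DICTIONARY)

# Pen input with three strokes (coordinates are illustrative).
SAMPLES = [(3, 0), (0, 1), PEN_UP,
           (0, 2), (4, 2), PEN_UP,
           (2, 1), (2, 5), PEN_UP]
result = candidates_for(SAMPLES, INDEX)
```

With three pen-up codes, only the three-stroke entries 千 and 干 remain as candidates, so 官 and 宮 are never compared.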

[0023] With this embodiment, characters that look similar but have a different stroke count can be excluded from the candidates, which raises the likelihood that the correct character is recognized. For example, it is highly effective at distinguishing characters such as 官 (8 strokes) and 宮 (10 strokes).

[0024] Because the stroke count limits the amount of data to be compared, the processing also runs faster.

[0025] In addition, this embodiment performs recognition with the OCR character recognition unit alone, without a separate online character recognition unit, which saves dictionary capacity.

[Embodiment 2] The specific processing of the character recognition unit 8 in the second embodiment is described in detail with reference to the block diagram of FIG. 4.

[0026] The processing of the preprocessing unit 11, the OCR feature extraction unit 12, and the character selection unit 13 is the same as in the first embodiment.

[0027] The online feature extraction unit 21 extracts feature point information such as end points and branch points, and line information connecting those feature points, from the coordinate data stored in the coordinate storage unit 5.

[0028] The stroke matching unit 22 matches the feature point information and line information obtained from the OCR feature extraction unit 12 against those obtained from the online feature extraction unit 21 to find the corresponding feature points and lines. The line information obtained from the OCR feature extraction unit 12 carries no line direction, but the direction can be taken from the corresponding line information obtained from the online feature extraction unit 21.
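One way to picture how a scanned line acquires a direction from its pen counterpart is sketched below in Python. It assumes the correspondence between a scanned segment and a pen stroke has already been found, and it simply orders the two scanned end points so that they follow the pen stroke. The function names and the squared-distance criterion are illustrative assumptions, not details given in the patent.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def orient(ocr_segment, pen_stroke):
    """Order the two end points of an undirected scanned segment so that
    they follow the writing direction of the matching pen stroke."""
    p, q = ocr_segment
    start, end = pen_stroke[0], pen_stroke[-1]
    forward = squared_distance(p, start) + squared_distance(q, end)
    backward = squared_distance(q, start) + squared_distance(p, end)
    return (p, q) if forward <= backward else (q, p)


# Scanned segment listed left to right; the pen wrote it right to left.
scanned = ((0.3, 0.6), (3.3, 0.4))
pen = [(3.2, 0.3), (1.5, 0.5), (0.2, 0.7)]
directed = orient(scanned, pen)
```

The result keeps the end point positions from the scanned image and takes only the ordering from the pen data, which mirrors the division of roles described in the paragraph.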

[0029] In this way, the shape information of the lines obtained from the OCR feature extraction unit 12 and the direction information of the lines obtained from the online feature extraction unit 21 are compared with the feature information stored in the character recognition dictionary unit 7, and the best-matching character is output as code information.

[0030] A conventional OCR character recognition dictionary contains no line direction information. In this embodiment, when line information is stored in the character recognition dictionary unit 7, the direction is recorded through the order of the storage addresses: the coordinate value of the start point is always stored at a lower address than the coordinate value of the end point.
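The idea of encoding direction purely by storage order can be illustrated with a small Python sketch. A flat list stands in for dictionary memory and list indices stand in for addresses; this layout (four consecutive slots per line) and the function names are assumptions for illustration, since the actual layout is shown only in the figures.

```python
def store_segment(memory, start, end):
    """Append a line to memory with the start point at the lower address.
    Returns the address (index) of the stored line."""
    address = len(memory)
    memory.extend([start[0], start[1], end[0], end[1]])
    return address


def load_segment(memory, address):
    """Read a line back; the point at the lower address is the start."""
    start = (memory[address], memory[address + 1])
    end = (memory[address + 2], memory[address + 3])
    return start, end


def direction(memory, address):
    """Direction vector implied by the storage order."""
    (x0, y0), (x1, y1) = load_segment(memory, address)
    return (x1 - x0, y1 - y0)


memory = []
addr = store_segment(memory, (3, 0), (0, 1))  # written right to left
vec = direction(memory, addr)
```

No extra direction field is needed: swapping the two points in memory is enough to represent the opposite writing direction.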

[0031] FIG. 6 shows examples of character shapes in the dictionary data of the character recognition dictionary unit 7: (a) shows the strokes of the character 千 and (b) those of the character 干. Here, a stroke is a line connecting feature points such as end points and branch points.

[0032] FIG. 9 shows an example of the data structure of the character recognition dictionary unit 7: (a) shows the data structure for the character 千 and (b) that for the character 干.

[0033] Suppose that a character with the shape shown in FIG. 7 is input from the image scanner, and that during pen correction the character is written in the directions shown by the arrows in FIG. 8. Taking the character shape from the scanner data and the stroke directions from the pen data yields the data shown in FIG. 10.

[0034] When the data of FIG. 10 is matched against (a) and (b) of FIG. 9, for example by the least-squares method, (a) matches better once the stroke direction is taken into account. The character can therefore be judged to be 千.
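The effect of stroke direction on a least-squares comparison can be illustrated with a toy Python sketch. The coordinates below are invented for illustration and are not the data of FIGS. 9 and 10; the sketch also assumes the input lines are already paired with the template lines in the same order, and it uses one undivided line per written stroke for simplicity.

```python
def sq(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def segment_cost(a, b, directed=True):
    """Sum of squared end point distances between two lines.
    With directed=False the better of the two orientations is used."""
    (a0, a1), (b0, b1) = a, b
    forward = sq(a0, b0) + sq(a1, b1)
    if directed:
        return forward
    return min(forward, sq(a0, b1) + sq(a1, b0))


def match_cost(segments, template, directed=True):
    """Total cost over lines paired in order."""
    return sum(segment_cost(s, t, directed) for s, t in zip(segments, template))


def best_match(segments, templates, directed=True):
    """Character whose template has the lowest total cost."""
    return min(templates, key=lambda ch: match_cost(segments, templates[ch], directed))


# Hypothetical templates as (start, end) pairs; y grows downward.
TEMPLATES = {
    "千": [((3, 0), (0, 1)), ((0, 2), (4, 2)), ((2, 1), (2, 5))],
    "干": [((0.5, 0.5), (3.5, 0.5)), ((0, 2), (4, 2)), ((2, 0.5), (2, 5))],
}

# Shape from the scanner, direction from the pen: top line runs right to left.
INPUT = [((3.3, 0.4), (0.3, 0.6)), ((0, 2), (4, 2)), ((2, 0.7), (2, 5))]

with_direction = best_match(INPUT, TEMPLATES, directed=True)
without_direction = best_match(INPUT, TEMPLATES, directed=False)
```

In this toy data the nearly horizontal top line is closer in shape to 干, so ignoring direction selects 干, while honoring the right-to-left direction selects 千.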

[0035] In this example, recognition from the scanner image data alone, as in the prior art, would likely misrecognize the character as 干, and recognition from the pen coordinate data alone would likely misrecognize it as the katakana チ. Using this embodiment raises the likelihood of correct recognition.

[0036] As described above, in this embodiment the input data and the dictionary data are regarded as matching only when the line directions also agree, which makes correct recognition more likely than with conventional OCR. For example, it is highly effective for recognizing characters such as 水 and 木.

[Embodiment 3] The specific processing of the character recognition unit 8 in the third embodiment is described in detail with reference to the block diagram of FIG. 5.

[0037] The processing of the preprocessing unit 11, the OCR feature extraction unit 12, the character selection unit 13, and the online feature extraction unit 21 is the same as in the first or second embodiment.

[0038] The OCR dictionary unit 31 is the same kind of dictionary as in the first or second embodiment, but it outputs a plurality of candidate characters with corresponding scores (confidence levels) according to how well the features match.

[0039] This embodiment also provides an online dictionary unit 32. Based on the coordinate data stored in the coordinate storage unit 5, character recognition is performed by a conventionally known online character recognition method (see, for example, Japanese Examined Patent Publication No. Sho 62-39460), and a plurality of candidate characters and scores are output according to how well the features match.

[0040] The determination unit 33 sums the scores of each character output as a candidate by the OCR dictionary unit 31 and the online dictionary unit 32, and outputs the character with the highest total score as the character code information.
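The score summation performed by the determination unit can be sketched in a few lines of Python. The candidate lists and score values below are hypothetical, and the name `decide` is an assumption; the patent states only that the scores are totaled and the highest total wins.

```python
from collections import defaultdict


def decide(ocr_candidates, online_candidates):
    """Sum the scores per character across both recognizers and return
    the character with the highest total. Raises ValueError if both
    candidate lists are empty."""
    totals = defaultdict(float)
    for char, score in ocr_candidates:
        totals[char] += score
    for char, score in online_candidates:
        totals[char] += score
    return max(totals, key=totals.get)


# Hypothetical scores: each recognizer alone prefers a wrong character.
ocr = [("干", 0.60), ("千", 0.50)]
online = [("チ", 0.60), ("千", 0.55)]
winner = decide(ocr, online)
```

Here 千 is the second choice of each recognizer but has the highest combined score, so it is the character output.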

[0041] In this embodiment, the recognition result is produced by combining OCR character recognition and online character recognition, which raises the character recognition rate.

[0042] The present invention may be applied to a system composed of a plurality of devices or to an apparatus consisting of a single device. It goes without saying that the present invention can also be applied where it is achieved by supplying a program to a system or an apparatus.

[0043] As described above, in a character recognition apparatus comprising an image input unit that inputs an image, a first character recognition unit that outputs character codes based on the image information, a character output unit that outputs the characters as fonts, and a character correction unit with which an erroneously recognized character is selected and the correct character is input, providing a coordinate input unit that inputs coordinates from a pointing device and a second character recognition unit that outputs a character code based on the image information and the coordinate information allows OCR character recognition and online character recognition to compensate for each other's weaknesses and achieve a higher recognition rate.

[0044]

[Effects of the invention] As described above, the present invention achieves a high character recognition rate with an inexpensive configuration.

[0045]

[Brief description of drawings]

FIG. 1 is a block diagram of the character recognition apparatus of this embodiment.

FIG. 2 is an external view of the character recognition apparatus of this embodiment.

FIG. 3 is a detailed block diagram of the character recognition unit of the first embodiment.

FIG. 4 is a detailed block diagram of the character recognition unit of the second embodiment.

FIG. 5 is a detailed block diagram of the character recognition unit of the third embodiment.

FIG. 6 is a diagram showing an example of character shapes in the dictionary data of the second embodiment.

FIG. 7 is a diagram showing an example of a character shape input by the image scanner in the second embodiment.

FIG. 8 is a diagram showing an example of character strokes input with the pen in the second embodiment.

FIG. 9 is a diagram showing an example of the data structure of the dictionary in the second embodiment.

FIG. 10 is a diagram showing the data structure of the input data in the second embodiment.

[Explanation of symbols]

1 Image input unit
2 Coordinate input unit
3 Display unit
4 Image storage unit
5 Coordinate storage unit
6 Character font storage unit
7 Character recognition dictionary unit
8 Character recognition unit
9 Character code storage unit

Claims

[Claims]

1. A character recognition device comprising: first image input means for inputting image information; first character recognition means for generating character codes based on feature data of characters included in the image information; selection means for selecting, from the character codes generated by the first character recognition means, a character code to be corrected; second image input means for inputting a corrected character stroke corresponding to the character code selected by the selection means; and second character recognition means for generating a character code based on the feature data corresponding to the selected character code and the corrected character stroke input from the second image input means.

2. The character recognition device according to claim 1, wherein the character feature data is position data of the end points and branch points of a character, and line information connecting the end points and the branch points.

3. The character recognition device according to claim 1, wherein the second image input means generates start position and end position data of each stroke from the input corrected character stroke.

4. The character recognition device according to claim 3, wherein the corrected character stroke input from the second image input means and used by the second character recognition means is the start point and end point position data of each stroke generated by the second image input means.

5. The character recognition device according to claim 1, wherein the first image input means is an optical image scanner.

6. The character recognition device according to claim 1, wherein the second image input means includes a stylus pen and a coordinate input surface.

7. The character recognition device according to claim 1, wherein the second character recognition means searches predetermined feature dictionary data corresponding to the stroke count of the corrected character stroke input from the second image input means for feature dictionary data close to the feature data corresponding to the selected character code, and generates the corresponding character code.

8. A character recognition method comprising: a first image input step of inputting image information from first image input means; a first character recognition step of generating character codes based on feature data of characters included in the image information; a selection step of selecting, from the character codes generated in the first character recognition step, a character code to be corrected; a second image input step of inputting, from second image input means, a corrected character stroke corresponding to the character code selected in the selection step; and a second character recognition step of generating a character code based on the feature data corresponding to the selected character code and the corrected character stroke input in the second image input step.

9. The character recognition method according to claim 8, wherein the character feature data is position data of the end points and branch points of a character, and line information connecting the end points and the branch points.

10. The character recognition method according to claim 8, wherein the second image input step generates start position and end position data of each stroke from the input corrected character stroke.

11. The character recognition method according to claim 10, wherein the corrected character stroke input from the second image input means and used in the second character recognition step is the start point and end point position data of each stroke generated in the second image input step.

12. The character recognition method according to claim 8, wherein the first image input means is an optical image scanner.

13. The character recognition method according to claim 8, wherein the second image input means includes a stylus pen and a coordinate input surface.

14. The character recognition method according to claim 8, wherein, in the second character recognition step, predetermined feature dictionary data corresponding to the stroke count of the corrected character stroke input from the second image input means is searched for feature dictionary data close to the feature data corresponding to the selected character code, and the corresponding character code is generated.