JPH0528302A

JPH0528302A - Character reader

Info

Publication number: JPH0528302A
Application number: JP3178625A
Authority: JP
Inventors: Toshifumi Yamauchi; 俊史山内; Mitsuo Tanaka; 満雄田中; Kunikazu Shigeta; 邦和重田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1991-07-19
Filing date: 1991-07-19
Publication date: 1993-02-05

Abstract

PURPOSE: To read characters on slips printed by various kinds of printers that use different fonts. CONSTITUTION: A scanning part 10 reads the characters of a slip printed in a prescribed font and stores the image data in an image memory 11. A character cut-out part 12 cuts the stored image data into individual characters and supplies them to a decision part 13, which checks them against the character dictionary held in a dictionary memory part 14. The decision part 13 outputs as read data the characters it could identify under a prescribed evaluation measure. Whether identification is possible is decided by whether reading the characters printed in a learning field provided on the slip achieves a prescribed identification precision. When characters cannot be read, a dictionary preparation part 15 prepares a new character dictionary from those characters and their image patterns and adds it to the dictionary memory part 14.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to a character reading device, and more particularly to a character reading device capable of reading characters printed in a font outside its predefined reading targets.

[0002]

[Prior Art] One method of reading printed type characters is to match an input character against character dictionaries registered in advance and to identify the character by extracting the entry with the best degree of match.
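The matching step described above can be sketched as follows. This is a minimal illustration only, assuming binary character images of equal size and a pixel-agreement score; the patent does not specify the matching measure, and the names `match_score`, `identify`, and the tiny 3x3 patterns are hypothetical.

```python
from typing import Dict, List, Tuple

Pattern = List[str]  # rows of '#' (ink) and '.' (background); an assumed encoding

def match_score(a: Pattern, b: Pattern) -> float:
    """Fraction of pixels that agree between two equally sized patterns."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total

def identify(char_image: Pattern, dictionary: Dict[str, Pattern]) -> Tuple[str, float]:
    """Return the dictionary entry with the best degree of match and its score."""
    scored = ((label, match_score(char_image, tpl)) for label, tpl in dictionary.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical pre-registered dictionary with two standard patterns.
dictionary = {
    "1": [".#.", ".#.", ".#."],
    "7": ["###", "..#", "..#"],
}

# An input character that resembles "7" but differs in its bottom row.
label, score = identify(["###", "..#", ".#."], dictionary)
```

Here the input agrees with the "7" pattern on 7 of 9 pixels and with the "1" pattern on 5 of 9, so "7" is selected.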

[0003] However, to read characters in a font outside the reading targets, a dictionary for that font had to be created in advance and registered in the device.

[0004]

[Problem to Be Solved by the Invention] In recent years, with the spread of personal computers, many kinds of printers that connect to them and output hard copies have been developed.

[0005] Along with this proliferation of printers, many printing methods have also been developed, including impact printers such as dot-matrix printers and non-impact printers such as laser printers, and the character fonts used by these printers have become widely varied.

[0006] In a conventional character reading device, the fonts to be read are fixed in advance. To read any other font, a dictionary for it must be created and registered, which makes it difficult to read characters printed by many different kinds of printers.

[0007] An object of the present invention is to eliminate the above drawback and to provide a character reading device capable of reading characters printed by many kinds of printers.

[0008]

[Means for Solving the Problem] The character reading device of the present invention comprises: means for providing a form with a learning field in which predetermined characters are printed in the same font as the characters of the data field, where the data to be read is printed, and in a preset character order; and means for reading the form using a first character dictionary for reading, by matching, the form having the learning field, while creating a second character dictionary that makes unreadable characters readable, taking as input the character patterns of all characters in the learning field that could not be read and the character code data obtained from their character positions, and merging the second character dictionary with the first character dictionary as the form is read.

[0009]

[Embodiment] The present invention will now be described with reference to the drawings.

[0010] FIG. 1 is a block diagram showing the configuration of one embodiment of the present invention.

[0011] In FIG. 1, the scanning unit 10 scans the form or other document to be read and outputs its image pattern.

[0012] The image memory 11 is a memory that sequentially stores the image patterns output by the scanning unit 10.

[0013] The character cut-out unit 12 cuts the image pattern out of the image memory 11 one character at a time, based on the format information of the form.
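As a rough illustration of cutting characters out by format information, the sketch below assumes the format gives the top-left corner of a field, a fixed character cell size, and a character count. These parameters and the function name `cut_out_characters` are assumptions for illustration; the patent does not describe the format information at this level of detail.

```python
from typing import List

def cut_out_characters(image: List[str], top: int, left: int,
                       char_width: int, char_height: int,
                       count: int) -> List[List[str]]:
    """Slice `count` fixed-size character cells out of a row-major image."""
    cells = []
    for i in range(count):
        x0 = left + i * char_width  # left edge of the i-th character cell
        cells.append([row[x0:x0 + char_width]
                      for row in image[top:top + char_height]])
    return cells

# A tiny stored image: one field starting at row 1, two cells of 3x2 pixels.
image = [
    "......",
    "#..##.",
    "#...#.",
    "......",
]
cells = cut_out_characters(image, top=1, left=0,
                           char_width=3, char_height=2, count=2)
```

Each returned cell is itself a small image that can be passed to the matching step.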

[0014] The determination unit 13 identifies each character that has been cut out.

[0015] The dictionary memory unit 14 stores the character dictionaries (standard patterns) against which cut-out characters are matched. It consists of a portion that stores the first character dictionary, which is created and stored in advance, and a portion that stores the second character dictionary, which is newly created by the dictionary creation unit 15.

[0016] The dictionary creation unit 15 creates the second character dictionary from the character images output by the character cut-out unit 12.

[0017] Next, the operation of this embodiment will be described.

[0018] FIG. 2 is a diagram showing an example of a form used in the embodiment of FIG. 1.

[0019] In FIG. 2, the form identification code (hereinafter, ID code) 20 specifies the format of the form, that is, the number of lines to be read, their positions, the character types, and so on.

[0020] The learning field 21 is a field in which the characters used on the form, for example the digits 0 through 9, are printed in order.

[0021] For example, if the form uses the digits of the New JIS-B font, the digits 0 through 9 are printed in order in the learning field 21 in New JIS-B.

[0022] In FIG. 1, the scanning unit 10 reads the ID code 20 shown in FIG. 2 and thereby learns the form format, such as the positions and character types of the learning field 21 and the plurality of data fields 22.
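One way to picture the role of the ID code is as a key into a table of form formats. The sketch below is purely illustrative: the table `FORM_FORMATS`, the code value `"001"`, and the field attributes are all hypothetical, since the patent only states that the code specifies positions, line counts, and character types.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass(frozen=True)
class FieldSpec:
    top: int          # row of the field on the form (assumed unit: pixels)
    left: int         # column of the first character
    char_count: int   # number of characters printed in the field
    char_type: str    # e.g. "digits"

@dataclass(frozen=True)
class FormFormat:
    learning_field: FieldSpec
    data_fields: Tuple[FieldSpec, ...]

# Hypothetical format table keyed by ID code.
FORM_FORMATS: Dict[str, FormFormat] = {
    "001": FormFormat(
        learning_field=FieldSpec(top=10, left=20, char_count=10, char_type="digits"),
        data_fields=(
            FieldSpec(top=40, left=20, char_count=8, char_type="digits"),
            FieldSpec(top=70, left=20, char_count=12, char_type="digits"),
        ),
    ),
}

def lookup_format(id_code: str) -> FormFormat:
    """Return the form format selected by the ID code read from the form."""
    return FORM_FORMATS[id_code]

form_format = lookup_format("001")
```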

[0023] Next, the device proceeds to reading the learning field 21.

[0024] For example, as shown in FIG. 2, the digits 0 through 9 are printed in the learning field 21.

[0025] The characters read by the scanning unit 10 are stored in the image memory 11, then cut out by the character cut-out unit 12 and sent to the determination unit 13.

[0026] The determination unit 13 matches each cut-out character against the standard character patterns stored in the dictionary memory unit 14.

[0027] The characters printed in the learning field 21 are used for character reading and, when characters outside the reading targets are input, for creating the new character dictionary needed to read them. This is possible because the characters in the learning field are arranged in a preset order. When a character cannot be read, for example the "2" in the example above, the character code identifying "2" is known from the position of that character, and a new character dictionary is created using this code and the unreadable character pattern as the teacher signal.
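The way position supplies the correct label can be sketched as below. This is a minimal illustration assuming the preset order is the digits 0 through 9 and that cut-out cells are indexed left to right; the names `LEARNING_ORDER` and `teacher_pairs` are hypothetical, and the placeholder patterns stand in for real character images.

```python
from typing import List, Sequence, Tuple

LEARNING_ORDER = "0123456789"  # assumed preset character order of the learning field

def teacher_pairs(cells: Sequence[List[str]],
                  unreadable_positions: Sequence[int],
                  order: str = LEARNING_ORDER) -> List[Tuple[str, List[str]]]:
    """Pair each unreadable pattern with the character code implied by its position."""
    return [(order[p], cells[p]) for p in unreadable_positions]

# Placeholder patterns, one per position in the learning field.
cells = [[f"pattern-of-{d}"] for d in LEARNING_ORDER]

# Suppose only the character at position 2 could not be read.
pairs = teacher_pairs(cells, unreadable_positions=[2])
```

The resulting (character code, pattern) pairs are what a dictionary creation step would take as its teacher signal.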

[0028] When reading of a form begins, the characters printed in the learning field 21 of the form are identified first.

[0029] If correct answers are obtained with high confidence at this point, the device judges that dictionaries for those characters are already registered and goes on to read the data fields 22.

[0030] If, on the other hand, a character cannot be read or is misread, or is identified but only with low confidence, the device judges that the character is not registered. In that case it automatically creates a dictionary from the character patterns printed in the learning field 21, registers it, and then goes on to read the data fields 22.

[0031] Because the characters in the learning field 21 are printed in a fixed order, the correct answers are readily known. If the identification results exceed a predetermined decision threshold and are therefore highly reliable, the device proceeds immediately to reading the data fields 22.

[0032] However, if any of these characters cannot be identified, is misread, or is read correctly but with low confidence, the device judges that the character dictionary for it is not registered in the dictionary memory unit 14.

[0033] In that case, for every character other than those read correctly with high confidence, the character image is sent to the dictionary creation unit 15, a character dictionary for it is created and registered in the dictionary memory unit 14, and the device then proceeds to identifying the data fields 22.
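The learning-field check described in the preceding paragraphs might look like the sketch below. It is an illustration under stated assumptions, not the patented dictionary creation algorithm (which the text leaves unspecified): patterns are small binary images, the confidence measure is pixel agreement, the threshold value 0.9 is arbitrary, and "creating a dictionary" is reduced to storing the unread pattern under its expected code. All names are hypothetical.

```python
from typing import List, Tuple

Pattern = List[str]                     # rows of '#' (ink) and '.' (background)
Dictionary = List[Tuple[str, Pattern]]  # (character code, standard pattern)

def match_score(a: Pattern, b: Pattern) -> float:
    """Fraction of pixels that agree between two equally sized patterns."""
    total = sum(len(row) for row in a)
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total

def identify(cell: Pattern, dictionary: Dictionary) -> Tuple[str, float]:
    """Best-matching character code and its score; ("", 0.0) if no entries."""
    if not dictionary:
        return "", 0.0
    return max(((code, match_score(cell, tpl)) for code, tpl in dictionary),
               key=lambda pair: pair[1])

def learn_from_field(cells: List[Pattern], first: Dictionary, second: Dictionary,
                     order: str, threshold: float = 0.9) -> List[str]:
    """Register every learning-field character not read correctly with high confidence."""
    added = []
    for position, cell in enumerate(cells):
        expected = order[position]          # correct answer known from position
        code, score = identify(cell, first + second)
        if code != expected or score < threshold:
            second.append((expected, cell))  # stand-in for dictionary creation
            added.append(expected)
    return added

first_dictionary = [("0", ["###", "#.#", "###"]), ("1", [".#.", ".#.", ".#."])]
second_dictionary: Dictionary = []
# Learning field printed in another font: "0" looks familiar, "1" does not.
learning_cells = [["###", "#.#", "###"], ["##.", ".#.", "###"]]
added = learn_from_field(learning_cells, first_dictionary, second_dictionary, order="01")
```

In this example the "0" matches its registered pattern exactly, while the unfamiliar "1" scores only 6 of 9 pixels against the registered "1", so its pattern is added to the second dictionary and matches exactly on later reads.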

[0034] Many different dictionary creation algorithms have been published, but because they are not directly related to the present invention, their description is omitted.

[0035] The dictionary memory unit 14 is divided into a portion in which dictionaries for frequently used fonts are permanently registered and a portion in which dictionaries newly created by the dictionary creation unit 15 are registered.

[0036] When the area of the dictionary memory unit 14 reserved for newly created dictionaries becomes full, the device can replace a rarely used dictionary with the new one. In addition, when it is known in advance that the characters used on a form are in a font permanently registered in the dictionary memory unit 14, the learning field can be left unprinted (blank); by detecting the blank, the device can be controlled to skip the dictionary creation process.
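The replacement and blank-detection behavior can be sketched as follows. The patent says only that a dictionary with low frequency of use is replaced; the least-frequently-used policy with per-dictionary use counters shown here is one assumed reading, and the class `LearnedDictionaryArea`, the function `is_blank`, and the font names are hypothetical.

```python
from typing import Dict, List

class LearnedDictionaryArea:
    """Fixed-capacity area for newly created dictionaries (assumed LFU eviction)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[str, List[str]] = {}
        self.use_count: Dict[str, int] = {}

    def register(self, name: str, templates: List[str]) -> None:
        # When the area is full, drop the dictionary that has been used least.
        if name not in self.entries and len(self.entries) >= self.capacity:
            victim = min(self.use_count, key=lambda k: self.use_count[k])
            del self.entries[victim]
            del self.use_count[victim]
        self.entries[name] = templates
        self.use_count.setdefault(name, 0)

    def use(self, name: str) -> List[str]:
        self.use_count[name] += 1
        return self.entries[name]

def is_blank(cells: List[List[str]]) -> bool:
    """True if no ink pixel ('#') appears anywhere in the learning field cells."""
    return all(ch != "#" for cell in cells for row in cell for ch in row)

area = LearnedDictionaryArea(capacity=2)
area.register("font_a", ["templates for font A"])
area.register("font_b", ["templates for font B"])
area.use("font_a")
area.use("font_a")
area.use("font_b")
area.register("font_c", ["templates for font C"])  # evicts font_b (used least)

skip_learning = is_blank([["...", "..."], ["...", "..."]])
```

A caller would check `is_blank` on the learning field first and run the dictionary creation step only when it returns `False`.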

[0037] Furthermore, even when the printer's output has a quirk, for example the lower part of the numeral 0 is sometimes missing as shown in FIG. 3, printing such characters in the dummy field causes dictionaries for them to be created automatically as well, so they can be read.

[0038]

[Effects of the Invention] As described above, whereas a conventional character reading device could read only characters in predetermined fonts, the present invention can automatically create a corresponding dictionary even for characters in a font outside the reading targets, and thus has the effect of making it possible to read any font.

[Brief description of drawings]

[FIG. 1] A block diagram showing the configuration of one embodiment of the present invention.

[FIG. 2] A diagram showing an example of a form to be read in the embodiment of FIG. 1.

[FIG. 3] A diagram showing an example of a quirk in characters printed by a printer.

[Explanation of symbols]

10 Scanning unit
11 Image memory
12 Character cut-out unit
13 Determination unit
14 Dictionary memory unit
15 Dictionary creation unit
20 Form identification code
21 Learning field
22 Data field

Claims

1. A character reading device comprising: means for providing a form with a learning field in which predetermined characters are printed in the same font as the characters of a data field, where data to be read is printed, and in a preset character order; and means for reading the form using a first character dictionary for reading, by matching, the form having the learning field, while creating a second character dictionary that makes unreadable characters readable, taking as input the character patterns of all characters in the learning field that could not be read and the character code data obtained from their character positions, and merging the second character dictionary with the first character dictionary as the form is read.