JPH06162255A - Character reading system - Google Patents

Character reading system

Info

Publication number
JPH06162255A
Authority
JP
Japan
Prior art keywords
character
field
reading
conversion
recognition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP4318110A
Other languages
Japanese (ja)
Other versions
JP3224616B2 (en)
Inventor
Yoshifumi Abe
佳史 阿部
Masao Michino
正雄 道野
Masatoshi Kurata
正敏 倉田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitachi Ltd
Original Assignee
Hitachi Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitachi Ltd
Priority to JP31811092A
Publication of JPH06162255A
Application granted
Publication of JP3224616B2
Anticipated expiration
Expired - Lifetime

Abstract

PURPOSE: To provide a character reading technique that reduces the operator's error correction work. CONSTITUTION: Under the control of a main control unit 7, a scanner unit 4 converts a form 3 into binarized data and captures it, and a recognition control unit 6 performs character recognition on each field. For a field in which an error occurs during character recognition, a recognition accuracy level setting unit 5a is consulted to check whether the field is a specific reading field that does not require high reading accuracy; if it is, the character recognition result is passed to a conversion control unit 5. For the specific reading field, the conversion control unit 5 automatically corrects abnormal digits and forcibly replaces unreadable characters with candidate characters or a specific conversion character, without operator intervention, and then reports the conversion result to a host device 2 through a transmission control unit 8.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to character reading technology, and more particularly to a technique that is effective when applied to character recognition processing of forms in which multiple types of reading fields requiring different reading accuracy are mixed.

[0002]

[Prior Art] In information processing systems and the like, handwritten characters are automatically read from media such as forms by means of the well-known optical character recognition (OCR) technique as one means of data input. Such character reading techniques normally require correction of unreadable characters.

[0003] A known character reading technique of this kind is disclosed, for example, in Japanese Patent Application Laid-Open No. 61-163472. All the character patterns of the field containing an unrecognizable character are shown to the operator on a display, so that the operator can refer to the other readable characters in the field and correct the unrecognizable character accurately from the keyboard.

[0004]

[Problems to Be Solved by the Invention] The above prior art, however, takes no account of the type of field. Even a field that does not require very accurate reading, such as a remarks column, uniformly requires the operator to intervene and correct unreadable characters, just like a field containing important information such as a monetary amount. As a result, the correction work becomes cumbersome.

[0005] An object of the present invention is to provide a character reading technique that can reduce the operator's error correction work.

[0006] The above and other objects and novel features of the present invention will become apparent from the description in this specification and the accompanying drawings.

[0007]

[Means for Solving the Problems] A representative aspect of the invention disclosed in this application is briefly summarized as follows.

[0008] The present invention is a character reading method in which a form format is specified, a desired form pattern is read by a scanner, and character recognition is performed. The method provides reading accuracy discrimination means for designating specific reading fields that do not require high reading accuracy, and unreadable character conversion means. When a character recognition error consisting of at least one of an unreadable character and an abnormal digit exceeding the specified number of characters occurs in a specific reading field, the unreadable character conversion means performs at least one of a first operation, which forcibly replaces the unreadable character with a candidate character or a specific replacement character, and a second operation, which corrects the abnormal digit by truncating the characters in excess of the specified number.

[0009]

[Operation] According to the character reading technique of the present invention described above, for specific reading fields that do not require high reading accuracy, as identified by the reading accuracy discrimination means, the unreadable character conversion means automatically performs error correction such as forced replacement of unreadable characters with candidate characters or specific replacement characters, and truncation of excess characters in the case of abnormal digits. Operator intervention is therefore unnecessary when correcting specific reading fields that do not require high reading accuracy, such as a remarks column, and the operator's burden in correction work is reliably reduced. The time required for correction work can also be shortened.

[0010]

[Embodiment] A character reading method according to one embodiment of the present invention is described in detail below with reference to the drawings.

[0011] FIG. 1 is a block diagram showing an example of the configuration of an optical character reading device 1 to which the character reading method of this embodiment is applied.

[0012] On receiving a read command from the host device 2 via the transmission control unit 8, the optical character reading device 1, under the control of the main control unit 7, converts the image of the form 3 into a binarized pattern in the scanner unit 4 and captures it as form image data. The recognition control unit 6 then performs character recognition on the captured form image data.

[0013] In this embodiment, when a reading error occurs during character recognition in the recognition control unit 6, the conversion control unit 5 is activated as necessary. The conversion control unit 5 refers to the information set in the recognition accuracy level setting unit 5a to determine whether the field is a conversion target field designated in the form format.

[0014] That is, in this embodiment, the recognition accuracy level setting unit 5a holds, as form format information for each field, items such as whether the field requires high reading accuracy, the maximum number of characters allowed in the field, and which character is to be used as the conversion character 10 in the conversion processing described later.
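As an illustration only, the per-field information held in the recognition accuracy level setting unit 5a could be modeled as a small record. This is a minimal sketch: the names `FieldFormat`, `FORM_FORMAT`, and `is_conversion_target`, and the two example fields, are hypothetical and do not appear in the patent. The default "@" follows the example conversion character given for this embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldFormat:
    """One field's entry in the form format information (hypothetical layout)."""
    name: str
    needs_high_accuracy: bool    # True: errors are left for the operator
    max_chars: int               # maximum number of characters allowed
    conversion_char: str = "@"   # conversion character 10 (example value)


# Hypothetical form format: an amount column and a remarks column.
FORM_FORMAT = {
    "amount": FieldFormat("amount", needs_high_accuracy=True, max_chars=8),
    "remarks": FieldFormat("remarks", needs_high_accuracy=False, max_chars=20),
}


def is_conversion_target(field_name: str) -> bool:
    """A field may be corrected automatically only if it does not need high accuracy."""
    return not FORM_FORMAT[field_name].needs_high_accuracy
```

Keeping the conversion character in the same record as the accuracy flag means a different replacement character can be chosen per field without touching the correction logic.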

[0015] If the field is a conversion target field, that is, one that does not require high reading accuracy and for which error correction may be performed automatically, and the character recognition result contains unreadable characters or abnormal digits exceeding the number of characters specified for the field, the conversion control unit 5 first truncates the surplus characters to eliminate the abnormal digits. It then converts each unreadable character to a candidate character or, when no candidate character exists, to a specific replacement character, and reports the conversion result to the main control unit 7. The main control unit 7 sends the conversion result to the host device 2 via the transmission control unit 8.

[0016] An example of the operation of the character reading method of this embodiment is described below.

[0017] FIG. 2 is a conceptual diagram showing an example of the conversion of unreadable characters in the conversion control unit 5, and FIGS. 3 and 4 are flowcharts showing an example of the operation of the character reading method of this embodiment.

[0018] First, the scanner unit 4 captures the form image data 9 (step 21), a field is cut out based on the specified form format (step 22), and character recognition is performed on that field (step 23).

[0019] Next, whether a recognition error has occurred is determined (step 24). If there is an error, it is further determined whether the field has been designated as not requiring high-accuracy recognition (step 25).

[0020] If high-accuracy reading is required, as in an amount column, that is, if the field has not been designated as not requiring high-accuracy recognition, error occurrence information is set (step 27) and the character recognition result 11 is output unchanged to the host device 2 together with the error occurrence information (step 28). Whether the field is the last one is then checked (step 29), and if unprocessed fields remain, the processing from step 22 onward is repeated.

[0021] On the other hand, if it is found in step 25 that the field in which the error occurred is, for example, a remarks column designated as not requiring high-accuracy recognition, the conversion control unit 5 performs correction processing (step 26).
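The branch taken in steps 24 to 28 can be sketched as follows. This is a rough illustration, not the patented implementation: `FieldOutput`, `handle_field`, and `stub_correct` are hypothetical names, the correction of step 26 is passed in as a callable, and it is an assumption here that no error occurrence information is attached once a field has been corrected automatically.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FieldOutput:
    text: str          # text passed on toward the host device
    error_flag: bool   # error occurrence information (step 27)


def handle_field(text: str, has_error: bool, high_accuracy_not_required: bool,
                 correct: Callable[[str], str]) -> FieldOutput:
    """Route one recognized field through steps 24 to 28."""
    if not has_error:                     # step 24: no recognition error
        return FieldOutput(text, False)
    if high_accuracy_not_required:        # step 25: e.g. a remarks column
        return FieldOutput(correct(text), False)  # step 26: automatic correction
    # Steps 27-28: flag the error and output the result unchanged.
    return FieldOutput(text, True)


def stub_correct(text: str) -> str:
    """Stand-in for step 26: '?' marks an unreadable character in this sketch."""
    return text.replace("?", "@")
```

For example, `handle_field("12?4", True, False, stub_correct)` leaves the amount untouched and sets the error flag, whereas the same call with `high_accuracy_not_required=True` returns the corrected text with no flag.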

[0022] Specifically, in step 26, as illustrated in FIG. 4, it is first checked whether the number of characters in the character recognition result 11 is within the character limit specified for the field (step 26a). If the result contains abnormal digits, that is, too many characters, the overflowing characters are truncated from the character recognition result 11 (step 26b).

[0023] The remaining character recognition result 11 is then searched for unreadable characters (step 26c). For each unreadable character found, it is checked whether a candidate character exists (step 26d); if one exists, the unreadable character is replaced with the candidate character (step 26e). If no candidate character exists, the unreadable character is forcibly replaced with the specific conversion character 10 (in this embodiment, "@" is set as an example) (step 26f).

[0024] Steps 26c to 26f are repeated until no unreadable characters remain (step 26g), and the outcome is reported to the main control unit 7 as the conversion result 12 (step 26h).
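Steps 26a to 26h can be sketched in a few lines. The representation below is an assumption made for illustration: each recognized position is a hypothetical `RecognizedChar` whose `char` is `None` when the character is unreadable and whose `candidate` holds the recognizer's best guess, if any. The sketch also assumes that the characters discarded by truncation are the trailing ones.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RecognizedChar:
    char: Optional[str]              # None marks an unreadable character
    candidate: Optional[str] = None  # candidate character, if the recognizer has one


def correct_field(result: List[RecognizedChar], max_chars: int,
                  conversion_char: str = "@") -> str:
    """Automatic correction for a field that does not need high accuracy."""
    # Steps 26a-26b: drop the characters that overflow the field's limit.
    kept = result[:max_chars]
    corrected = []
    for rc in kept:                       # steps 26c and 26g: scan every position
        if rc.char is not None:
            corrected.append(rc.char)     # readable character, keep as is
        elif rc.candidate is not None:    # step 26d: a candidate exists
            corrected.append(rc.candidate)     # step 26e
        else:
            corrected.append(conversion_char)  # step 26f: forced replacement
    return "".join(corrected)             # step 26h: conversion result


# Five recognized positions in a four-character field.
sample = [
    RecognizedChar("A"),
    RecognizedChar(None, candidate="B"),
    RecognizedChar(None),
    RecognizedChar("D"),
    RecognizedChar("E"),
]
```

With this sample, `correct_field(sample, 4)` yields `"AB@D"`: the fifth character is truncated, the second position takes its candidate, and the third falls back to the conversion character.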

[0025] An arbitrary character may be assigned as the specific conversion character 10 for each field by adding information on the conversion character 10 to the form format information.

[0026] The invention made by the present inventors has been described above in concrete terms based on an embodiment. Needless to say, the present invention is not limited to that embodiment and can be modified in various ways without departing from its gist.

[0027] For example, the above description of the embodiment dealt with the case where the character reading method of the present invention is adopted in an optical character reading device, but the invention is not limited to this. It can be adopted in any character reading method that recognizes characters on a form specified by a form format and outputs the character recognition result together with candidate characters corresponding to unreadable characters and abnormal digits. For example, if the optical character reading device has such a character reading method, the automatic conversion of unreadable characters can be performed in the host device, and the object of the present invention can still be achieved.

[0028]

[Effects of the Invention] The effects obtained by a representative aspect of the invention disclosed in this application are briefly described as follows.

[0029] According to the character reading method of the present invention, for specific reading fields that do not require high reading accuracy, such as a remarks column, abnormal digits are corrected by truncating excess characters and unreadable characters are replaced with candidate characters or a specific conversion character, all automatically. Consequently, in character reading processing of forms that mix multiple types of reading fields with different required reading accuracy, for example, the operator's error correction work can be reduced.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an example of the configuration of an optical character reading device to which a character reading method according to an embodiment of the present invention is applied.

FIG. 2 is a conceptual diagram showing an example of the conversion of unreadable characters in the conversion control unit.

FIG. 3 is a flowchart showing an example of the operation of the character reading method according to an embodiment of the present invention.

FIG. 4 is a flowchart showing an example of the operation of the character reading method according to an embodiment of the present invention.

[Explanation of Symbols]

1 Optical character reading device
2 Host device
3 Form
4 Scanner unit
5 Conversion control unit (unreadable character conversion means)
5a Recognition accuracy level setting unit (reading accuracy discrimination means)
6 Recognition control unit
7 Main control unit
8 Transmission control unit
9 Form image data
10 Conversion character
11 Character recognition result
12 Conversion result

Claims (1)

[Claims]

[Claim 1] A character reading method in which a form format is specified, a desired form pattern is read by a scanner, and character recognition is performed, the method comprising: reading accuracy discrimination means for designating a specific reading field that does not require high reading accuracy; and unreadable character conversion means which, when a character recognition error consisting of at least one of an unreadable character and an abnormal digit exceeding a specified number of characters occurs in the specific reading field, performs at least one of a first operation of forcibly replacing the unreadable character with a candidate character or a specific replacement character and a second operation of correcting the abnormal digit by truncating the characters in excess of the specified number of characters.
JP31811092A 1992-11-27 1992-11-27 Character reading method Expired - Lifetime JP3224616B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP31811092A JP3224616B2 (en) 1992-11-27 1992-11-27 Character reading method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP31811092A JP3224616B2 (en) 1992-11-27 1992-11-27 Character reading method

Publications (2)

Publication Number Publication Date
JPH06162255A 1994-06-10
JP3224616B2 2001-11-05

Family

ID=18095608

Family Applications (1)

Application Number Title Priority Date Filing Date
JP31811092A Expired - Lifetime JP3224616B2 (en) 1992-11-27 1992-11-27 Character reading method

Country Status (1)

Country Link
JP (1) JP3224616B2 (en)

Also Published As

Publication number Publication date
JP3224616B2 (en) 2001-11-05

Similar Documents

Publication Publication Date Title
US5280544A (en) Optical character reading apparatus and method
CA1160347A (en) Method for recognizing a machine encoded character
USRE36581E (en) Character reader and recognizer with a specialized editing function
US5394484A (en) Image recognition apparatus
US4523331A (en) Automated image input, storage and output system
US5233672A (en) Character reader and recognizer with a specialized editing function
JPH06162255A (en) Character reading system
JP3319203B2 (en) Document filing method and apparatus
JPH0430074B2 (en)
JP2529421B2 (en) Character recognition device
JPH04293185A (en) Filing device
JPH06333083A (en) Optical character reader
JP2878772B2 (en) Optical character reader
JPH0327488A (en) Character recognizing device
JPH04500422A (en) Method and apparatus for identifying unrecognizable characters in an optical character recognition device
JPH0820669B2 (en) Image information recording / reading method
JPH04109379A (en) Ocr system
JPH03171274A (en) Optical character reader
JPS62248081A (en) Character recognition method through memory medium
JPS63208180A (en) Character recognizing device
JPH0476674A (en) Drawing data processor
JPS61279989A (en) System for correcting recognized result
JPH0927014A (en) Dictionary data learning system of handy optical character recognition device
JPS6215678A (en) Correcting system for read character
JPH0437969A (en) Optical character reader

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20070824

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080824

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090824

Year of fee payment: 8

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100824

Year of fee payment: 9

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110824

Year of fee payment: 10

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120824

Year of fee payment: 11

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130824

Year of fee payment: 12

EXPY Cancellation because of completion of term