JP6462930B1

JP6462930B1 - Character recognition apparatus, method and program

Info

Publication number: JP6462930B1
Application number: JP2018061297A
Authority: JP
Inventors: 択渡久地
Original assignee: Ai Inside; AI Inside Inc
Current assignee: Ai Inside; AI Inside Inc
Priority date: 2018-03-28
Filing date: 2018-03-28
Publication date: 2019-01-30
Anticipated expiration: 2038-03-28
Also published as: JP2019175037A

Abstract

[Problem] To provide a character recognition device, method, and program capable of calculating a likelihood for each item and changing the display mode of a reading item based on the calculated likelihood.
[Solution] A character recognition device 200 is provided that includes a control unit 230 having: a reading item setting unit 231 that sets the reading items to be read in image data obtained by scanning a document; a character recognition unit 232 that has an OCR function and reads, by OCR, the character information displayed in the reading items of the image data to generate text data; a text data display unit 233 that displays the image data of each reading item and the text data side by side; a likelihood calculation unit 234 that calculates the likelihood of the text data; and a display change unit 235 that changes the display mode of the displayed text data.
[Selected drawing] Figure 2

Description

The present disclosure relates to a character recognition device, method, and program for reading character information from image data.

Techniques have become widespread in which a document with handwritten characters is read by an image scanner or the like and subjected to OCR (Optical Character Recognition) processing, generating digital data in which the input information is converted into predetermined character codes.

A character identification system has been disclosed that performs character recognition by machine learning on image data obtained by scanning a handwritten document or the like as an image (see, for example, Patent Document 1). The system proposed in Patent Document 1 comprises a character image input receiving unit that receives input of a sample character image, a character component extraction unit that extracts character components based on the sample character image, a pseudo character model generation unit that generates a pseudo character model based on the character components, and an identification dictionary generation unit that generates character identification patterns based on the pseudo character model and generates an identification dictionary.

Patent Document 1: Japanese Patent Application Laid-Open No. 2015-069256

The accuracy of OCR character recognition is not perfect, so the results are generally checked visually by a person after character recognition processing. The recognition accuracy (the likelihood of each recognized character) differs from character to character, yet all items and characters are checked uniformly by eye regardless of that likelihood, which is inefficient.

The present disclosure therefore describes a character recognition device, method, and program that, when performing character recognition by OCR processing on image data obtained by scanning a handwritten document or the like as an image, calculate a likelihood for each item and change the display mode of each reading item based on the calculated likelihood. Because the reliability of the reading process can be grasped for each reading item, unnecessary visual checks can be avoided and highly accurate text data can be generated.

A character recognition device according to one aspect of the present disclosure reads character information from image data obtained by scanning a document as an image, and comprises: a character recognition unit that performs character recognition on the character information displayed in reading items set in the image data of the document and generates text data; a text data display unit that, for each reading item, displays the image data of the reading item and the text data so that they can be compared, and makes the text data editable by character input; a likelihood calculation unit that calculates the likelihood of character recognition for each reading item; and a display change unit that changes the mode of the location where the text data is displayed based on the calculated likelihood. For a reading item whose likelihood is equal to or less than a predetermined first threshold, the display change unit changes the setting so that no character string is displayed in the text data display field. For a reading item whose likelihood is equal to or greater than a predetermined second threshold, it displays neither the image data nor the text data of the reading item. When the likelihood of a reading item is greater than the first threshold and less than the second threshold, it changes the mode of the location where the text data of the reading item is displayed according to that likelihood.

A character recognition method according to one aspect of the present disclosure reads character information from image data obtained by scanning a document as an image, and comprises: a character recognition step, performed by a character recognition unit, of performing character recognition on the character information displayed in reading items set in the image data of the document and generating text data; a text data display step, performed by a text data display unit, of displaying, for each reading item, the image data of the reading item and the text data so that they can be compared, and making the text data editable by character input; a likelihood calculation step, performed by a likelihood calculation unit, of calculating the likelihood of character recognition for each reading item; and a display change step, performed by a display change unit, of changing the mode of the location where the text data is displayed based on the calculated likelihood. In the display change step, for a reading item whose likelihood is equal to or less than a predetermined first threshold, the setting is changed so that no character string is displayed in the text data display field. For a reading item whose likelihood is equal to or greater than a predetermined second threshold, neither the image data nor the text data of the reading item is displayed. When the likelihood of a reading item is greater than the first threshold and less than the second threshold, the mode of the location where the text data of the reading item is displayed is changed according to that likelihood.

A character recognition program according to one aspect of the present disclosure reads character information from image data obtained by scanning a document as an image, and causes a computer to execute: a character recognition step of performing character recognition on the character information displayed in reading items set in the image data of the document and generating text data; a text data display step of displaying, for each reading item, the image data of the reading item and the text data so that they can be compared, and making the text data editable by character input; a likelihood calculation step of calculating the likelihood of character recognition for each reading item; and a display change step of changing the mode of the location where the text data is displayed based on the calculated likelihood. In the display change step, for a reading item whose likelihood is equal to or less than a predetermined first threshold, the setting is changed so that no character string is displayed in the text data display field. For a reading item whose likelihood is equal to or greater than a predetermined second threshold, neither the image data nor the text data of the reading item is displayed. When the likelihood of a reading item is greater than the first threshold and less than the second threshold, the mode of the location where the text data of the reading item is displayed is changed according to that likelihood.
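
The three-way display rule shared by the device, method, and program aspects above can be illustrated with a minimal Python sketch. The threshold values (0.3 and 0.9), the names `decide_display`, `DisplayMode`, and `DisplayDecision`, and the linear `emphasis` value are assumptions introduced for illustration; the disclosure only states that a first and a second threshold exist and that the display mode varies with the likelihood in between. Keeping the item image visible when the text field is blanked is likewise an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class DisplayMode(Enum):
    BLANK_TEXT = "blank_text"    # likelihood <= first threshold: text field left empty
    HIDDEN = "hidden"            # likelihood >= second threshold: item not displayed
    HIGHLIGHTED = "highlighted"  # in between: text shown, styling follows likelihood


@dataclass
class DisplayDecision:
    mode: DisplayMode
    show_image: bool       # whether the cropped item image is displayed
    text: Optional[str]    # "" for an empty field, None when the field is hidden
    emphasis: float        # hypothetical styling strength, 0.0 (none) to 1.0 (strongest)


def decide_display(text, likelihood, first_threshold=0.3, second_threshold=0.9):
    """Pick a display mode for one reading item from its likelihood."""
    if not first_threshold < second_threshold:
        raise ValueError("first_threshold must be smaller than second_threshold")
    if likelihood <= first_threshold:
        # Recognition is unreliable: show no character string (assumed: image stays).
        return DisplayDecision(DisplayMode.BLANK_TEXT, True, "", 0.0)
    if likelihood >= second_threshold:
        # Recognition is reliable: neither image nor text needs a visual check.
        return DisplayDecision(DisplayMode.HIDDEN, False, None, 0.0)
    # Intermediate case: lower likelihood gives stronger emphasis (assumed linear).
    span = second_threshold - first_threshold
    emphasis = (second_threshold - likelihood) / span
    return DisplayDecision(DisplayMode.HIGHLIGHTED, True, text, emphasis)
```

With these assumed thresholds, an item with likelihood 0.6 is shown with its text and a mid-range emphasis, while items at 0.2 and 0.95 fall into the blank-text and hidden cases respectively.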

According to the present disclosure, when character recognition is performed by OCR processing on image data obtained by scanning a handwritten document or the like as an image, a likelihood is calculated for each item and the display mode of each reading item can be changed based on the calculated likelihood, so the reliability of the reading process can be grasped for each reading item. This makes it possible to generate highly accurate text data without performing unnecessary visual checks.

A functional block diagram showing a character recognition system according to an embodiment of the present disclosure.
A functional block diagram showing an example of the configuration of the character recognition device 200 shown in FIG. 1.
A schematic diagram showing an account transfer request form, which is an example of image data stored in the image data DB 221 of FIG. 1.
A schematic diagram showing an example of a state in which the image data of the account transfer request form of FIG. 3 is displayed on the display unit.
A schematic diagram showing an example of a state in which the image data of the reading items extracted from the image data of FIG. 4 and the text data are displayed side by side.
A schematic diagram showing feature extraction and vector conversion by the character recognition unit 232 of FIG. 2.
A schematic diagram showing character type determination by the character recognition unit 232 of FIG. 2.
A schematic diagram showing an example of a state in which color has been applied to the right end of the text data, starting from the state of FIG. 5 in which the image data of the reading items and the text data are displayed side by side.
A flowchart showing a character recognition method performed by the character recognition system 1 shown in FIG. 1.
A diagram showing an example of a financial institution table stored in the storage unit 240 shown in FIG. 1.
A schematic diagram showing an example of a state in which the image data of the reading items extracted from image data and the text data are displayed side by side in a character recognition system according to an embodiment of the present disclosure.
A schematic diagram showing an example of a state in which the image data of the reading items extracted from image data and the text data are displayed side by side in a character recognition system according to an embodiment of the present disclosure.
A schematic diagram showing an example of a state in which the image data of the reading items extracted from image data and the text data are displayed side by side in a character recognition system according to an embodiment of the present disclosure.

Embodiments of the present disclosure will be described with reference to the drawings. The embodiments described below do not unduly limit the content of the present disclosure as set out in the claims. Moreover, not all of the components shown in the embodiments are necessarily essential components of the present disclosure.

(Embodiment 1)
<Configuration>
FIG. 1 is a block diagram of a character recognition system 1 according to Embodiment 1 of the present disclosure. The character recognition system 1 reads character information from image data obtained by scanning a document, such as a handwritten application form or account transfer request form, as an image. It is used by a user company that receives application forms and account transfer request forms from customers to read the handwritten character information written on those forms.

The character recognition system 1 includes a user system 100, a character recognition device 200, and a network NW. The user system 100 and the character recognition device 200 are connected via the network NW. The network NW is composed of the Internet, a LAN (Local Area Network), a WAN (Wide Area Network), or the like. The network NW may use wired or wireless communication, including so-called 4G communication methods such as LTE (Long Term Evolution) and 5G communication methods.

The user system 100 scans a document composed of a plurality of items, such as an application form or an account transfer request form, and converts it into image data. The user system 100 includes a scanner device 110 and a user terminal 120, which are connected so that they can communicate with each other via, for example, USB (registered trademark) or a LAN.

The scanner device 110 scans documents such as application forms and account transfer request forms and converts them into image data. Although Embodiment 1 uses a scanner device, any device that can convert a paper document into electronic data may be used, for example a camera. The image data scanned by the scanner device 110 is stored in the image data DB 221 described later.

The user terminal 120 is a terminal installed at the user company. Through user operations, it is used to set reading items on the image data and to visually check the text data produced by character recognition of those reading items. The user terminal 120 includes a display unit that displays the image data. By operating the operation unit, the user accesses the character recognition device 200 via the network NW, and various programs are started and provided. The display unit is a display or the like, and the operation unit is a keyboard, mouse, or the like.

FIG. 2 is a functional block diagram showing an example of the configuration of the character recognition device 200 shown in FIG. 1. Based on the reading items that have been set, the character recognition device 200 generates text data from the character information displayed in the reading items on the image data. The character recognition device 200 includes a communication unit 210, a storage unit 220, and a control unit 230.

The communication unit 210 is a communication interface for communicating with the user system 100 via the network NW, and communicates according to a communication protocol such as TCP/IP (Transmission Control Protocol/Internet Protocol).

The storage unit 220 stores programs for executing various control processes and the functions of the control unit 230, input data, and the like, and is composed of RAM (Random Access Memory), ROM (Read Only Memory), and the like. The storage unit 220 also temporarily stores the image data of the reading items displayed by the text data display unit 233, described later, and the likelihood calculated by the likelihood calculation unit 234.

The storage unit 220 further includes an image data DB 221 that stores the image data converted by the scanner device 110, a reading item DB 222 that stores the reading items set for the image data, and a text data DB 223 that stores the text data generated by reading the reading items. The image data DB 221, the reading item DB 222, and the text data DB 223 are databases that the various programs of the control unit 230 can access to reference and update.

The control unit 230 controls the overall operation of the character recognition device 200 by executing the programs stored in the storage unit 220, and is composed of a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), or the like. As functions of the programs executed by the control unit 230, it includes a reading item setting unit 231, a character recognition unit 232, a text data display unit 233, a likelihood calculation unit 234, and a display change unit 235. These units are started and executed by the programs stored in the storage unit 220.

The reading item setting unit 231 sets the reading items to be read in the image data stored in the image data DB 221, in response to operation of the operation unit of the user terminal 120. A reading item is set by operating the operation unit (for example, a mouse) on the display unit of the user terminal 120 where the image data is displayed, for example by specifying a range with a so-called drag and drop. The reading items that have been set are stored in the reading item DB 222.

FIG. 3 is a schematic diagram showing an account transfer request form, which is an example of image data stored in the image data DB 221 of FIG. 2. The account transfer request form is an application form for asking a financial institution to set up an account transfer, for example for the direct debit of utility charges. As entry items it has a name furigana entry field A1, a name kanji entry field A2, a financial institution name entry field A3, a branch name entry field A4, a financial institution code entry field A5, a branch code entry field A6, a deposit type entry field A7, and an account number entry field A8. In this example, "トッキョタロウ" (Tokkyo Taro) is handwritten in the name furigana entry field A1, "特許太郎" in the name kanji entry field A2, "みずほ (銀行)" (Mizuho (Bank)) in the financial institution name entry field A3, "麹町 (支店)" (Kojimachi (Branch)) in the branch name entry field A4, "0001" in the financial institution code entry field A5, "021" in the branch code entry field A6, and "1111111" in the account number entry field A8, and "普通" (ordinary) is circled in the deposit type entry field A7.

When converting account transfer request forms into text, the user company scans such a form with the scanner device 110 and stores the image data in the image data DB 221.

FIG. 4 is a schematic diagram showing an example of a state in which the image data of the account transfer request form of FIG. 3 is displayed on the display unit. The screen P1 of the display unit of the user terminal 120 displays the image data of the account transfer request form shown in FIG. 3, with the same entry items as the form itself: the name furigana entry field A1, the name kanji entry field A2, the financial institution name entry field A3, the branch name entry field A4, the financial institution code entry field A5, the branch code entry field A6, the deposit type entry field A7, and the account number entry field A8.

In this state, the operation unit of the user terminal 120 is operated to drag and drop, setting the ranges of the reading items as indicated by the broken lines in FIG. 4. For example, a name furigana reading item S1, a name kanji reading item S2, a financial institution name reading item S3, a branch name reading item S4, a financial institution code reading item S5, a branch code reading item S6, and an account number reading item S7 are set as reading items.
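
As a rough illustration of how a drag-and-drop range might become a stored reading item, the following Python sketch models a reading item as a named rectangle and crops it out of an image. The names `ReadingItem`, `item_from_drag`, and `crop`, the pixel coordinate convention, and the list-of-rows image representation are all assumptions made for this example; the disclosure does not specify a data format for the reading item DB 222.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadingItem:
    """A named rectangular range on the scanned image (hypothetical structure)."""
    name: str
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def item_from_drag(name, start, end):
    """Build a reading item from drag start/end points (x, y), in any drag direction."""
    (x0, y0), (x1, y1) = start, end
    return ReadingItem(name, min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def crop(image, item):
    """Return the pixels inside the item; image is a list of rows of pixel values."""
    return [row[item.left:item.right] for row in image[item.top:item.bottom]]


# Usage: a tiny 4x5 "image" and a drag from bottom-right to top-left.
image = [[r * 10 + c for c in range(5)] for r in range(4)]
s5 = item_from_drag("S5", (3, 2), (1, 0))
patch = crop(image, s5)
```

Normalizing the two drag points with `min` and `max` means the rectangle is the same regardless of the direction in which the user drags.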

Although a method in which the user sets the ranges of the reading items has been described here, the character recognition device 200 may set the reading items automatically. For example, in the account transfer request form of FIG. 3, it may recognize the handwritten characters, as distinct from printed characters, ruled lines, and the like, and set a range for each recognition. This allows character recognition even when the writing extends beyond the printed ruled lines.

The character recognition unit 232 has an OCR function. It reads the image data stored in the image data DB 221 and reads, by OCR, the character information displayed in the reading items stored in the reading item DB 222 to generate text data. The generated text data is stored in the text data DB 223.

In the example shown in FIG. 3, "トッキョタロウ" is generated as text data for the content of the name furigana reading item S1. Similarly, "特許太郎" is generated for the name kanji reading item S2, "みずほ (銀行)" for the financial institution name reading item S3, "麹町 (支店)" for the branch name reading item S4, "0001" for the financial institution code reading item S5, "021" for the branch code reading item S6, and "1111111" for the account number reading item S7.

The text data display unit 233 extracts, from the image data stored in the image data DB 221, the image data of the reading items stored in the reading item DB 222. It also reads the text data stored in the text data DB 223 and displays the image data of each reading item and the text data side by side on the display unit of the user terminal 120.

FIG. 5 is a schematic diagram showing an example of a state in which the image data of the reading items extracted from the image data of FIG. 4 and the text data are displayed side by side. As in FIG. 4, the left side of the screen P1 of the display unit of the user terminal 120 shows the name furigana entry field A1, the name kanji entry field A2, the financial institution name entry field A3, the branch name entry field A4, the financial institution code entry field A5, the branch code entry field A6, the deposit type entry field A7, and the account number entry field A8.

The right side of the screen P1 shows, as examples of the image data of the reading items extracted from the image data, the name furigana reading item S1, the name kanji reading item S2, the financial institution name reading item S3, the branch name reading item S4, the financial institution code reading item S5, the branch code reading item S6, and the account number reading item S7.

In addition, as examples of the text data read from these reading items, name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7 are each displayed directly below the image data of the corresponding reading item. Displaying them this way makes it easy to compare the image data of each reading item with the text data that was read from it, and therefore easy to confirm that the text data has been generated correctly.

The image data and text data of the reading items may be displayed alongside the image data as shown in FIG. 5, or may be displayed in a separate screen (window) on top of the screen P1.

The likelihood calculation unit 234 calculates the likelihood of the text data for each reading item displayed on the screen P1. It does so by applying a predetermined operation to the likelihoods calculated for the individual characters that make up the text data, for example by multiplying the per-character likelihoods together. The likelihood of the text data may also be calculated in other ways, for example by taking the average of the per-character likelihoods.
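
The two aggregation options mentioned above, the product and the average of the per-character likelihoods, can be sketched in Python as follows. The function names and the sample values are assumptions for illustration; the per-character likelihoods are assumed to lie between 0 and 1.

```python
import math


def item_likelihood_product(char_likelihoods):
    """Likelihood of a reading item as the product of its per-character likelihoods."""
    if not char_likelihoods:
        raise ValueError("at least one character likelihood is required")
    return math.prod(char_likelihoods)


def item_likelihood_mean(char_likelihoods):
    """Alternative: likelihood of a reading item as the mean per-character likelihood."""
    if not char_likelihoods:
        raise ValueError("at least one character likelihood is required")
    return sum(char_likelihoods) / len(char_likelihoods)


# Usage with three hypothetical per-character likelihoods.
chars = [0.9, 0.8, 0.5]
by_product = item_likelihood_product(chars)  # about 0.36
by_mean = item_likelihood_mean(chars)        # about 0.733
```

For likelihoods between 0 and 1, the product can only shrink as characters are added, so long items score lower than short ones of similar quality, whereas the mean is independent of item length. Any thresholds applied to the item likelihood would need to be chosen with the aggregation method in mind.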

The relationship between character identification and likelihood in the OCR function of the character recognition unit 232 is described next. FIG. 6 is a schematic diagram showing feature extraction and vector conversion by the character recognition unit 232 of FIG. 2. FIG. 7 is a schematic diagram showing character type determination by the character recognition unit 232 of FIG. 2.

As shown in FIG. 6, the character recognition unit 232 performs feature extraction on the image data of a single segmented character pattern. It extracts features such as the direction components of the character's strokes and converts the image data into one vector in a feature space. The example in FIG. 6 schematically shows image data X being input to a multilayer neural network, which extracts features by capturing properties such as direction and position. It also schematically shows the result of the conversion into vectors X_1, X_2, and X_3.

Next, as shown in FIG. 7, the character recognition unit 232 determines the character type based on the converted vector. For example, it refers to dictionary data, built in advance from the distribution of a large number of patterns, that records where in the feature space each character type is distributed, and determines candidates for the image data, which is an unknown input pattern. The example in FIG. 7 conceptually shows dictionary data storing information on the character types "中", "申", and "十", and indicates that the plausibility of a character type is higher the farther it is from the origin.

Through the above process, the character recognition unit 232 obtains a plurality of text candidates (for example, 中, 申, and 十). A likelihood indicating the plausibility of each text candidate is then calculated for each character. The likelihood calculation unit 234 calculates the likelihood of the reading item from these per-character likelihoods. The likelihood of each text candidate can be calculated from the distance in the feature space between the center of that candidate and the image data, which is an unknown input pattern.
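A minimal sketch of distance-based candidate scoring is shown below. The source states only that the likelihood can be calculated from the distance between a candidate's center and the input pattern; the conversion used here (a softmax over negative Euclidean distance), the three-dimensional centroids, and all names are assumptions made for illustration.

```python
import math

# Hypothetical dictionary data: one center per character type in a toy 3-D feature space.
CENTROIDS = {
    "中": (1.0, 0.0, 1.0),
    "申": (1.0, 1.0, 1.0),
    "十": (0.0, 0.0, 1.0),
}


def candidate_likelihoods(feature_vector, centroids=CENTROIDS):
    """Score each candidate from its distance to the input vector.

    Assumption: likelihood = softmax(-distance), so a closer center scores
    higher and the scores sum to 1. The source does not specify this mapping.
    """
    distances = {c: math.dist(feature_vector, center) for c, center in centroids.items()}
    weights = {c: math.exp(-d) for c, d in distances.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


# An unknown input pattern that lies closest to the center of "中".
scores = candidate_likelihoods((0.9, 0.1, 1.0))
best = max(scores, key=scores.get)
```

Here `best` is the candidate whose center is nearest the input, and `scores[best]` would serve as that character's likelihood in the item-level calculation.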

The display change unit 235 changes the display mode of the text data displayed by the text data display unit 233, based on the likelihood of the text data calculated by the likelihood calculation unit 234. For example, it applies color to the right end of the frame of each text data item displayed on the screen P2 and varies that color according to the likelihood of the text data.

FIG. 8 is a schematic diagram showing an example in which color has been applied to the right end of the text data, starting from the state of FIG. 5 in which the image data and text data of the reading items are displayed in parallel. As shown in FIG. 8, name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7 displayed by the text data display unit 233 are provided with name furigana coloring portion T11, name kanji coloring portion T21, financial institution name coloring portion T31, branch name coloring portion T41, financial institution code coloring portion T51, branch code coloring portion T61, and account number coloring portion T71, respectively.

The name furigana coloring portion T11, name kanji coloring portion T21, financial institution name coloring portion T31, branch name coloring portion T41, financial institution code coloring portion T51, branch code coloring portion T61, and account number coloring portion T71 are colored according to the likelihoods, calculated by the likelihood calculation unit 234, of name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7. For example, a coloring portion is colored red when the likelihood is 0 to 0.6, orange when it is 0.6 to 0.8, and yellow when it is 0.8 or more. In addition, the higher the likelihood, the narrower the coloring portion is displayed. The mode is varied in this way because a higher likelihood of the text data means the reading item was converted to text data more accurately, so the variation lets the user grasp the likelihood of each reading item at a glance.
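The color bands above can be written as a small lookup. This is a sketch under stated assumptions: the source gives overlapping ranges (0 to 0.6, 0.6 to 0.8, 0.8 or more), so the treatment of exactly 0.6 as orange is an assumption, and the linear pixel-width scale is hypothetical; the source says only that a higher likelihood is drawn narrower.

```python
def coloring_for(likelihood):
    """Return (color, width_px) for an item-level likelihood in [0, 1]."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be in [0, 1]")
    if likelihood < 0.6:
        color = "red"
    elif likelihood < 0.8:  # assumption: exactly 0.6 falls in the orange band
        color = "orange"
    else:
        color = "yellow"
    # Hypothetical scale: 10 px at likelihood 0 down to 2 px at likelihood 1.
    width_px = round(10 - 8 * likelihood)
    return color, width_px
```

For example, `coloring_for(0.5)` yields a red bar that is wider than the yellow bar returned by `coloring_for(0.9)`.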

<Process flow>
An example of the character recognition method executed by the character recognition system 1 is described below with reference to FIG. 9. FIG. 9 is a flowchart showing the character recognition method performed by the character recognition system 1 shown in FIG. 1.

In step S101, the reading item setting unit 231 sets, for the image data stored in the image data DB 221, the reading items that are to be read to generate text data.

At this time, for example, only katakana is written in the name furigana reading item S1 shown in FIG. 4, and only digits are written in the financial institution code reading item S5, the branch code reading item S6, and the account number reading item S7, so these items can be set to restrict the character types of the generated text data. Likewise, the financial institution name reading item S3 contains the financial institution name corresponding to the value of the financial institution code reading item S5, and the branch name reading item S4 contains the branch name corresponding to the value of the branch code reading item S6, so these items can be set to select the matching text and value from a financial institution table such as the one shown in FIG. 10. The reading items that have been set are stored in the reading item DB 222.
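The two kinds of reading-item setting described above can be sketched as follows. Everything here is illustrative: the table contents stand in for FIG. 10 and are invented placeholders, the item keys reuse the reference signs only as labels, and the sketch checks recognized text after the fact, whereas the source describes restricting the character types the recognizer may generate. Treating digits as ASCII only is also an assumption.

```python
import re

KATAKANA = re.compile(r"[\u30A0-\u30FF]+")  # Unicode katakana block
DIGITS = re.compile(r"[0-9]+")

# Hypothetical per-item character-type restrictions.
ITEM_PATTERNS = {"S1": KATAKANA, "S5": DIGITS, "S6": DIGITS, "S7": DIGITS}

# Placeholder stand-in for the financial institution table of FIG. 10.
INSTITUTION_TABLE = {
    "9999": {"name": "Example Bank", "branches": {"001": "Main Branch"}},
}


def is_valid(item_id, text):
    """True if text satisfies the item's character-type restriction, if any."""
    pattern = ITEM_PATTERNS.get(item_id)
    return True if pattern is None else pattern.fullmatch(text) is not None


def lookup_names(institution_code, branch_code, table=INSTITUTION_TABLE):
    """Return (institution name, branch name) for the codes, or None if unknown."""
    institution = table.get(institution_code)
    if institution is None:
        return None
    branch = institution["branches"].get(branch_code)
    if branch is None:
        return None
    return institution["name"], branch
```

With such a table, the name items S3 and S4 can be filled from the recognized codes S5 and S6 rather than relying only on recognizing handwritten kanji.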

In step S102, the character recognition unit 232 reads out the image data stored in the image data DB 221, reads by OCR the character information displayed in the reading items stored in the reading item DB 222, and generates text data. Specifically, for example, feature extraction is performed on the image data of a character pattern, features such as the direction components of the character's strokes are extracted and converted into one vector in the feature space, and a character type with a high likelihood is determined by referring to the dictionary data that records where in the feature space each character type is distributed.

In step S103, the text data display unit 233 extracts the image data of the reading items stored in the reading item DB 222 from the image data stored in the image data DB 221, reads out the text data stored in the text data DB 223, and displays the image data and the text data of the reading items in parallel, as on the screen P1 shown in FIG. 5.

In the example of FIG. 5, the following pairs are each displayed in parallel: name furigana reading item S1 and name furigana text T1, name kanji reading item S2 and name kanji text T2, financial institution name reading item S3 and financial institution name text T3, branch name reading item S4 and branch name text T4, financial institution code reading item S5 and financial institution code text T5, branch code reading item S6 and branch code text T6, and account number reading item S7 and account number text T7.

In step S104, the likelihood calculation unit 234 multiplies together the likelihoods calculated for the individual characters constituting the text data stored in the text data DB 223, thereby calculating the likelihood of the reading item.

In step S105, based on the likelihood of the text data calculated by the likelihood calculation unit 234, the display change unit 235 changes the color and width of name furigana coloring portion T11, name kanji coloring portion T21, financial institution name coloring portion T31, branch name coloring portion T41, financial institution code coloring portion T51, branch code coloring portion T61, and account number coloring portion T71, which are provided on name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7, respectively, as shown in FIG. 8. For example, a coloring portion is colored red when the calculated likelihood of the text data is 0 to 0.6, orange when it is 0.6 to 0.8, and yellow when it is 0.8 or more.

As described above, the character recognition system according to this embodiment multiplies together the likelihoods calculated for all of the characters constituting the text data to calculate the likelihood of the text data of that reading item. This makes it possible to evaluate the reading reliability of each reading item from the likelihood of the text data that was read.

In addition, the display mode can be varied based on the likelihood of the text data of each reading item, for example by applying color to the right end of the frame of the displayed text data and varying that color according to the likelihood, so the likelihood of the text data can be grasped at a glance. Because the reliability of each reading item is visible at a glance, the user can, for example, concentrate on checking only the reading items with low reliability, which contributes to more efficient visual checking.

(Embodiment 2)
<Configuration>
FIG. 11 is a schematic diagram showing an example of a state in which the image data and text data of reading items extracted from image data are displayed in parallel in the character recognition system according to Embodiment 2 of the present disclosure. This character recognition system 1 has the same configuration as Embodiment 1, but differs in how the display change unit 235 changes the display mode of the text data displayed by the text data display unit 233. Embodiment 2 differs from Embodiment 1 in that, when the likelihood of the text data of a reading item calculated by the likelihood calculation unit 234 is equal to or less than a predetermined threshold (first threshold), the display change unit 235 does not display the text data of that reading item (the field is displayed blank).

As shown in FIG. 11, the screen P1 displays, as in FIG. 5 of Embodiment 1, name furigana reading item S1, name kanji reading item S2, financial institution name reading item S3, branch name reading item S4, financial institution code reading item S5, branch code reading item S6, account number reading item S7, name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7. However, the text data of name furigana text T1 and name kanji text T2 is not displayed, and those fields are blank. In this example, the likelihood of the text data of name furigana text T1 and name kanji text T2 is equal to or less than the predetermined threshold, so the display change unit 235 does not display it.

Text data at or below the predetermined threshold is not displayed because its likelihood is low and it is not considered to have sufficient reliability, so a person has to check it visually and re-enter it by hand in any case. Not displaying the text data from the outset therefore omits an unnecessary visual check and allows characters to be entered efficiently. The remaining configuration and process flow are the same as in Embodiment 1.

According to this embodiment, in addition to the effects of Embodiment 1, the display change unit 235 does not display the text data when the likelihood calculated by the likelihood calculation unit 234 is equal to or less than the predetermined threshold, so the display of text data can be omitted for reading items that lack sufficient reliability. This omits unnecessary visual checks and allows characters to be entered efficiently.
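The blanking rule of this embodiment reduces to a single comparison, sketched below. The function name and the default threshold value of 0.6 are assumptions; the source does not give a numeric value for the first threshold.

```python
def displayed_text(text, likelihood, first_threshold=0.6):
    """Return the string to show in the text field for one reading item.

    At or below the first threshold the field is left blank so that the
    operator types the value in directly instead of checking a doubtful result.
    """
    return "" if likelihood <= first_threshold else text
```

Note that the comparison is inclusive ("equal to or less than"), so an item whose likelihood equals the threshold is also blanked.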

(Embodiment 3)
<Configuration>
FIG. 12 is a schematic diagram showing an example of a state in which the image data and text data of reading items extracted from image data are displayed in parallel in the character recognition system according to Embodiment 3 of the present disclosure. This character recognition system 1 has the same configuration as Embodiment 1, but differs in how the display change unit 235 changes the display mode of the text data displayed by the text data display unit 233. Embodiment 3 differs from Embodiment 1 in that the display change unit 235 not only colors the coloring portion provided in each text display field according to the likelihood of the text data of each reading item, but also changes the mode of each character according to the likelihood of that character. The mode of a character is changed by, for example, changing its color, its font (typeface or weight), or its size.

As shown in FIG. 12, the screen P1 displays, as in FIG. 5 of Embodiment 1, name furigana reading item S1, name kanji reading item S2, financial institution name reading item S3, branch name reading item S4, financial institution code reading item S5, branch code reading item S6, account number reading item S7, name furigana text T1, name kanji text T2, financial institution name text T3, branch name text T4, financial institution code text T5, branch code text T6, and account number text T7. However, within "トッキョタロウ" displayed in name furigana text T1, the parts "ト", "ッキョ", and "タロウ" are displayed in different colors. Likewise, within "特許太郎" displayed in name kanji text T2, the parts "特許" and "太郎" are displayed in different weights. For example, each character is displayed in bold red when its likelihood is 0 to 0.6, in normal-weight red when it is 0.6 to 0.8, and in normal-weight black when it is 0.8 or more.
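Per-character styling of this kind can be sketched by mapping each character's likelihood to a style and merging neighbors that share one. The style bands follow the example above; the boundary handling at exactly 0.6 and 0.8, the function names, and the likelihood values in the usage example are assumptions.

```python
from itertools import groupby


def char_style(likelihood):
    """Map a per-character likelihood to (color, bold)."""
    if likelihood < 0.6:
        return ("red", True)    # bold red
    if likelihood < 0.8:
        return ("red", False)   # normal-weight red
    return ("black", False)     # normal-weight black


def style_runs(text, likelihoods):
    """Split text into runs of consecutive characters that share a style."""
    if len(text) != len(likelihoods):
        raise ValueError("one likelihood is required per character")
    styled = [(ch, char_style(p)) for ch, p in zip(text, likelihoods)]
    runs = []
    for style, group in groupby(styled, key=lambda pair: pair[1]):
        runs.append(("".join(ch for ch, _ in group), style))
    return runs


# Hypothetical likelihoods chosen to reproduce the three runs described for T1.
runs = style_runs("トッキョタロウ", [0.5, 0.7, 0.7, 0.7, 0.9, 0.9, 0.9])
```

Grouping into runs keeps the rendering step simple: each run can be emitted as one styled span instead of one span per character.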

Each character is displayed in a mode that depends on its likelihood because the likelihood can differ from character to character within the same text data, and this lets the user grasp the likelihood of each character at a glance. The remaining configuration and process flow are the same as in Embodiment 1.

According to this embodiment, in addition to the effects of Embodiment 1, the mode of each character is changed according to the likelihood of that character in the text data, so the likelihood of each character can be grasped at a glance. This makes it possible to evaluate the reading reliability of each character.

(Embodiment 4)
<Configuration>
FIG. 13 is a schematic diagram showing an example of a state in which the image data and text data of reading items extracted from image data are displayed in parallel in the character recognition system according to Embodiment 4 of the present disclosure. This character recognition system 1 has the same configuration as Embodiment 1, but differs in how the display change unit 235 changes the display mode of the text data displayed by the text data display unit 233. Embodiment 4 differs from Embodiment 1 in that, when the likelihood of the text data calculated by the likelihood calculation unit 234 is equal to or greater than a predetermined threshold (second threshold), the display change unit 235 displays neither the image data nor the text data of that reading item.

As shown in FIG. 13, the screen P1 displays name furigana reading item S1, name kanji reading item S2, account number reading item S7, name furigana text T1, name kanji text T2, and account number text T7, but financial institution name reading item S3, branch name reading item S4, financial institution code reading item S5, branch code reading item S6, financial institution name text T3, branch name text T4, financial institution code text T5, and branch code text T6 shown in FIG. 5 are not displayed. In this example, the likelihood of the text data of financial institution name text T3, branch name text T4, financial institution code text T5, and branch code text T6 is equal to or greater than the predetermined threshold, so the display change unit 235 does not display them.

The image data and text data of reading items at or above the predetermined threshold are not displayed because the likelihood of this text data is high and it is considered to have sufficient reliability, so checking by a person can be omitted. Leaving these items off the screen P2 and displaying only the text data that a person may need to check and correct by hand allows characters to be entered efficiently. The remaining configuration and process flow are the same as in Embodiment 1.

According to this embodiment, in addition to the effects of Embodiment 1, the display change unit 235 displays neither the image data nor the text data of a reading item when the likelihood of the text data calculated by the likelihood calculation unit 234 is equal to or greater than the predetermined threshold, so checking of the text data can be omitted for reading items that have sufficient reliability. This allows characters to be entered efficiently.
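When the second threshold of this embodiment is used together with the first threshold of Embodiment 2, as the claims do, each reading item falls into one of three display modes. The sketch below illustrates that decision; the enum, the function name, and both default threshold values are assumptions, since the source gives no numeric thresholds.

```python
from enum import Enum


class Display(Enum):
    BLANK_TEXT = "blank_text"  # show the image, leave the text field empty
    HIDE_ITEM = "hide_item"    # show neither image nor text
    STYLED = "styled"          # show both, styled by likelihood


def display_mode(likelihood, first_threshold=0.6, second_threshold=0.95):
    """Choose the display mode for one reading item from its likelihood."""
    if first_threshold >= second_threshold:
        raise ValueError("first_threshold must be below second_threshold")
    if likelihood <= first_threshold:
        return Display.BLANK_TEXT
    if likelihood >= second_threshold:
        return Display.HIDE_ITEM
    return Display.STYLED
```

Only items in the middle band reach the color and width styling of Embodiment 1, which is where a visual comparison against the image is most useful.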

As another embodiment, the scanner device may be connected directly to the character recognition device so that the character recognition device operates standalone, without going through a network. This is useful when the character recognition device should not be connected to a network in order to prevent information from leaking outside.

Embodiments according to the disclosure have been described above, but they can be implemented in various other forms, with various omissions, substitutions, and changes. The configurations described in Embodiments 1 to 4 can also be combined. These embodiments and their modifications, including those with omissions, substitutions, and changes, fall within the technical scope of the claims and their equivalents.

1 character recognition system, 100 user system, 110 scanner device, 120 user terminal, 200 character recognition device, 210 communication unit, 220 storage unit, 221 image data DB, 222 reading item DB, 223 text data DB, 230 control unit, 231 reading item setting unit, 232 character recognition unit, 233 text data display unit, 234 likelihood calculation unit, 235 display change unit, NW network

Claims

A character recognition device that reads character information from image data obtained by scanning a document as an image, comprising:
a character recognition unit that performs character recognition on the character information displayed in a reading item set in the image data of the document and generates text data;
a text data display unit that displays the image data of the reading item and the text data so that they can be compared for each reading item, and makes the text data editable by character input to the text data;
a likelihood calculation unit that calculates the likelihood of the character recognition for each reading item; and
a display change unit that changes the mode of the location where the text data is displayed, based on the likelihood for each reading item,
wherein the display change unit changes the text data display field of a reading item whose likelihood is equal to or less than a predetermined first threshold to a setting in which no character string is displayed, does not display the image data and the text data of a reading item whose likelihood is equal to or greater than a predetermined second threshold, and, when the likelihood for a reading item is greater than the first threshold and less than the second threshold, changes the mode of the location where the text data of that reading item is displayed according to the likelihood for that reading item.

The character recognition device according to claim 1, wherein the text data display unit extracts the image data of the reading item and displays it in parallel with the text data, together with the image data of the document.

The character recognition device according to claim 1, further comprising a reading item setting unit that sets the reading item from image data of the document.

The character recognition device according to any one of claims 1 to 3, wherein the character recognition unit generates the text data and calculates a likelihood for each character constituting the text data, and
the likelihood calculation unit calculates the likelihood for each reading item by performing a predetermined operation on the likelihood for each character.

The character recognition device according to claim 4, wherein the likelihood calculation unit calculates the likelihood for each reading item by multiplying the likelihoods for the individual characters.

The character recognition device according to claim 4 or 5, wherein the display change unit displays each character in a different mode according to the likelihood for that character.

The character recognition device according to any one of claims 1 to 6, wherein the display change unit displays the location where the text data of the reading item is displayed in a different color according to the likelihood for each reading item.

A character recognition method for reading character information from image data obtained by scanning a document as an image, comprising:
a character recognition step, performed by a character recognition unit, of performing character recognition on the character information displayed in a reading item set in the image data of the document and generating text data;
a text data display step, performed by a text data display unit, of displaying the image data of the reading item and the text data so that they can be compared for each reading item, and making the text data editable by character input to the text data;
a likelihood calculation step, performed by a likelihood calculation unit, of calculating the likelihood of the character recognition for each reading item; and
a display change step, performed by a display change unit, of changing the mode of the location where the text data is displayed, based on the calculated likelihood,
wherein, in the display change step, the text data display field of a reading item whose likelihood is equal to or less than a predetermined first threshold is changed to a setting in which no character string is displayed, the image data and the text data of a reading item whose likelihood is equal to or greater than a predetermined second threshold are not displayed, and, when the likelihood for a reading item is greater than the first threshold and less than the second threshold, the mode of the location where the text data of that reading item is displayed is changed according to the likelihood for that reading item.

A character recognition program for reading character information from image data obtained by scanning a document as an image, the program causing a computer to execute:
a character recognition step of performing character recognition on the character information displayed in a reading item set in the image data of the document and generating text data;
a text data display step of displaying the image data of the reading item and the text data so that they can be compared for each reading item, and making the text data editable by character input to the text data;
a likelihood calculation step of calculating the likelihood of the character recognition for each reading item; and
a display change step of changing the mode of the location where the text data is displayed, based on the calculated likelihood,
wherein, in the display change step, the text data display field of a reading item whose likelihood is equal to or less than a predetermined first threshold is changed to a setting in which no character string is displayed, the image data and the text data of a reading item whose likelihood is equal to or greater than a predetermined second threshold are not displayed, and, when the likelihood for a reading item is greater than the first threshold and less than the second threshold, the mode of the location where the text data of that reading item is displayed is changed according to the likelihood for that reading item.