JP2020046860A - Form reading apparatus

Info

Publication number: JP2020046860A
Application number: JP2018174013A
Authority: JP
Inventors: 央典青島; Hisanori Aoshima; 雄一黒田; Yuichi Kuroda; 真一長田; Shinichi Osada
Original assignee: MUFG Bank Ltd
Current assignee: MUFG Bank Ltd
Priority date: 2018-09-18
Filing date: 2018-09-18
Publication date: 2020-03-26

Abstract

To provide a form reading apparatus capable of extracting desired information from filled-out forms. SOLUTION: One embodiment of the present invention relates to a form reading apparatus that comprises an extraction unit for extracting predetermined information from filled-out forms using a learned model. SELECTED DRAWING: Figure 3

Description

The present invention relates generally to financial technology, and more particularly to a form reading system.

Today, thanks to advances in FinTech, bank customers can carry out many financial transactions, such as deposits, withdrawals, transfers, and foreign exchange transactions, electronically through Internet banking and similar services. On the other hand, some financial transactions are still carried out by having the customer fill in and affix a seal to paper documents or forms. For example, an account transfer request form, used to periodically debit utility charges, credit card charges, purchase prices of goods and services, and the like from a customer account, is a typical example of a paper-form-based financial transaction.

An account transfer request form filled in and sealed by the customer is sent to the bank holding the customer's account, where checking the entries for deficiencies, verifying the seal impression, entering the data into the bank's internal system, and so on are performed mainly by hand. At a megabank, for example, an enormous number of account transfer request forms are processed by bank staff every day, and there is strong demand to automate and streamline the clerical work involved in processing them. For this purpose, form recognition methods are known that, for example, rapidly extract pairs of item names and item values from a form image containing many character strings.

Japanese Unexamined Patent Application Publication No. 2012-208589 (JP 2012-208589 A)

However, while a form recognition method based on a conventional character recognition model can read information from a specific type of form, no form recognition method is known that can read desired information from many types of forms. In actual clerical processing, each service (utility charges, credit card charges, purchase prices, and so on) has its own account transfer request form with a different format and layout; at a megabank, for example, the number of types of account transfer request forms handled reaches several thousand or more. An extraction model that can collectively read multiple types of forms with different formats and layouts is therefore desired.

Account transfer request forms are thus prepared in a different format and layout for each consignor or the like, and the locations where the information needed to execute an account transfer is written, such as consignor identification information (a consignor code, etc.), customer account information, and seal impression information, also vary. In addition to handwritten characters and digits, account transfer request forms contain various other types of information, such as check marks and circles used to select applicable items, and automating the clerical processing requires recognizing these as well. This is difficult to handle with a conventional character recognition model.

In view of the problems described above, an object of the present invention is to provide a form reading apparatus capable of extracting desired information from a filled-out form.

To solve the above problem, one aspect of the present invention relates to a form reading apparatus having an extraction unit that extracts predetermined information from a filled-out form using a learned model.

According to the present invention, desired information based on various types of entries can be extracted from forms of various formats and layouts.

FIG. 1 is a schematic diagram showing a form reading system according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the hardware configuration of a form reading apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the functional configuration of a form reading apparatus according to an embodiment of the present invention.
FIG. 4 is a schematic diagram showing a neural network structure according to an embodiment of the present invention.
FIG. 5 is a flowchart showing a model learning process according to an embodiment of the present invention.
FIG. 6 is a flowchart showing a form reading process using a learned model according to an embodiment of the present invention.

Embodiments of the present invention are described below with reference to the drawings.

The following embodiments disclose a form reading apparatus that extracts various types of information from filled-out forms. In outline, the form reading apparatus extracts information from an input form using a learned model, such as a neural network, that takes filled-out forms of multiple types as input and outputs predetermined information based on what is written on the form. For example, on receiving an account transfer request form as input, the form reading apparatus uses the learned model to extract the predetermined information needed to execute the account transfer, such as the consignor code, account information (account number, etc.), and seal impression information on that form.

Although the following embodiments are described mainly with respect to account transfer request forms, the forms covered by the present invention are not limited to these, and the invention may be applied to any other form filled in and/or sealed on paper. Likewise, although the following embodiments focus on information related to financial services, such as the entries on an account transfer request form, as the information to be extracted, the information to be extracted is not limited to this and may be any other entry readable from a form, such as personal information (name, address, date of birth, etc.) used in public services.

First, a form reading system according to an embodiment of the present invention is described with reference to FIG. 1. FIG. 1 is a schematic diagram showing the form reading system according to an embodiment of the present invention.

As shown in FIG. 1, the form reading system 10 includes a user terminal 50, a learning device 60, a storage 70, and a form reading apparatus 100.

The user terminal 50 is typically a terminal for providing various instructions and/or data to the form reading apparatus 100, and may be any information processing apparatus with a communication function, such as a personal computer.

The learning device 60 trains a model using teacher data consisting of pairs of a form and the information to be extracted. For example, when the model to be trained is a neural network, the learning device 60 inputs image information representing a training form into the neural network and updates the parameters of the network, for example by backpropagation, based on the error between the output and the training target information. When a predetermined termination condition is satisfied, the learning device 60 stores the neural network with the updated parameters in the storage 70 as a learned model. The learned model may be trained separately for each individual item of information to be extracted, or for all or some of the items together.

For example, when extracting a consignor code from filled-out account transfer request forms of multiple types, there are many variations depending on the format of the form: the consignor's name may be preprinted on the form, the customer may write the consignor by hand, or the customer may indicate a particular consignor by hand-marking its field among the selectable consignors (for example, by writing a check mark or circling it). The layout region that identifies the consignor also varies with the layout of the form. To obtain a neural network that extracts consignor information, consignor codes, and the like from such a wide variety of forms, the learning device 60 applies appropriate preprocessing (sorting by consignor code, normalization, data compression, ruled line extraction, etc.) to teacher data consisting of pairs of a filled-out account transfer request form and a consignor code, provided from the user terminal 50 or the like, and uses the preprocessed teacher data to train the neural network according to any suitable learning algorithm (for example, ResNet, Xception, or SENet).
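As an illustration of two of the preprocessing steps named above (sorting by consignor code and normalization), the following is a minimal Python sketch. The data layout (an image as a list of rows of 8-bit grayscale values), the function names `normalize` and `preprocess`, and the sample codes are assumptions made for this example and do not come from the specification.

```python
from collections import defaultdict


def normalize(image):
    """Scale 8-bit grayscale pixel values (0-255) to the range [0.0, 1.0]."""
    return [[pixel / 255.0 for pixel in row] for row in image]


def preprocess(teacher_data):
    """Group (image, consignor_code) pairs by code and normalize each image.

    `teacher_data` is an iterable of (image, code) pairs. This layout is an
    assumption of the sketch, not something the specification defines.
    """
    grouped = defaultdict(list)
    for image, code in teacher_data:
        grouped[code].append(normalize(image))
    # Emit the groups in sorted code order so the training set is reproducible.
    return {code: grouped[code] for code in sorted(grouped)}


# Hypothetical teacher data: tiny one-row "images" paired with made-up codes.
samples = [([[0, 255]], "B002"), ([[51, 102]], "A001"), ([[255, 255]], "B002")]
by_code = preprocess(samples)
```

Grouping by code makes it easy to check how many examples each consignor contributes before training, which matters when some forms are far rarer than others.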

Similarly, when extracting the location of the account information and the account information itself from filled-out account transfer request forms of multiple types, there are many variations depending on the format of the form, and the layout region in which the account information is entered also varies with the layout of the form. Moreover, account numbers and the like are typically entered in boxes delimited by ruled lines, so a model that can properly distinguish the ruled lines from the entries is required. To obtain a neural network that extracts the location of the account information and the account information from such a wide variety of forms, the learning device 60 applies appropriate preprocessing (contour extraction for ruled line extraction, straight-line approximation of contours, box extraction, etc.) to teacher data consisting of pairs of a filled-out account transfer request form and either the location of the account information or the account information, provided from the user terminal 50 or the like, and uses the preprocessed teacher data to train the neural network according to any suitable learning algorithm (for example, ResNet, Xception, or SENet).
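To make the idea of separating ruled lines from entries concrete, here is a minimal Python sketch. It does not implement the contour extraction and straight-line approximation named in the text; it substitutes a simpler projection technique (a row or column counts as a ruled line when most of its pixels are ink) and then derives the boxes between adjacent lines. The function names, the 0/1 image layout, and the `min_fill` threshold are assumptions of this example.

```python
def find_ruled_lines(binary, min_fill=0.8):
    """Return (row_indices, col_indices) of ruled lines in a binary image.

    `binary` is a list of rows of 0/1 values (1 = ink). A row or column is
    treated as a ruled line when at least `min_fill` of its pixels are ink.
    """
    height, width = len(binary), len(binary[0])
    rows = [y for y in range(height) if sum(binary[y]) >= min_fill * width]
    cols = [
        x
        for x in range(width)
        if sum(binary[y][x] for y in range(height)) >= min_fill * height
    ]
    return rows, cols


def grid_cells(rows, cols):
    """Return inclusive (top, left, bottom, right) interiors between lines."""
    cells = []
    for top, bottom in zip(rows, rows[1:]):
        for left, right in zip(cols, cols[1:]):
            # Skip pairs of directly adjacent line pixels: no interior there.
            if bottom - top > 1 and right - left > 1:
                cells.append((top + 1, left + 1, bottom - 1, right - 1))
    return cells


# A 5x7 toy form: a border, one inner vertical line, and one stray ink pixel
# (standing in for a handwritten stroke) inside the left box.
form = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
line_rows, line_cols = find_ruled_lines(form)
boxes = grid_cells(line_rows, line_cols)
```

The stray pixel does not register as a line because its row and column fall below the fill threshold, which is the distinction between ruled lines and entries that the paragraph calls for. A projection approach assumes the scan is not skewed; a contour-based method would be more tolerant of rotation.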

Furthermore, seals are often affixed in various orientations, so when extracting seal impression information it is desirable that the extraction not depend on orientation. For example, the learning device 60 applies appropriate preprocessing (contour extraction for ruled line extraction, straight-line approximation of contours, box extraction, etc.) to teacher data consisting of pairs of a filled-out account transfer request form and the seal location, provided from the user terminal 50 or the like, and uses the preprocessed teacher data to train the neural network according to any suitable learning algorithm (for example, ResNet, Xception, SENet, or SSD). Once the learned model identifies the seal location, the form reading apparatus 100 can extract the image at that location, and the form reading apparatus 100 or another verification device (not shown) can automatically compare the extracted image against the customer's registered seal impression.
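The step of extracting the image at the identified seal location can be sketched as a simple crop. The bounding-box format (inclusive top, left, bottom, right) and the function name `crop_region` are assumptions of this example; the specification does not state how the learned model represents the seal location.

```python
def crop_region(image, box):
    """Return the sub-image inside `box` = (top, left, bottom, right), inclusive.

    Coordinates are clamped to the image bounds so that a predicted box that
    slightly overshoots the page still yields a valid crop.
    """
    height, width = len(image), len(image[0])
    top, left, bottom, right = box
    top, left = max(0, top), max(0, left)
    bottom, right = min(height - 1, bottom), min(width - 1, right)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# A 3x4 toy image whose pixel values encode their own (row, column) position.
page = [[r * 10 + c for c in range(4)] for r in range(3)]
seal_image = crop_region(page, (1, 1, 2, 2))
```

The cropped image would then be handed to whatever verification step compares it with the registered seal impression; that comparison is outside the scope of this sketch.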

The storage 70 stores the learned model used by the form reading apparatus 100. For example, the learned model is a machine learning model, such as a neural network, that has been trained by the learning device 60 and extracts predetermined information (for example, the consignor code, the location of the account information, the account information, the location of the seal impression, and the seal impression information) from filled-out account transfer request forms of multiple types.

As described in detail below, in response to an instruction from the user terminal 50 to start the form reading process, the form reading apparatus 100 executes the form reading process on image information generated from a form read by a scanner (not shown) or the like.

The form reading apparatus 100 and the learning device 60 are typically implemented by a server and may have, for example, the hardware configuration shown in FIG. 2. That is, each includes a drive device 101, an auxiliary storage device 102, a memory device 103, a CPU (Central Processing Unit) 104, an interface device 105, and a communication device 106, interconnected via a bus B.

Various computer programs, including programs that implement the functions and processes of the form reading apparatus 100 and the learning device 60 described below, may be provided on a recording medium 107 such as a CD-ROM (Compact Disk-Read Only Memory). When the recording medium 107 storing a program is set in the drive device 101, the program is installed from the recording medium 107 into the auxiliary storage device 102 via the drive device 101. The program need not be installed from the recording medium 107, however, and may instead be downloaded from any external device via a network or the like. The auxiliary storage device 102 stores the installed program as well as necessary files and data. When an instruction to start the program is issued, the memory device 103 reads the program and data from the auxiliary storage device 102 and stores them. The CPU 104, which functions as a processor, executes the functions and processes of the form reading apparatus 100 and the learning device 60 described below in accordance with the program stored in the memory device 103 and various data such as the parameters needed to execute it. The interface device 105 is used as a communication interface for connecting to a network or an external device. The communication device 106 executes various communication processes for communicating with external devices. The form reading apparatus 100 and the learning device 60 are not limited to the hardware configuration described above, however, and may be implemented by any other suitable hardware configuration.

Next, a form reading apparatus according to an embodiment of the present invention is described with reference to FIG. 3. FIG. 3 is a block diagram showing the functional configuration of the form reading apparatus according to an embodiment of the present invention.

As shown in FIG. 3, the form reading apparatus 100 includes an interface unit 110 and an extraction unit 120.

On acquiring a filled-out form to be processed, the interface unit 110 passes the form to the extraction unit 120 and obtains predetermined information from the extraction unit 120. Specifically, the interface unit 110 acquires image information of forms of multiple types, generated by scanning the forms with a scanner or the like, and passes the acquired image information to the extraction unit 120. After the extraction unit 120 has performed extraction on the image information of the form using the learned model, the interface unit 110 obtains the extracted information from the extraction unit 120 and sends it, for example, to the user terminal 50. On receiving the information, the user terminal 50 can store and display the predetermined information extracted from the form.

The extraction unit 120 extracts predetermined information from a filled-out form using the learned model. Specifically, the extraction unit 120 inputs the image information of forms of multiple types provided by the interface unit 110 into a machine learning model, such as a neural network, trained by the learning device 60, and obtains the predetermined information about the form output by the learned model. For example, when the form is an account transfer request form, the extraction unit 120 inputs the image information of the account transfer request form provided by the interface unit 110 into the learned model and obtains as output predetermined information such as the consignor code, the location of the account information, the account information (account number, etc.), the location of the seal impression, and the seal impression information. The extraction unit 120 sends the information obtained in this way to the user terminal 50.
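The division of labor between the interface unit 110 and the extraction unit 120 can be sketched as two small Python classes. The class names, method names, and the stub model with its fixed output are hypothetical and exist only to show the data flow; a real learned model would replace `stub_model`.

```python
from typing import Any, Callable, Dict, List

Image = List[List[float]]
Model = Callable[[Image], Dict[str, Any]]


class ExtractionUnit:
    """Applies a learned model to a form image (role of extraction unit 120)."""

    def __init__(self, model: Model) -> None:
        self.model = model

    def extract(self, image: Image) -> Dict[str, Any]:
        return self.model(image)


class InterfaceUnit:
    """Receives form images and returns results (role of interface unit 110)."""

    def __init__(self, extractor: ExtractionUnit) -> None:
        self.extractor = extractor

    def handle(self, image: Image) -> Dict[str, Any]:
        # Pass the scanned image to the extraction unit and collect the result.
        # In the described system the result would then go to the user terminal.
        return self.extractor.extract(image)


def stub_model(image: Image) -> Dict[str, Any]:
    """Hypothetical stand-in for a learned model; returns made-up values."""
    return {"consignor_code": "A001", "account_number": "1234567"}


reader = InterfaceUnit(ExtractionUnit(stub_model))
result = reader.handle([[0.0, 1.0], [1.0, 0.0]])
```

Keeping the model behind a plain callable means the learned model loaded from storage can be swapped without touching either unit.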

In one embodiment, the learned model may include an image feature extraction model that extracts image features from an image of the form, a ruled line feature extraction model that extracts ruled line features from the ruled lines preprinted on the form, and a target information extraction model that extracts the predetermined information from the image features and the ruled line features. Specifically, as shown in FIG. 4, the learned model may have a multimodal neural network structure that separately processes two types of input, the form image and the ruled lines printed on the form, and may include an image feature extraction model that extracts image features from the form image and a ruled line feature extraction model that extracts ruled line features from a ruled line image. As preprocessing, the learning device 60 generates a form image and a ruled line image from filled-out forms of multiple types, inputs the form image into the image feature extraction model to obtain image features, and inputs the ruled line image into the ruled line feature extraction model to obtain ruled line features. The ruled line image need not be generated from a filled-out form and may instead be generated from the corresponding blank form. The learning device 60 inputs the obtained image features and ruled line features into the target information extraction model and obtains output information as the predetermined information (for example, the consignor code, the location of the account information, the account information, the location of the seal impression, and the seal impression information). The learning device 60 trains these models by comparing the obtained output information with the teacher data and updating the parameters of the image feature extraction model, the ruled line feature extraction model, and the target information extraction model according to the error. The learned model is not limited to extracting ruled lines and may include a region extraction model that extracts a predetermined graphic region, such as a rectangular region, preprinted on the form.
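The data flow through the three sub-models can be sketched as a forward pass in plain Python. This is only a sketch under several assumptions: each sub-model is reduced to a single affine layer (a real system would more plausibly use convolutional networks), the two feature vectors are combined by concatenation (the specification does not say how they are combined), and the parameter names and values are invented for illustration. Training, in which all three sets of parameters are updated from the output error, is not shown.

```python
def linear(weights, bias, x):
    """Affine map: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def forward(params, form_pixels, ruled_line_pixels):
    """Run both feature branches, fuse them, and apply the target head."""
    image_features = relu(linear(*params["image"], form_pixels))
    line_features = relu(linear(*params["ruled_line"], ruled_line_pixels))
    fused = image_features + line_features  # list concatenation (assumed fusion)
    return linear(*params["target"], fused)


# Hypothetical parameters: each entry is a (weights, bias) pair.
params = {
    "image": ([[1.0, -1.0]], [0.0]),
    "ruled_line": ([[0.5, 0.5]], [-1.0]),
    "target": ([[2.0, 3.0]], [1.0]),
}
output = forward(params, [3.0, 1.0], [1.0, 3.0])
```

Feeding the ruled line image through its own branch lets the target head see the printed layout separately from the handwriting, which is the motivation the paragraph gives for the two-input structure.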

In another embodiment, the learned model may include a form type determination model that determines the form type from an image of the form, and a target information extraction model for each form type. Specifically, the learning device 60 trains the form type determination model using teacher data consisting of pairs of a filled-out form and the type of that form, prepares a target information extraction model for each form type, and trains each target information extraction model using teacher data consisting of pairs of a filled-out form of the specific type and the predetermined information. On acquiring filled-out forms of multiple types, the extraction unit 120 first inputs each form into the form type determination model to identify its type. The extraction unit 120 then inputs the filled-out form into the target information extraction model corresponding to the identified form type and extracts the predetermined information (the location of the account information, the account information, the location of the seal impression, the seal impression information, etc.). In this embodiment a target information extraction model must be prepared for each form type, but when, for example, the format or layout of a form changes, only the form type determination model and the target information extraction model for the changed form need to be retrained, which makes the system easy to maintain. In addition, because each target information extraction model is specialized for a specific format and layout, extraction accuracy is expected to improve.
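The two-stage flow described above (determine the form type, then dispatch to the extraction model for that type) can be sketched in Python as follows. The function name `read_form`, the form type labels, and the stub models are hypothetical; both stages would be learned models in the described system.

```python
def read_form(image, classify_form_type, extractors):
    """Identify the form type, then run the extractor registered for it.

    `classify_form_type` maps an image to a type label, and `extractors`
    maps each label to a callable that returns the extracted fields.
    """
    form_type = classify_form_type(image)
    if form_type not in extractors:
        raise KeyError(f"no extraction model registered for form type {form_type!r}")
    return form_type, extractors[form_type](image)


# Hypothetical stubs standing in for the learned models.
def classify(image):
    return "utility" if image[0][0] == 1 else "card"


extractors = {
    "utility": lambda image: {"account_number": "1234567"},
    "card": lambda image: {"account_number": "7654321"},
}

form_type, fields = read_form([[1]], classify, extractors)

# When one form's layout changes, only its entry is replaced (after
# retraining); the other per-type models are left untouched.
extractors["card"] = lambda image: {"account_number": "0000001"}
```

The dictionary makes the maintenance property visible: a layout change touches one entry plus the classifier, rather than a single monolithic model.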

Next, a model learning process according to an embodiment of the present invention is described with reference to FIG. 5. FIG. 5 is a flowchart showing the model learning process according to an embodiment of the present invention. The model learning process is executed by the learning device 60, more specifically by the processor of the learning device 60.

As shown in FIG. 5, in step S101 the learning device 60 obtains, from the user terminal 50 or the like, teacher data consisting of pairs of a filled-out form and the information to be extracted (for example, the consignor code, the location of the account information, the account information, the location of the seal impression, and the seal impression information), covering forms of multiple types. For example, the learning device 60 trains a model for each type of information to be extracted.

In step S102, the learning device 60 inputs a filled-out training form into the model being trained. The learning device 60 may apply appropriate preprocessing to the filled-out training form and input the preprocessed form into the model.

In step S103, the learning device 60 obtains output data from the model being trained.

In step S104, the learning device 60 compares the training target information with the output data and, based on the error between them, updates the parameters of the model being trained by backpropagation or the like.

The learning device 60 repeats steps S101 to S104 for each item of teacher data until a termination condition is satisfied, and stores the model finally obtained in the storage 70 as the learned model. The termination condition may be, for example, that a predetermined number of items of teacher data have been processed, that the error has fallen to or below a predetermined threshold, or that the error has converged.
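The loop over steps S101 to S104 with its termination conditions can be sketched in Python. To stay self-contained, the sketch replaces the neural network with a one-parameter model y = w * x and replaces backpropagation with the hand-derived gradient of the squared error; only the loop structure and the two termination checks (a step budget and an error threshold) are meant to mirror the text. All names and default values are assumptions of this example.

```python
def train(teacher_data, learning_rate=0.1, max_steps=1000, error_threshold=1e-6):
    """Fit y = w * x by per-sample gradient descent; return (w, steps_used).

    Training stops after a full pass in which every absolute error is at or
    below `error_threshold`, or once `max_steps` updates have been made
    (the step budget is checked between passes, not within one).
    """
    w = 0.0
    steps = 0
    while steps < max_steps:
        worst_error = 0.0
        for x, target in teacher_data:            # S101: take a teacher pair
            output = w * x                        # S102-S103: input and output
            error = output - target               # S104: compare with the target
            w -= learning_rate * 2 * error * x    # S104: gradient of error**2
            worst_error = max(worst_error, abs(error))
            steps += 1
        if worst_error <= error_threshold:        # termination: error small enough
            break
    return w, steps


# Hypothetical teacher data generated from y = 2 * x.
weight, steps_used = train([(1.0, 2.0), (2.0, 4.0)])
```

A convergence check (stop when the error no longer decreases between passes) could be added alongside the threshold test in the same place.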

Next, a form reading process according to an embodiment of the present invention is described with reference to FIG. 6. FIG. 6 is a flowchart showing the form reading process using a learned model according to an embodiment of the present invention. The form reading process is executed by the form reading apparatus 100, more specifically by the processor of the form reading apparatus 100.

As shown in FIG. 6, in step S201 the form reading apparatus 100 inputs the filled-out form to be processed into the learned model.

In step S202, the form reading apparatus 100 obtains the predetermined information from the learned model. Depending on the learned model, the information obtained may be all or part of what is written on the form. The form reading apparatus 100 sends the extracted information to the user terminal 50 or the like.

Although embodiments of the present invention have been described in detail above, the present invention is not limited to the specific embodiments described, and various modifications and changes are possible within the scope of the gist of the present invention as set forth in the claims.

10 Form reading system
50 User terminal
60 Learning device
70 Storage
100 Form reading apparatus

Claims

A form reading apparatus having an extraction unit that extracts predetermined information from a filled-out form using a learned model.

The form reading apparatus according to claim 1, wherein the form is an account transfer request form, and the predetermined information is information necessary for executing an account transfer.

The form reading apparatus according to claim 1, wherein the learned model is trained for each item of predetermined information to be extracted.

The form reading apparatus according to claim 1, wherein the learned model includes:
a first sub-model that extracts image features from an image of the form;
a second sub-model that extracts ruled line features from ruled lines preprinted on the form; and
a third sub-model that extracts the predetermined information from the image features and the ruled line features.

The form reading apparatus according to any one of claims 1 to 4, further comprising an interface unit that, on acquiring a filled-out form to be processed, passes the form to the extraction unit and obtains the predetermined information from the extraction unit.