JP2022013524A

JP2022013524A - Form information generation apparatus, method, and program

Info

Publication number: JP2022013524A
Application number: JP2020135157A
Authority: JP
Inventors: 信之坂入; Nobuyuki Sakairi; 竜也園部; Tatsuya Sonobe; 広樹上野; Hiroki Ueno; 寿紀岩城; Toshinori Iwaki; 慎一田野; Shinichi Tano; 相麟朴; Sang Rin Park
Original assignee: Arithmer Inc
Current assignee: Arithmer Inc
Priority date: 2020-06-30
Filing date: 2020-08-07
Publication date: 2022-01-18
Anticipated expiration: 2040-06-30
Also published as: JP7086361B2

Abstract

PROBLEM TO BE SOLVED: To provide a form information generation apparatus for generating new form information on the basis of past form information, a method, and a program.

SOLUTION: A form information generation apparatus 100 includes: a storage unit which stores past feature quantities extracted from past form images and past form information based on character information included in the past form images, in association with each other; an extraction unit which extracts a target feature quantity from a targeted form image; a calculation unit which calculates similarities between the target feature quantity and the past feature quantities; a selection unit which selects past form information on the basis of the similarities; a read unit which reads character information of a predetermined item from the targeted form image; and a generation unit which generates new form information associated with the targeted form image, from the past form information selected in the selection unit and the character information read in the read unit, to be displayed.

SELECTED DRAWING: Figure 3

Description

The present invention relates to a form information generation apparatus, a method, and a program.

Patent Documents 1 and 2 state: "The management device 1 stores, in the storage unit 11, image information representing an image of a paper medium in association with input information previously entered for that image, and identifies, from the image information stored in the storage unit 11, image information similar to the acquired image information ((3) in FIG. 1). The management device 1 identifies the input information associated with the identified image information, and outputs that input information, together with the character information extracted from the acquired image information, to the input screen transmitted to the user terminal 2 ((4) in FIG. 1)."
[Prior Art Documents]
[Patent Documents]
[Patent Document 1] Japanese Patent Application Laid-Open No. 2017-174199
[Patent Document 2] Japanese Patent Application Laid-Open No. 2017-174200

A first aspect of the present invention is a form information generation apparatus comprising: an acquisition unit that acquires past feature quantities extracted from past form images in association with past form information based on the character information written in the past form images; an extraction unit that extracts a target feature quantity from a target form image; a calculation unit that calculates similarities between the target feature quantity and the past feature quantities; a selection unit that selects past form information on the basis of the similarities; a reading unit that reads character information of predetermined items from the target form image; and a generation unit that generates and displays new form information associated with the target form image from the past form information selected by the selection unit and the character information read by the reading unit.

A second aspect of the present invention is an apparatus for generating billing information to be loaded into an accounting system, in which the billing information and the invoice image corresponding to that billing information each include first information and second information. The apparatus comprises: an acquisition unit that acquires past billing information in association with past invoice images; a first information acquisition unit that selects a past invoice image similar to a new invoice image by referring to the acquisition unit, and extracts the first information from the past billing information thus selected; a second information acquisition unit that reads the second information from the new invoice image with an optical character recognition (OCR) device; and a generation unit that generates new billing information from the first information extracted by the first information acquisition unit and the second information read by the second information acquisition unit.

A third aspect of the present invention is a method comprising the steps of: acquiring past feature quantities extracted from past form images in association with past form information based on the character information written in the past form images; extracting a target feature quantity from a target form image; calculating similarities between the target feature quantity and the past feature quantities; selecting past form information on the basis of the similarities; reading character information of predetermined items from the target form image; and generating and displaying new form information associated with the target form image from the past form information selected in the selecting step and the character information read in the reading step.

A fourth aspect of the present invention is a program that causes a computer to carry out the above method.

A fifth aspect of the present invention is a method for generating billing information to be loaded into an accounting system, in which the billing information and the invoice image corresponding to that billing information each include first information and second information. The method comprises the steps of: acquiring past billing information in association with past invoice images; selecting a past invoice image similar to a new invoice image by referring to the acquired past invoice images, and extracting the first information from the past billing information thus selected; reading the second information from the new invoice image with an optical character recognition (OCR) device; and generating new billing information from the first information and the second information.

A sixth aspect of the present invention is a program that causes a computer to carry out the above method.

Note that the above summary of the invention does not enumerate all of the necessary features of the present invention. Subcombinations of these feature groups may also constitute inventions.

A specific example of an invoice image 10, which is one example of a form image in the present embodiment.
A specific example of billing information 20, which is one example of the form information that the user generates from the invoice image 10 of FIG. 1.
A block diagram of the form information generation apparatus 100 according to the present embodiment.
A schematic view of the selection data 112.
A schematic view of the reading conditions 113.
A schematic view of the correspondence 114.
An example of the operation flow S10 of the form information generation apparatus 100.
An example of a new invoice image 12.
An example of new billing information 22 generated by the generation unit 160.
A block diagram of a learning apparatus 200 that generates the learned extraction unit 111 stored in the storage unit 110 of the form information generation apparatus 100.
A schematic view of a method of deep metric learning.
The accuracy of similar-image search by the form information generation apparatus 100 when deep metric learning is used and when learning by category classification is used.
A schematic view of another method of deep metric learning.
An example of a computer 1200 in which a plurality of aspects of the present invention may be embodied in whole or in part.

The present invention is described below through embodiments of the invention, but the following embodiments do not limit the invention as set out in the claims. Moreover, not all combinations of the features described in the embodiments are necessarily essential to the solution provided by the invention.

FIG. 1 shows a specific example of an invoice image 10, which is one example of a form image in the present embodiment. The invoice image 10 in FIG. 1 is a bill for electricity charges from "ABC Electric Power Company" addressed to "○○ Co., Ltd.", the user of the present embodiment. The invoice may arrive on paper and be converted into image data on the user side with a scanner or the like, or it may be sent from the outset as general-purpose image data such as PDF.

FIG. 2 shows a specific example of billing information 20, which is one example of the form information that the user generates from the invoice image 10 of FIG. 1. On receiving the invoice image 10, the user takes, from the character information written in it, the information that the user needs or is interested in, and uses that information in, for example, an accounting system. The billing information 20 is preferably in a format such as csv or xls so that it can be entered into the accounting system easily.

The billing information 20 is information used in the user's accounting system. With the accounting system, the user can book expenses according to the user's own rules and prepare financial statements and the like. In the example of FIG. 2, the billing information 20 includes the following items: billing item, expense category, department incurring the expense, amount, billing date, payment due date, and biller. In addition to these items, the billing information 20 may also be associated with the supplier's bank account information needed for remittance, a remarks field, and the like. Although each item is listed per department here, when there is no need to book the items as per-department expenses, the department incurring the expense is entered as, for example, "company-wide" or "General Affairs Department".

Comparing the invoice image 10 with the billing information 20, some character information written in the invoice image 10, such as the "discount", is not reflected in the billing information 20. Conversely, some character information, such as the "billed amount" of the invoice image 10 and the "amount" of the billing information 20, is reflected not as written but after conversion according to the user's own rules. Therefore, merely raising the accuracy of OCR does not necessarily make it easy to generate the billing information 20 that the user wants from the invoice image 10.

Furthermore, when electricity is billed at a flat rate under an annual contract, for example, the electricity charge information written on the invoice image 10 is the same as in the previous month. In such a case, it is preferable that the invoice of the previous month to which the current invoice corresponds be selected accurately.

In the present embodiment, therefore, billing information associated with a past feature quantity similar to the feature quantity of a new invoice image is selected. The embodiment further provides an apparatus that readily generates the billing information the user wants by using that past billing information together with the character information read from the new invoice image.

FIG. 3 is a block diagram of the form information generation apparatus 100 according to the present embodiment. One example of the form information generation apparatus 100 is a general-purpose computer on which a program is installed. The form information generation apparatus 100 is configured to communicate with the user's terminal device and accepts, for example, corrections to billing information from that terminal device.

The form information generation apparatus 100 includes a storage unit 110, an extraction unit 120 that extracts a feature quantity from an invoice image, a calculation unit 130 that calculates the similarity between the extracted feature quantity and the feature quantities stored in the storage unit 110, and a selection unit 140 that selects billing information stored in the storage unit 110 on the basis of the similarity. The form information generation apparatus 100 further includes a reading unit 150 that reads character information of predetermined items from the invoice image, and a generation unit 160 that generates and displays new billing information from the selected billing information and the character information that was read.

The storage unit 110 stores a learned extraction unit 111, selection data 112, reading conditions 113, and a correspondence 114. For example, the user stores this information in advance, obtaining it from another server or the like via the Internet or from a recording medium such as a USB memory. Because it acquires this information from outside and stores it, the storage unit 110 can be said to also serve as an acquisition unit.

The learned extraction unit 111 extracts a feature quantity from an invoice image by a method learned in advance; details are given later. Instead of storing the learned extraction unit 111 itself, the parameters and weights obtained by the prior learning may be stored so that the extraction unit 120 can use them.

FIG. 4 schematically shows the selection data 112. In the selection data 112, an invoice image 10, a feature quantity 30, and billing information 20 are associated with one another as one data set. A label 40 is associated with each invoice image 10, one label per billing company. This is not limiting, however: labels may instead be assigned per invoice format, so that different formats from the same biller receive different labels. Because the invoice images 10 and labels 40 are used mainly for the learning described later, they need not be included in the selection data 112.

The invoice images 10 and billing information 20 may be those that the user has used with the accounting system in the past. They form a small-volume, high-variety collection: for example, about 10,000 companies (that is, labels 40) with about 100 invoices per company (that is, per label 40). To simplify the explanation, the example in FIG. 4 shows two labels 40, "ABC Electric Power" and "DEF Newspaper Dealer", with three data sets for "ABC Electric Power" and one data set for "DEF Newspaper Dealer".

The feature quantities V0 and so on are extracted by the learned extraction unit 111 and represent the features of an invoice image 10 as a d-dimensional vector (d is, for example, about 1000). These feature quantities need not be stored in a form the user can view. In other words, the storage unit 110 stores at least the past feature quantities, which are feature quantities extracted from past invoice images, in association with the past billing information, which is based on the character information written in the past invoice images.
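As a rough illustration of how one data set of the selection data 112 might be held in memory, the following Python sketch pairs a feature vector with the billing rows derived from the same past invoice. The class and field names (`SelectionRecord`, `BillingRow`, `image_path`, and so on) and all sample values are hypothetical and chosen only for this example; the source does not specify a storage layout.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BillingRow:
    """One row of past billing information 20 (field names are hypothetical)."""
    billing_item: str
    expense_category: str
    department: str
    amount: int
    billing_date: str
    due_date: str
    biller: str


@dataclass
class SelectionRecord:
    """One data set of the selection data 112."""
    feature: List[float]              # past feature quantity 30 (d-dimensional vector)
    billing_info: List[BillingRow]    # past billing information 20
    label: Optional[str] = None       # label 40; optional, used mainly for training
    image_path: Optional[str] = None  # invoice image 10; optional for the same reason


# Hypothetical sample values, with a 3-dimensional vector for brevity.
example = SelectionRecord(
    feature=[0.12, -0.40, 0.88],
    billing_info=[
        BillingRow("Electricity", "Utilities", "Sales", 60000,
                   "2020/05/10", "2020/05/31", "ABC Electric Power"),
    ],
    label="ABC Electric Power",
)
```

The image and label fields are optional here because the text notes that they serve mainly for training and need not be part of the selection data.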

FIG. 5 schematically shows the reading conditions 113. The reading conditions 113 associate each item that the reading unit 150 reads from the invoice image 10 with the conditions for reading that item.

Here, a "condition" is a rule-based description of where in the invoice image 10, or how, the character information for the item can be read by OCR. In the example of FIG. 5, condition "1)" for the item "issue date" indicates that the issue date is obtained from the character string to the right of the character string "発行日" (issue date) in the invoice image 10. Condition "2)" for the item "issue date" indicates that the issue date is obtained from the character string below the character string "発行日". Condition "3)" for the item "issue date" indicates that the issue date is obtained from a character string in the upper-right region of the invoice image 10 that contains any of the characters "年" (year), "月" (month), or "日" (day). As a reading condition, instead of the exact character string "発行日", a character string containing, for example, two of the characters "発", "行", and "日" may be used. This makes it possible to read character strings such as "発行年月日" (date of issue) as the issue date.

Likewise, in the example of FIG. 5, condition "1)" for the item "billed amount" indicates that the billed amount is obtained from the character string to the right of the character string "請求金額" (billed amount) in the invoice image 10. Condition "2)" indicates that the billed amount is obtained from the character string below the character string "請求金額". Condition "3)" indicates that the billed amount is obtained from the character string to the right of the character string "合計" (total). Condition "4)" indicates that the billed amount is obtained from the character string below the character string "合計". As a reading condition, instead of the exact character string "請求金額", a character string containing, for example, three of the characters "請", "求", "金", and "額" may be treated as the "請求金額" label. This makes it possible to read character strings such as "請求額" as the billed amount.

Likewise, in the example of FIG. 5, condition "1)" for the item "payment due date" indicates that the payment due date is obtained from the character string to the right of the character string "支払期日" (payment due date) in the invoice image 10. Condition "2)" indicates that the payment due date is obtained from the character string below the character string "支払期日". As a reading condition, instead of the exact character string "支払期日", a character string containing, for example, two of the characters "支", "払", "期", "日", and "限" may be treated as the "支払期日" label. This makes it possible to read character strings such as "支払い期限" (payment deadline) and "請求期限" (billing deadline) as the payment due date.

When an item has multiple conditions, they are tried in ascending order of their numbers; if the item is not found under one condition, the next is tried. The reading conditions 113 are preferably set in advance by someone knowledgeable about OCR, but the user may be allowed to change them.
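The ordered, rule-based lookup described for the reading conditions 113 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the OCR output has already been reduced to text boxes on a coarse layout grid, and the names `TextBox`, `matches_keyword`, `find_neighbor`, and `read_item`, the grid coordinates, and the sample boxes are all hypothetical. Only the "right of" and "below" conditions are modeled; the region-based condition 3) for the issue date is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TextBox:
    text: str
    row: int  # hypothetical layout grid row (top to bottom)
    col: int  # hypothetical layout grid column (left to right)


def matches_keyword(text: str, chars: str, min_hits: int) -> bool:
    """True if at least min_hits of the distinct characters in chars occur in text."""
    return sum(1 for c in set(chars) if c in text) >= min_hits


def find_neighbor(boxes: List[TextBox], anchor: TextBox, direction: str) -> Optional[TextBox]:
    """Nearest box to the right of, or below, the anchor box on the grid."""
    if direction == "right":
        cands = [b for b in boxes if b.row == anchor.row and b.col > anchor.col]
        return min(cands, key=lambda b: b.col, default=None)
    if direction == "below":
        cands = [b for b in boxes if b.col == anchor.col and b.row > anchor.row]
        return min(cands, key=lambda b: b.row, default=None)
    raise ValueError(f"unknown direction: {direction}")


def read_item(boxes: List[TextBox],
              conditions: List[Tuple[str, int, str]]) -> Optional[str]:
    """Try the conditions in order and return the first value found, else None."""
    for chars, min_hits, direction in conditions:
        for box in boxes:
            if matches_keyword(box.text, chars, min_hits):
                hit = find_neighbor(boxes, box, direction)
                if hit is not None:
                    return hit.text
    return None


# (label characters, minimum matching characters, direction), in trial order.
ISSUE_DATE_CONDITIONS = [("発行日", 2, "right"), ("発行日", 2, "below")]
AMOUNT_CONDITIONS = [("請求金額", 3, "right"), ("請求金額", 3, "below"),
                     ("合計", 2, "right"), ("合計", 2, "below")]

boxes = [
    TextBox("発行年月日", row=0, col=2),
    TextBox("令和2年6月10日", row=1, col=2),
    TextBox("請求金額", row=3, col=0),
    TextBox(":￥110,000-", row=3, col=1),
]
issue_date = read_item(boxes, ISSUE_DATE_CONDITIONS)
billed_amount = read_item(boxes, AMOUNT_CONDITIONS)
```

In this sample, the label "発行年月日" has nothing to its right, so the first issue-date condition fails and the second ("below") supplies the value, which mirrors the fall-through order described above.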

FIG. 6 schematically shows the correspondence 114. The correspondence 114 associates each item read by the reading unit 150 with an item of the past billing information 20 in the selection data, and with the processing that the generation unit 160 performs for that item. The correspondence 114 may be made freely configurable so that the user can obtain the billing information 20 that the user wants.

In the example of FIG. 6, the read item "issue date" corresponds to the billing information item "billing date". The storage unit 110 stores information for performing the processing of overwriting the "billing date" of the billing information with the "issue date" that was read.

Also in the example of FIG. 6, the read item "billed amount" corresponds to the billing information item "amount". The storage unit 110 stores information for performing the processing of comparing the total of the "amount" values of the billing information with the "billed amount" that was read.

Also in the example of FIG. 6, the read item "payment due date" corresponds to the billing information item "payment due date". The storage unit 110 stores information for performing the processing of overwriting the "payment due date" of the billing information with the "payment due date" that was read.

In the correspondence 114, not every item that the reading conditions 113 specify for reading has to be associated with an item of the billing information 20. Likewise, not every item in the billing information 20 of the selection data 112 has to be associated with a read item.

Rather, when much of the information, such as the character information contained in the invoice image 10, does not change from one invoice to the next, it may be preferable to leave items relating to the unchanging information out of the correspondence 114. Leaving them out can, in some cases, prevent incorrect billing information from being generated from character information misread by the reading unit 150.
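One way to picture the correspondence 114 is as a small configuration table that maps each read item to a billing information item and an action. The sketch below is only an assumption about how such a table could be applied; the key names, the action names `"overwrite"` and `"compare_total"`, the function `apply_correspondence`, and the sample rows are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Read item -> target billing item and processing, in the spirit of FIG. 6.
CORRESPONDENCE: Dict[str, Dict[str, str]] = {
    "issue_date": {"target": "billing_date", "action": "overwrite"},
    "billed_amount": {"target": "amount", "action": "compare_total"},
    "due_date": {"target": "due_date", "action": "overwrite"},
}


def apply_correspondence(
    rows: List[Dict[str, Any]],
    read_values: Dict[str, Any],
    correspondence: Dict[str, Dict[str, str]],
) -> Tuple[List[Dict[str, Any]], List[str]]:
    """Return copies of the past rows with the mapped items applied, plus the
    names of read items whose comparison failed."""
    new_rows = [dict(r) for r in rows]
    mismatches: List[str] = []
    for item, rule in correspondence.items():
        if item not in read_values:
            continue  # items that were not read leave the past values untouched
        target, value = rule["target"], read_values[item]
        if rule["action"] == "overwrite":
            for r in new_rows:
                r[target] = value
        elif rule["action"] == "compare_total":
            if sum(r[target] for r in new_rows) != value:
                mismatches.append(item)
    return new_rows, mismatches


# Hypothetical past billing rows and read values.
past_rows = [
    {"department": "Sales", "amount": 60000,
     "billing_date": "2020/05/10", "due_date": "2020/05/31"},
    {"department": "Development", "amount": 50000,
     "billing_date": "2020/05/10", "due_date": "2020/05/31"},
]
new_rows, mismatches = apply_correspondence(
    past_rows, {"issue_date": "2020/06/10", "billed_amount": 110000}, CORRESPONDENCE)
```

Because the due date was not read in this sample, the past due date is carried over unchanged, which illustrates why unmapped or unread items cannot introduce OCR errors into the generated billing information.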

FIG. 7 shows an example of the operation flow S10 of the form information generation apparatus 100, and FIG. 8 shows an example of a new invoice image 12. The operation flow S10 starts when a new invoice image 12 is input. In the following description, new billing information is denoted by reference numeral 22 to distinguish it from the billing information 20 stored in the storage unit 110.

The extraction unit 120 loads the learned extraction unit 111 from the storage unit 110 and extracts the target feature quantity, that is, the feature quantity of the invoice image 12 (S100). Put differently, the extraction unit 120 extracts the target feature quantity by a method corresponding to the learned method used to extract the feature quantities 30 from the past invoice images 10.

The calculation unit 130 calculates the similarity between the target feature quantity and a feature quantity 30 of the selection data 112 stored in the storage unit 110 (S102). There are several ways to calculate the similarity, and the method need not be the same as the one used to train the learned extraction unit 111, described later.

The calculation unit 130 calculates the similarity to the target feature quantity for all of the feature quantities 30 in the selection data 112, or for a predetermined subset of them (S104). If at least one feature quantity 30 has a similarity at or above a threshold (S106: Yes), the flow proceeds to step S107. If no feature quantity 30 has a similarity at or above the threshold (S106: No), the flow proceeds to step S116. The threshold is set in advance and stored in the storage unit 110, but the user may be allowed to change it.

In step S107, the selection unit 140 selects billing information 20 from the selection data 112 on the basis of the similarities (S107). Here, the selection unit 140 selects the billing information 20 associated with a predetermined number of feature quantities 30, for example ten, taken in descending order of similarity (that is, the same number of pieces of billing information 20).
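Steps S102 to S107 can be sketched as follows. The text leaves the similarity measure open, so cosine similarity is used here purely as an assumption; the function names, the threshold value, and the toy two-dimensional vectors are likewise hypothetical. Following the literal description, the threshold decides only whether selection takes place at all, after which the top candidates are taken in descending order of similarity.

```python
import math
from typing import Any, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_candidates(
    target: Sequence[float],
    past: List[Tuple[Sequence[float], Any]],
    threshold: float,
    top_k: int,
) -> List[Tuple[float, Any]]:
    """Score every past feature (S102 to S104). Return [] when nothing reaches
    the threshold (S106: No); otherwise return the top_k entries in descending
    order of similarity (S107)."""
    scored = [(cosine_similarity(target, feature), info) for feature, info in past]
    if not any(score >= threshold for score, _ in scored):
        return []
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy data: the second element of each pair stands in for billing information 20.
past = [([1.0, 0.0], "A"), ([0.0, 1.0], "B"), ([1.0, 1.0], "C")]
selected = select_candidates([1.0, 0.0], past, threshold=0.9, top_k=2)
selected_infos = [info for _, info in selected]
```

With d on the order of 1000 and about a million stored vectors, a real system would likely use a vectorized or approximate nearest-neighbor search instead of this pure-Python loop.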

To simplify the explanation, the following describes an example in which the billing information 20 of FIG. 2 is selected.

In parallel with steps S100 to S107, the reading unit 150 reads character information from the invoice image 12 for the items covered by the reading conditions 113 stored in the storage unit 110 (S120). With the reading conditions 113 of FIG. 5, the issue date and the billed amount are read. If the invoice image 12 of FIG. 8 is read correctly, "令和２年６月１０日" (June 10, Reiwa 2) is read as the character information of the issue date, and "：￥１１０，０００－" is read as the character information of the billed amount.

For this, the reading unit 150 may use a known OCR engine for non-fixed-layout documents. For example, text regions may be extracted with CTPN, EAST, SegLink, TextBoxes++, PSENet, TextSnake, or the like, and the character information may be read from those regions with a CRNN (convolutional recurrent neural network) type model. Various networks can be used: VGG16, ResNet, and the like for the CNN part, and LSTM, GRU, seq2seq, attention mechanisms, and the like for the RNN part. Alternatively, after the text regions are extracted, the text may be split into individual characters, which are then read with a CNN-type model (VGG16, ResNet, or the like). The whole process from text region extraction to reading of the character information may also be performed end-to-end by a single network.

ステップＳ１０８において、生成部１６０は、記憶部１１０に記憶された対応関係１１４に基づいて、選択された請求情報２０の請求日を、読み取った発行日で上書きする。この場合に生成部１６０は、読み取った発行日の文字の情報をそのまま上書きしてもよいし、予め定められた変換規則に従って変換して上書きしてもよい。当該変換規則は使用者が対応関係１１４の処理として設定してもよいし、対応関係１１４とは別個に記憶部１１０に記憶されてもよい。本実施形態において、元号を西暦に変換した上で「YYYY/MM/DD」の形に変換する例で説明する。 In step S108, based on the correspondence 114 stored in the storage unit 110, the generation unit 160 overwrites the billing date of the selected billing information 20 with the issue date that was read. The generation unit 160 may write the read character information of the issue date as is, or may first convert it according to a predetermined conversion rule. The conversion rule may be set by the user as processing within the correspondence 114, or may be stored in the storage unit 110 separately from the correspondence 114. In this embodiment, the description uses an example in which the era name is converted to the Western (Gregorian) calendar year and the date is then formatted as "YYYY/MM/DD".
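
The conversion rule mentioned for step S108 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function name `era_to_slash_date`, the era table, and the regular expression are assumptions introduced here, and only three eras are covered.

```python
import re
import unicodedata

# Gregorian year immediately before year 1 of each era (assumed, partial table).
ERA_OFFSETS = {"令和": 2018, "平成": 1988, "昭和": 1925}

_DATE_RE = re.compile(r"(令和|平成|昭和)(元|\d+)年(\d+)月(\d+)日")


def era_to_slash_date(text: str) -> str:
    """Convert an era-based date string such as '令和２年６月１０日' to 'YYYY/MM/DD'."""
    # NFKC folds full-width digits such as '２' into ASCII '2'.
    normalized = unicodedata.normalize("NFKC", text)
    match = _DATE_RE.search(normalized)
    if match is None:
        raise ValueError(f"unrecognized date: {text!r}")
    era, year, month, day = match.groups()
    era_year = 1 if year == "元" else int(year)  # '元年' means year 1
    gregorian_year = ERA_OFFSETS[era] + era_year
    return f"{gregorian_year:04d}/{int(month):02d}/{int(day):02d}"


print(era_to_slash_date("令和２年６月１０日"))  # prints 2020/06/10
```

A production rule would also need to validate the month and day and to handle dates written without an era name.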

次に、生成部１６０は、記憶部１１０に記憶された対応関係１１４に基づいて、選択された請求情報２０の金額の合計と、読み取った請求金額とを比較する（Ｓ１１０）。図８の請求書画像１２及び図２の請求情報２０の例の場合、生成部１６０は請求情報２０の金額の欄を合計し、「１１００００」を得る。生成部１６０はこれと、請求書画像１２から読み取った請求金額の文字の情報「：￥１１０，０００－」のうちの数字部分「１１００００」とを比較し、一致すると判断する。 Next, the generation unit 160 compares the total amount of the selected billing information 20 with the read billing amount based on the correspondence 114 stored in the storage unit 110 (S110). In the case of the invoice image 12 of FIG. 8 and the example of the billing information 20 of FIG. 2, the generation unit 160 totals the columns of the amount of the billing information 20 to obtain "110,000". The generation unit 160 compares this with the numerical part "110,000" in the character information ": ¥ 110,000-" of the invoice amount read from the invoice image 12, and determines that they match.
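
The comparison in step S110 can be sketched as follows. This is only an illustration under assumptions: the helper names `parse_amount` and `totals_match` are hypothetical, and the line amounts in the usage line are made-up values that sum to 110000, since the contents of FIG. 2 are not reproduced here.

```python
import re
import unicodedata


def parse_amount(ocr_text: str) -> int:
    """Extract the numeric part of an OCR'd amount such as '：￥１１０，０００－'."""
    # NFKC turns full-width digits and punctuation into their ASCII forms.
    normalized = unicodedata.normalize("NFKC", ocr_text)
    digits = re.sub(r"\D", "", normalized)  # drop currency sign, commas, dashes
    if not digits:
        raise ValueError(f"no digits found in {ocr_text!r}")
    return int(digits)


def totals_match(line_amounts: list[int], ocr_text: str) -> bool:
    """Return True when the summed line amounts equal the amount read by OCR."""
    return sum(line_amounts) == parse_amount(ocr_text)


# Hypothetical line amounts; the real ones come from the selected billing information.
print(totals_match([100000, 10000], "：￥１１０，０００－"))  # prints True
```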

ステップＳ１１０の判断がＹｅｓの場合に、生成部１６０はステップＳ１０８で上書きされた請求情報を新たな請求情報として生成し、ディスプレイ等に表示する（Ｓ１１４）。 When the determination in step S110 is Yes, the generation unit 160 generates the billing information overwritten in step S108 as new billing information and displays it on a display or the like (S114).

図９は、生成部１６０が生成した新たな請求情報２２の一例を示す。請求情報２２において、図２の請求情報２０の文字の情報のうち、請求日及び支払期日が、読取部１５０で読み取られた発行日及び支払期日の情報で書き換えられている。一方、請求項目等、書き換えられなかった文字の情報について、選択された過去の請求情報２０の文字の情報が新たな請求情報２２に使用されている。 FIG. 9 shows an example of the new billing information 22 generated by the generation unit 160. In the billing information 22, the billing date and the payment due date among the character information of the billing information 20 of FIG. 2 are rewritten with the information of the issue date and the payment due date read by the reading unit 150. On the other hand, with respect to the non-rewritten character information such as the billing item, the character information of the selected past billing information 20 is used for the new billing information 22.

さらに、当該請求日が、請求項目等の書き換えられていない情報と区別可能に斜体で表現されている。区別可能な表現は、ボールド、赤字、点滅など、他の方法であってもよい。 Further, the billing date is shown in italics so that it can be distinguished from information that was not rewritten, such as the billing items. Other distinguishable representations, such as bold, red text, or blinking, may be used instead.

さらに生成部１６０は、読取部１５０で読み取った文字の情報の読み取りの確度に応じて、請求情報２２を表示してもよい。例えば、請求情報２２のうち、読取部１５０で読み取られた情報で書き換えたもののうち、確度が閾値より高いものと低いものとを区別可能に表示してもよい。区別可能な表示は、斜体、ボールド、赤字、点滅などであってよく、確度が低い方が目に使用者の目に付きやすい表現であることが好ましい。 Further, the generation unit 160 may display the billing information 22 according to the reading confidence of the character information read by the reading unit 150. For example, among the entries of the billing information 22 that were rewritten with information read by the reading unit 150, those whose confidence is above a threshold and those whose confidence is below it may be displayed distinguishably. The distinguishable display may be italics, bold, red text, blinking, or the like, and the lower-confidence entries preferably use the representation that is more noticeable to the user.
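
One way to picture the confidence-dependent display is the small Python sketch below. The `Field` class, the style names, and the threshold of 0.9 are all assumptions chosen for illustration; the description only requires that the two groups be distinguishable and that low-confidence entries stand out more.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """One entry of the generated billing information (hypothetical structure)."""
    name: str
    value: str
    rewritten: bool           # True if overwritten with a value read by OCR
    confidence: float = 1.0   # OCR confidence, meaningful only when rewritten


def display_style(field: Field, threshold: float = 0.9) -> str:
    """Pick a display style so that low-confidence rewritten values stand out most."""
    if not field.rewritten:
        return "normal"       # carried over from the past billing information
    if field.confidence >= threshold:
        return "italic"       # rewritten, read with high confidence
    return "bold-red"         # rewritten, low confidence: most noticeable


fields = [
    Field("billing_date", "2020/06/10", rewritten=True, confidence=0.98),
    Field("payment_due", "2020/07/31", rewritten=True, confidence=0.62),
    Field("item", "consulting fee", rewritten=False),
]
for f in fields:
    print(f.name, display_style(f))
```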

上記ステップＳ１１０の判断がＮｏの場合に、生成部１６０は新たに生成する請求情報２２に警告を追加する（Ｓ１１２）。警告の例は、請求情報２２の金額の欄を斜体、ボールド、赤字、点滅にするなどである。上記の書き換えられた情報に用いる表現とは異なっていることが好ましい。これに加えて又は代えて、請求情報２２の欄外に「請求金額が異なっています」という文字情報を表示してもよい。 If the determination in step S110 is No, the generation unit 160 adds a warning to the newly generated billing information 22 (S112). Examples of the warning include showing the amount column of the billing information 22 in italics, bold, red text, or blinking. The representation preferably differs from the one used for the rewritten information described above. In addition or instead, the message "The billing amounts do not match" may be displayed in the margin of the billing information 22.

上記ステップ１０６の判断がＮｏの場合に、選択部１４０は過去に類似した請求書画像がない旨を警告する（Ｓ１１６）。例えば、ディスプレイ等に「類似度の高い請求書が見つかりませんでした」という文字情報を表示してもよい。 If the determination in step S106 is No, the selection unit 140 warns that no similar past invoice image exists (S116). For example, the message "No invoice with high similarity was found" may be displayed on a display or the like.

以上により、動作フローＳ１０が終了する。その後、帳票情報生成装置１００は、表示した請求情報２２について、使用者の修正や追加などを受け付ける。ここでは、ステップＳ１０７で類似度の大きい順に予め定められた個数（例えば１０個）の請求情報２０が選択されているので、新たな請求情報２２は上記個数（例えば１０個）と同じだけ表示されることになる。使用者は、複数の請求情報２２から最も好ましい請求情報を選択した上で、修正や追加をする。なお、ステップＳ１１６で警告していた場合には、全く新規の請求情報２２の入力を受け付ける構成としてもよい。 This completes the operation flow S10. The form information generation device 100 then accepts corrections and additions from the user to the displayed billing information 22. Because a predetermined number (for example, 10) of pieces of billing information 20 are selected in descending order of similarity in step S107, the same number (for example, 10) of pieces of new billing information 22 are displayed. The user selects the most suitable one from the plurality of pieces of billing information 22 and then corrects or adds to it. If a warning was issued in step S116, the device may instead accept input of entirely new billing information 22.

なお、上記ステップＳ１１０及びＳ１１２、及び／又は、ステップＳ１１６は必ずしも必要な処理ではなく、適宜省略してもよいものである。 It should be noted that the steps S110 and S112 and / or step S116 are not necessarily necessary processes and may be omitted as appropriate.

帳票情報生成装置１００は、使用者から請求情報２２を確定する旨の入力があった場合に、確定した情報としてディスプレイ等に表示する。ディスプレイ等に表示することに代えて又はこれに加えて、会計処理システムにエクスポートしたり、データとして外部に出力等したりしてもよい。さらに、帳票情報生成装置１００は、請求書画像１２及び／又はその特徴量と請求情報２２とを対応付けて、新たなデータセットとして記憶部１１０の選択用データ１１２に追加する。すなわち、次回から類似画像として選択される候補となる。 When the user inputs that the billing information 22 is confirmed, the form information generation device 100 displays it on a display or the like as the confirmed information. Instead of or in addition to displaying it on a display or the like, it may be exported to an accounting processing system or output as data to the outside. Further, the form information generation device 100 associates the invoice image 12 and / or its feature amount with the billing information 22 and adds it to the selection data 112 of the storage unit 110 as a new data set. That is, it becomes a candidate to be selected as a similar image from the next time.

図１０は、帳票情報生成装置１００の記憶部１１０に記憶される学習済み抽出部１１１を生成する学習装置２００のブロック図である。学習装置２００の一例は、汎用コンピュータにプログラムがインストールされたものである。 FIG. 10 is a block diagram of the learning device 200 that generates the learned extraction unit 111 stored in the storage unit 110 of the form information generation device 100. An example of the learning device 200 is a general-purpose computer on which a program is installed.

学習装置２００は、記憶部２１０と、請求書画像１０から特徴量３０を抽出する抽出部２２０と、抽出した特徴量３０同士の類似度を算出する類似度算出部２３０と、類似度に基づいて損失関数の値を算出してそれを最小化する損失関数算出部２４０とを備える。本実施形態において、学習装置２００は、抽出部２２０としてＣＮＮを用いて、深層距離学習を行う。なお、抽出部２２０は、CNNに限らず、全結合、RNN、self-attentionなどのニューラルネットワークを用いて特徴量を抽出するものでもよい。さらに、抽出部２２０は、ニューラルネットワークではなく、SIFT、HOGなどの局所特徴量を抽出するものでもよい。 The learning device 200 includes a storage unit 210, an extraction unit 220 that extracts the feature amount 30 from the invoice image 10, a similarity calculation unit 230 that calculates the similarity between extracted feature amounts 30, and a loss function calculation unit 240 that calculates the value of a loss function based on the similarity and minimizes it. In this embodiment, the learning device 200 performs deep distance learning using a CNN as the extraction unit 220. The extraction unit 220 is not limited to a CNN and may extract feature amounts using another neural network such as a fully connected network, an RNN, or self-attention. The extraction unit 220 may also extract local features such as SIFT or HOG instead of using a neural network.

記憶部１１０は、学習用データ２１１と、学習用パラメータ２１２と、学習済み抽出部１１１とを記憶する。なお、学習前では学習済み抽出部１１１は記録されていなくてよい。 The storage unit 210 stores the learning data 211, the learning parameters 212, and the learned extraction unit 111. Before learning, the learned extraction unit 111 need not yet be stored.

学習用データ２１１は、少なくとも、ラベル４０と、請求書画像１０と請求情報２０とが対応付けられたデータセットとを含む。ラベル４０と、請求書画像１０と請求情報２０とが対応付けられたデータセットは学習を実行する学習者によって、例えば、インターネットを介して他のサーバ等から又はＵＳＢメモリのような記録媒体により記憶部２１０に記憶される。 The learning data 211 includes at least a label 40 and a data set in which the invoice image 10 and the billing information 20 are associated with each other. The data set in which the label 40, the invoice image 10 and the billing information 20 are associated with each other is stored by the learner performing the learning, for example, from another server or the like via the Internet or by a recording medium such as a USB memory. It is stored in the unit 210.

学習用データ２１１はさらに、学習の結果として請求情報２０に対応付けられた特徴量３０も含む。学習用データ２１１のデータセットは、帳票情報生成装置１００で使われる選択用データ１１２のデータセットと同一であるか、少なくとも一部は、好ましくは大部分が重複していることが好ましい。説明の簡略化のため、本実施形態では学習用データ２１１のデータセットと選択用データ１１２のデータセットとが同一であるとする。 The learning data 211 also includes a feature amount 30 associated with the billing information 20 as a result of learning. It is preferable that the data set of the training data 211 is the same as the data set of the selection data 112 used in the form information generation device 100, or at least a part thereof is preferably largely overlapped. For the sake of simplification of the description, in this embodiment, it is assumed that the data set of the learning data 211 and the data set of the selection data 112 are the same.

学習用パラメータ２１２は、学習の方法に応じて当該学習に用いるパラメータの値を格納する。抽出部２２０としてＣＮＮを用いて、深層距離学習を行う場合には、抽出部２２０のパラメータとして、ネットワークの階層数、各層におけるノードの数、重みの初期値などが含まれる。また、深層距離学習のパラメータとして、学習条件としての学習率及び学習回数、並びに、損失関数等が含まれる。当該パラメータの値は、学習者により設定され、学習中に適宜変更されてよい。 The learning parameter 212 stores the value of the parameter used for the learning according to the learning method. When CNN is used as the extraction unit 220 for deep distance learning, the parameters of the extraction unit 220 include the number of network layers, the number of nodes in each layer, the initial value of the weight, and the like. Further, the parameters of deep distance learning include the learning rate and the number of learnings as learning conditions, a loss function, and the like. The value of the parameter is set by the learner and may be changed as appropriate during learning.

図１１は、深層距離学習の方法を模式的に示す。まず、データセットのペアを用意する。 FIG. 11 schematically shows a method of deep distance learning. First, prepare a pair of datasets.

ペアの請求書画像Ｇ１，Ｇ２のそれぞれを抽出部２２０に入力して、それぞれの特徴量Ｃ１、Ｃ２を計算する。特徴量Ｃ１，Ｃ２を類似度算出部２３０に入力してそれらの間の類似度を計算する。類似度は、コサイン類似度や特徴量空間のユークリッド距離などの距離関数の逆数等、既知の方法のいずれでもよいが、損失関数に対応したものが用いられる。 Each of the paired invoice images G1 and G2 is input to the extraction unit 220 to calculate the respective feature amounts C1 and C2. The feature amounts C1 and C2 are input to the similarity calculation unit 230 to calculate the similarity between them. The similarity may be computed by any known method, such as cosine similarity or the reciprocal of a distance function such as the Euclidean distance in the feature space, but one that matches the loss function is used.
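
The two similarity measures named above can be written out in a few lines of plain Python. This is a sketch for illustration only; the function names and the small constant `eps`, added to avoid division by zero for identical vectors, are assumptions.

```python
import math


def cosine_similarity(c1: list[float], c2: list[float]) -> float:
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(c1, c2))
    norm1 = math.sqrt(sum(x * x for x in c1))
    norm2 = math.sqrt(sum(y * y for y in c2))
    return dot / (norm1 * norm2)


def inverse_euclidean_similarity(c1: list[float], c2: list[float],
                                 eps: float = 1e-8) -> float:
    """Reciprocal of the Euclidean distance; larger means more similar."""
    return 1.0 / (math.dist(c1, c2) + eps)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))             # orthogonal vectors
print(inverse_euclidean_similarity([0.0, 0.0], [3.0, 4.0]))  # distance is 5
```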

当該類似度、及び、ペアの請求書画像のそれぞれのラベルＬ１，Ｌ２を損失関数算出部２４０に入力して損出関数の値を計算し、当該値が小さくなるように誤差逆伝搬により抽出部２２０の重み等を更新する。損失関数は、ラベルＬ１とラベルＬ２とが同じであれば類似度が大きいほど（すなわち特徴量のベクトル間の距離が近いほど）値が小さく、かつ、ラベルＬ１とラベルＬ２とが異なっていれば類似度が小さいほど（すなわち特徴量のベクトル間の距離が遠いほど）値が小さくなる関数である。そのような損失関数として、Contrastive Loss、ArcFace、CosFace、SphereFaceなどが用いられてよい。 The similarity and the labels L1 and L2 of the paired invoice images are input to the loss function calculation unit 240 to calculate the value of the loss function, and the weights and other parameters of the extraction unit 220 are updated by error backpropagation so that the value decreases. The loss function takes a smaller value as the similarity increases (that is, as the distance between the feature vectors decreases) when the labels L1 and L2 are the same, and a smaller value as the similarity decreases (that is, as the distance between the feature vectors increases) when the labels L1 and L2 differ. Contrastive Loss, ArcFace, CosFace, SphereFace, and the like may be used as such a loss function.
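
As an illustration of a loss with this behaviour, the sketch below computes one common form of the contrastive loss for a single pair, using the Euclidean distance between the feature vectors. The margin value and the function name are assumptions, and the gradient step by backpropagation is left out; in practice a deep learning framework would handle it.

```python
import math


def contrastive_loss(c1: list[float], c2: list[float],
                     same_label: bool, margin: float = 1.0) -> float:
    """Contrastive loss for one pair of feature vectors.

    Same label: the loss grows with the distance, so close pairs are rewarded.
    Different labels: the loss is positive only while the pair is closer than
    `margin`, so distant pairs are rewarded.
    """
    distance = math.dist(c1, c2)
    if same_label:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


print(contrastive_loss([0.0, 0.0], [0.3, 0.4], same_label=True))
print(contrastive_loss([0.0, 0.0], [0.3, 0.4], same_label=False))
```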

上記の更新を、予め設定した学習回数又はEarly Stoppingに基づいて繰り返し行う。上記更新は、バッチ、オンライン、ミニバッチのいずれの更新タイミングで行われてもよい。 The above update is repeated for a preset number of training iterations or until Early Stopping is triggered. The update may be performed with batch, online, or mini-batch timing.

なお、請求書画像１０のように多ラベルで少数データの場合には、ラベルが同じペアと異なるペアとが予め定められた割合、例えば、等しい割合で学習に使われるようにサンプリングすることが好ましい。これにより、ランダムでサンプリングした場合に生じる、同じラベルを持つペアが使われる確率が小さくて学習が進まない、という問題を回避することができる。 When there are many labels and few samples per label, as with the invoice images 10, it is preferable to sample so that pairs with the same label and pairs with different labels are used for learning at a predetermined ratio, for example in equal proportion. This avoids the problem that arises with random sampling, where pairs with the same label are rarely drawn and learning does not progress.

学習装置２００は、上記の通り学習した結果得られた抽出部２２０を学習済み抽出部１１１として記憶部２１０に記憶する。抽出部２２０そのものを記憶することに代えて、抽出部２２０と同等の演算が再現できる、当該抽出部２２０で用いられている重み等のパラメータが学習済み抽出部１１１に記憶されてもよい。学習済み抽出部１１１そのもの又はそれに用いられるパラメータを、帳票情報生成装置１００の抽出部１２０に読み込むことにより、上記で学習した結果得られた抽出部２２０と同等の演算で特徴量３０を抽出することができる。 The learning device 200 stores the extraction unit 220 obtained by the learning described above in the storage unit 210 as the learned extraction unit 111. Instead of storing the extraction unit 220 itself, parameters such as the weights used in the extraction unit 220, from which an operation equivalent to that of the extraction unit 220 can be reproduced, may be stored as the learned extraction unit 111. By loading the learned extraction unit 111 itself or its parameters into the extraction unit 120 of the form information generation device 100, the feature amount 30 can be extracted by an operation equivalent to that of the extraction unit 220 obtained by the learning.

上記の通り、過去の請求書画像１０の特徴量３０、及び、帳票情報生成装置１００の抽出部１２０は、機械学習で得られたものに対応する。これにより、精度の高い類似画像検索を行うことができる。ここで、「対応する」とは同一であってもよいし、パラメータの読み込み等により再現されたものであってもよく、確率等の他の要因の範囲内で等価な演算結果が得られるものであることを含む。 As described above, the feature amounts 30 of the past invoice images 10 and the extraction unit 120 of the form information generation device 100 correspond to those obtained by machine learning. This enables highly accurate similar-image search. Here, "correspond" covers being identical, being reproduced by loading parameters or the like, and yielding equivalent calculation results within the range attributable to other factors such as probability.

図１２は、深層距離学習を用いた場合とカテゴリ分類による学習を用いた場合とで、帳票情報生成装置１００で類似画像検索をしたときの正解率を示す。学習用データとして、１００ラベルで、各ラベル１～１００枚の概ね１０００枚の画像について、３０～４０ラベルの３グループに分割してグループ交差検証をした結果である。図１２において、横軸が検索候補数を示し、縦軸がｔｏｐ－ｋ正解率（検索候補の中に少なくとも一つ同ラベルの請求書が含まれた割合）を示している。 FIG. 12 shows the accuracy of similar-image search by the form information generation device 100 when deep distance learning is used and when learning by category classification is used. The learning data consisted of roughly 1000 images with 100 labels and 1 to 100 images per label; the results were obtained by group cross-validation with the labels divided into three groups of 30 to 40 labels each. In FIG. 12, the horizontal axis shows the number of search candidates, and the vertical axis shows the top-k accuracy (the proportion of queries for which the search candidates included at least one invoice with the same label).
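
The top-k accuracy on the vertical axis of FIG. 12 can be computed as in the following sketch. The function name and the toy search results are assumptions; they are not the data behind FIG. 12.

```python
def top_k_accuracy(ranked_labels: list[list[str]],
                   true_labels: list[str], k: int) -> float:
    """Share of queries whose first k search candidates contain the true label.

    ranked_labels[i] holds the labels of the candidates returned for query i,
    ordered from most to least similar.
    """
    hits = sum(
        1 for ranked, truth in zip(ranked_labels, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Toy search results for three query invoices whose true label is "A".
ranked = [["A", "B", "C"], ["B", "A", "C"], ["C", "B", "A"]]
truth = ["A", "A", "A"]
for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truth, k))
```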

図１２に示すように、機械学習として深層距離学習を用いると、請求書画像１０のような、多種類かつ少数の検索対象について、カテゴリ分類による学習よりも精度を向上させることができる。 As shown in FIG. 12, using deep distance learning as the machine learning improves accuracy over learning by category classification for search targets that, like the invoice images 10, have many classes and few samples per class.

図１３は、深層距離学習の他の方法を模式的に示す。図１３の方法では、データセットのペアを用意する代わりに、個々のデータセットに対する特徴量と、各ラベルに対応する代表特徴量（行列）との類似度を算出する。そして、当該類似度、及び、請求書画像のラベルＬ１を損失関数算出部２４０に入力して損出関数の値を計算し、当該値が小さくなるように誤差逆伝搬により抽出部２２０及び各ラベルの代表特徴量の値等を更新する。 FIG. 13 schematically shows another method of deep distance learning. In the method of FIG. 13, instead of preparing pairs of data sets, the similarity between the feature amount of each individual data set and a representative feature amount (matrix) corresponding to each label is calculated. The similarity and the label L1 of the invoice image are then input to the loss function calculation unit 240 to calculate the value of the loss function, and the extraction unit 220 and the representative feature amount values of each label are updated by error backpropagation so that the value decreases.

補足すると、「代表特徴量」は、予め設定された複数のラベルに対応する特徴量を示すものである。詳しくは、個々のデータセットに対する特徴量がｄ次元で表され、ラベルがＮ個あるとすると、代表特徴量はＮ×ｄ次元の行列で表される。そのため、図１３に示す例では、類似度算出部２３０で、特徴量Ｃ１に対して、各ラベルに対応してＮ個の類似度が算出される。そして、損失関数算出部２４０では、それらのＮ個の類似度に対し、ラベルＬ１に対応する類似度が大きいほど値が小さく、かつ、ラベルＬ１以外に対応する類似度が小さいほど値が小さくなるように出力する。なお、代表特徴量の初期値は乱数で決定される。 To supplement, the "representative feature amount" holds feature amounts corresponding to a plurality of preset labels. Specifically, if the feature amount of each data set is d-dimensional and there are N labels, the representative feature amount is an N×d matrix. Therefore, in the example of FIG. 13, the similarity calculation unit 230 calculates N similarities for the feature amount C1, one per label. The loss function calculation unit 240 then outputs, for these N similarities, a value that is smaller the larger the similarity corresponding to the label L1 is, and smaller the smaller the similarities corresponding to labels other than L1 are. The initial values of the representative feature amount are set to random numbers.

換言すると、上述した抽出部２２０は、一の請求書画像Ｇ１の第１ラベル（Ｌ１）と、当該請求書画像Ｇ１から得られる特徴量Ｃ１及び予め設定された複数の第２ラベル（Ｌ１～ＬＮ）に対応する特徴量を含む代表特徴量の類似度との入力に応じて、第１ラベル（Ｌ１）と第２ラベル（Ｌ１～ＬＮ）とが同じラベル（Ｌ１）のときには前記類似度が大きいほど値が小さく、第１ラベルと第２ラベルとが異なるラベル（Ｌ２～ＬＮ）のときには類似度が小さいほど値が小さくなる損失関数を用いた学習で得られたものに対応する。 In other words, the extraction unit 220 described above corresponds to one obtained by learning with a loss function that receives, as inputs, the first label (L1) of one invoice image G1 and the similarities between the feature amount C1 obtained from that invoice image G1 and a representative feature amount containing feature amounts corresponding to a plurality of preset second labels (L1 to LN). The value of this loss function is smaller the larger the similarity when the first label (L1) and the second label are the same label (L1), and smaller the smaller the similarity when the first label and the second label are different labels (L2 to LN).
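
The representative-feature variant can be sketched in plain Python as follows. Note the substitution: a scaled softmax cross-entropy over cosine similarities is used here as a simple stand-in with the required monotonic behaviour, without the angular margins that ArcFace or CosFace add. The names `proxy_similarities` and `proxy_loss` and the scale value are assumptions, and the backpropagation update of the extraction unit and of the N×d matrix is omitted.

```python
import math


def _normalize(vector: list[float]) -> list[float]:
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def proxy_similarities(feature: list[float],
                       representatives: list[list[float]]) -> list[float]:
    """Cosine similarity between one d-dim feature and each row of the N x d matrix."""
    f = _normalize(feature)
    return [sum(a * b for a, b in zip(f, _normalize(row))) for row in representatives]


def proxy_loss(feature: list[float], representatives: list[list[float]],
               label_index: int, scale: float = 10.0) -> float:
    """Softmax cross-entropy over the N similarities (assumed stand-in loss).

    The value falls as the similarity to the true label's row rises and as the
    similarities to the other rows fall.
    """
    logits = [scale * s for s in proxy_similarities(feature, representatives)]
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label_index]


# N = 3 labels, d = 2 dimensions; real initial values would be random.
representatives = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
feature = [0.9, 0.1]
print(proxy_loss(feature, representatives, label_index=0))
print(proxy_loss(feature, representatives, label_index=1))
```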

以上、本実施形態によれば、対象とする請求書画像の特徴量に対して、これと類似する過去の特徴量に対応付けられた請求情報と、対象とする請求書画像から読み取った文字の情報とを用いる。これにより、新たな請求書画像から、使用者の望む請求情報を容易に生成することができる。 As described above, according to the present embodiment, with respect to the feature amount of the target invoice image, the billing information associated with the past feature amount similar to the feature amount and the characters read from the target invoice image. Use information. Thereby, the billing information desired by the user can be easily generated from the new invoice image.

詳しくは、本実施形態に係る帳票情報生成装置１００を用いれば、請求情報２２として、「項目」「費目」「費用発生部署」「金額」「請求日」「支払期日」「請求元」が少なくとも関連付けられて記憶されている場合、「項目」「費目」「費用発生部署」「請求元」を過去の請求書画像に基づいて抽出し、「請求日」「支払期限」をＯＣＲで読み取ることで、会計処理システムに必要な請求情報２２を使用者が容易に生成することができる。特に、「金額」以外の項目があまり変動しない場合に、本実施形態に係る帳票情報生成装置１００を用いることで、会計処理システムに必要な請求情報を生成するのに要する使用者の労力を大幅に低減することができる。 Specifically, with the form information generation device 100 according to this embodiment, when at least "item", "expense category", "cost-incurring department", "amount", "billing date", "payment due date", and "billing source" are stored in association as the billing information 22, the "item", "expense category", "cost-incurring department", and "billing source" are obtained based on past invoice images and the "billing date" and "payment due date" are read by OCR, so the user can easily generate the billing information 22 required by the accounting system. In particular, when items other than the "amount" change little, using the form information generation device 100 according to this embodiment greatly reduces the user's effort in generating the billing information required by the accounting system.

本実施形態の変形例として下記のものが考えられる。まず、帳票として、請求書以外に、注文書、見積書、領収書、保険証券等が考えられる。 The following can be considered as a modification of this embodiment. First, in addition to invoices, purchase orders, quotations, receipts, insurance policies, etc. can be considered as forms.

また、読取条件１１３がルールベースであることに代えて、機械学習によって予め定められた項目の文字の情報が読み取られてもよい。また、機械学習として深層距離学習に代えて、少し精度は劣るが、カテゴリ分類による学習、自己教師あり学習、Imagenetなどを学習したモデルの転移学習などを用いてもよい。 Further, instead of the reading condition 113 being rule-based, the character information of predetermined items may be read by machine learning. Also, in place of deep distance learning, although accuracy is somewhat lower, learning by category classification, self-supervised learning, transfer learning from a model trained on ImageNet or the like, and so on may be used.

また、帳票情報生成装置１００と学習装置２００とはサーバと通信可能であってよい。その場合に、当該サーバに、帳票情報生成装置１００の記憶部１１０に記憶されていると説明されていたデータの一部又は全部、及び／又は、学習装置２００の記憶部２１０に記憶されていると説明されていたデータの一部又は全部が記憶されていてもよい。この場合には、帳票情報生成装置１００が当該サーバと通信する送受信部が各種情報を取得する取得部の機能を有することになる。これに代えて、帳票情報生成装置１００と学習装置２００とが通信可能で、いずれか一方が上記サーバの機能を有していてもよい。この場合も、帳票情報生成装置１００が当該学習装置２００と通信する送受信部が各種情報を取得する取得部の機能を有することになる。 Further, the form information generation device 100 and the learning device 200 may be able to communicate with the server. In that case, a part or all of the data described as being stored in the storage unit 110 of the form information generation device 100 and / or stored in the storage unit 210 of the learning device 200 in the server. A part or all of the data described as may be stored. In this case, the transmission / reception unit in which the form information generation device 100 communicates with the server has the function of the acquisition unit for acquiring various information. Instead of this, the form information generation device 100 and the learning device 200 may be able to communicate with each other, and one of them may have the function of the server. Also in this case, the transmission / reception unit in which the form information generation device 100 communicates with the learning device 200 has the function of the acquisition unit for acquiring various information.

本発明の様々な実施形態は、フローチャートおよびブロック図を参照して記載されてよく、ここにおいてブロックは、（１）操作が実行されるプロセスの段階または（２）操作を実行する役割を持つ装置のセクションを表わしてよい。特定の段階およびセクションが、専用回路、コンピュータ可読媒体上に格納されるコンピュータ可読命令と共に供給されるプログラマブル回路、および／またはコンピュータ可読媒体上に格納されるコンピュータ可読命令と共に供給されるプロセッサによって実装されてよい。専用回路は、デジタルおよび／またはアナログハードウェア回路を含んでよく、集積回路（ＩＣ）および／またはディスクリート回路を含んでよい。プログラマブル回路は、論理ＡＮＤ、論理ＯＲ、論理ＸＯＲ、論理ＮＡＮＤ、論理ＮＯＲ、および他の論理操作、フリップフロップ、レジスタ、フィールドプログラマブルゲートアレイ（ＦＰＧＡ）、プログラマブルロジックアレイ（ＰＬＡ）等のようなメモリ要素等を含む、再構成可能なハードウェア回路を含んでよい。 Various embodiments of the present invention may be described with reference to flowcharts and block diagrams, wherein the block is (1) a stage of the process in which the operation is performed or (2) a device having a role of performing the operation. May represent a section of. Specific stages and sections are implemented by dedicated circuits, programmable circuits supplied with computer-readable instructions stored on computer-readable media, and / or processors supplied with computer-readable instructions stored on computer-readable media. It's okay. Dedicated circuits may include digital and / or analog hardware circuits, and may include integrated circuits (ICs) and / or discrete circuits. Programmable circuits are memory elements such as logical AND, logical OR, logical XOR, logical NAND, logical NOR, and other logical operations, flip flops, registers, field programmable gate arrays (FPGAs), programmable logic arrays (PLA), etc. May include reconfigurable hardware circuits, including, etc.

コンピュータ可読媒体は、適切なデバイスによって実行される命令を格納可能な任意の有形なデバイスを含んでよく、その結果、そこに格納される命令を有するコンピュータ可読媒体は、フローチャートまたはブロック図で指定された操作を実行するための手段を作成すべく実行され得る命令を含む、製品を備えることになる。コンピュータ可読媒体の例としては、電子記憶媒体、磁気記憶媒体、光記憶媒体、電磁記憶媒体、半導体記憶媒体等が含まれてよい。コンピュータ可読媒体のより具体的な例としては、フロッピー（登録商標）ディスク、ディスケット、ハードディスク、ランダムアクセスメモリ（ＲＡＭ）、リードオンリメモリ（ＲＯＭ）、消去可能プログラマブルリードオンリメモリ（ＥＰＲＯＭまたはフラッシュメモリ）、電気的消去可能プログラマブルリードオンリメモリ（ＥＥＰＲＯＭ）、静的ランダムアクセスメモリ（ＳＲＡＭ）、コンパクトディスクリードオンリメモリ（ＣＤ-ＲＯＭ）、デジタル多用途ディスク（ＤＶＤ）、ブルーレイ（ＲＴＭ）ディスク、メモリスティック、集積回路カード等が含まれてよい。 The computer readable medium may include any tangible device capable of storing instructions executed by the appropriate device, so that the computer readable medium having the instructions stored therein is specified in a flow chart or block diagram. It will be equipped with a product that contains instructions that can be executed to create means for performing the operation. Examples of computer-readable media may include electronic storage media, magnetic storage media, optical storage media, electromagnetic storage media, semiconductor storage media, and the like. More specific examples of computer readable media include floppy (registered trademark) disks, diskettes, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), Electrically erasable programmable read-only memory (EEPROM), static random access memory (SRAM), compact disk read-only memory (CD-ROM), digital versatile disk (DVD), Blu-ray (RTM) disk, memory stick, integrated A circuit card or the like may be included.

コンピュータ可読命令は、アセンブラ命令、命令セットアーキテクチャ（ＩＳＡ）命令、マシン命令、マシン依存命令、マイクロコード、ファームウェア命令、状態設定データ、またはＳｍａｌｌｔａｌｋ、ＪＡＶＡ（登録商標）、Ｃ＋＋等のようなオブジェクト指向プログラミング言語、および「Ｃ」プログラミング言語または同様のプログラミング言語のような従来の手続型プログラミング言語を含む、１または複数のプログラミング言語の任意の組み合わせで記述されたソースコードまたはオブジェクトコードのいずれかを含んでよい。 Computer-readable instructions are assembler instructions, instruction set architecture (ISA) instructions, machine instructions, machine-dependent instructions, microcode, firmware instructions, state-setting data, or object-oriented programming such as Smalltalk, JAVA®, C ++, etc. Includes either source code or object code written in any combination of one or more programming languages, including languages, and traditional procedural programming languages such as the "C" programming language or similar programming languages. good.

コンピュータ可読命令は、汎用コンピュータ、特殊目的のコンピュータ、若しくは他のプログラム可能なデータ処理装置のプロセッサまたはプログラマブル回路に対し、ローカルにまたはローカルエリアネットワーク（ＬＡＮ）、インターネット等のようなワイドエリアネットワーク（ＷＡＮ）を介して提供され、フローチャートまたはブロック図で指定された操作を実行するための手段を作成すべく、コンピュータ可読命令を実行してよい。プロセッサの例としては、コンピュータプロセッサ、処理ユニット、マイクロプロセッサ、デジタル信号プロセッサ、コントローラ、マイクロコントローラ等を含む。 Computer-readable instructions are used locally or to a local area network (LAN), wide area network (WAN) such as the Internet, to a general purpose computer, a special purpose computer, or the processor or programmable circuit of another programmable data processing device. ) May execute computer-readable instructions to create means for performing the operations specified in the flowchart or block diagram. Examples of processors include computer processors, processing units, microprocessors, digital signal processors, controllers, microcontrollers, and the like.

図１４は、本発明の複数の態様が全体的または部分的に具現化されてよいコンピュータ１２００の例を示す。コンピュータ１２００にインストールされたプログラムは、コンピュータ１２００に、本発明の実施形態に係る装置に関連付けられる操作または当該装置の１または複数のセクションとして機能させることができ、または当該操作または当該１または複数のセクションを実行させることができ、および／またはコンピュータ１２００に、本発明の実施形態に係るプロセスまたは当該プロセスの段階を実行させることができる。そのようなプログラムは、コンピュータ１２００に、本明細書に記載のフローチャートおよびブロック図のブロックのうちのいくつかまたはすべてに関連付けられた特定の操作を実行させるべく、ＣＰＵ１２１２によって実行されてよい。 FIG. 14 shows an example of a computer 1200 in which a plurality of aspects of the present invention may be embodied in whole or in part. The program installed on the computer 1200 can cause the computer 1200 to function as an operation associated with the device according to an embodiment of the invention or as one or more sections of the device, or the operation or the one or more. The section can be run and / or the computer 1200 can be run the process according to an embodiment of the invention or a stage of the process. Such a program may be run by the CPU 1212 to cause the computer 1200 to perform certain operations associated with some or all of the blocks of the flowcharts and block diagrams described herein.

本実施形態によるコンピュータ１２００は、ＣＰＵ１２１２、ＲＡＭ１２１４、グラフィックコントローラ１２１６、およびディスプレイデバイス１２１８を含み、それらはホストコントローラ１２１０によって相互に接続されている。コンピュータ１２００はまた、通信インタフェース１２２２、ハードディスクドライブ１２２４、ＤＶＤ－ＲＯＭドライブ１２２６、およびＩＣカードドライブのような入／出力ユニットを含み、それらは入／出力コントローラ１２２０を介してホストコントローラ１２１０に接続されている。コンピュータはまた、ＲＯＭ１２３０およびキーボード１２４２のようなレガシの入／出力ユニットを含み、それらは入／出力チップ１２４０を介して入／出力コントローラ１２２０に接続されている。 The computer 1200 according to this embodiment includes a CPU 1212, a RAM 1214, a graphic controller 1216, and a display device 1218, which are interconnected by a host controller 1210. The computer 1200 also includes input / output units such as a communication interface 1222, a hard disk drive 1224, a DVD-ROM drive 1226, and an IC card drive, which are connected to the host controller 1210 via the input / output controller 1220. There is. The computer also includes legacy input / output units such as the ROM 1230 and keyboard 1242, which are connected to the input / output controller 1220 via the input / output chip 1240.

ＣＰＵ１２１２は、ＲＯＭ１２３０およびＲＡＭ１２１４内に格納されたプログラムに従い動作し、それにより各ユニットを制御する。グラフィックコントローラ１２１６は、ＲＡＭ１２１４内に提供されるフレームバッファ等またはそれ自体の中にＣＰＵ１２１２によって生成されたイメージデータを取得し、イメージデータがディスプレイデバイス１２１８上に表示されるようにする。 The CPU 1212 operates according to a program stored in the ROM 1230 and the RAM 1214, thereby controlling each unit. The graphic controller 1216 acquires the image data generated by the CPU 1212 in a frame buffer or the like provided in the RAM 1214 or itself so that the image data is displayed on the display device 1218.

通信インターフェース１２２２は、ネットワークを介して他の電子デバイスと通信する。ハードディスクドライブ１２２４は、コンピュータ１２００内のＣＰＵ１２１２によって使用されるプログラムおよびデータを格納する。ＤＶＤ－ＲＯＭドライブ１２２６は、プログラムまたはデータをＤＶＤ‐ＲＯＭ１２０１から読み取り、ハードディスクドライブ１２２４にＲＡＭ１２１４を介してプログラムまたはデータを提供する。ＩＣカードドライブは、プログラムおよびデータをＩＣカードから読み取り、および／またはプログラムおよびデータをＩＣカードに書き込む。 The communication interface 1222 communicates with other electronic devices via the network. The hard disk drive 1224 stores programs and data used by the CPU 1212 in the computer 1200. The DVD-ROM drive 1226 reads the program or data from the DVD-ROM 1201 and provides the program or data to the hard disk drive 1224 via the RAM 1214. The IC card drive reads programs and data from the IC card and / or writes programs and data to the IC card.

ＲＯＭ１２３０はその中に、アクティブ化時にコンピュータ１２００によって実行されるブートプログラム等、および／またはコンピュータ１２００のハードウェアに依存するプログラムを格納する。入／出力チップ１２４０はまた、様々な入／出力ユニットをパラレルポート、シリアルポート、キーボードポート、マウスポート等を介して、入／出力コントローラ１２２０に接続してよい。 The ROM 1230 stores in it a boot program or the like executed by the computer 1200 at the time of activation and / or a program depending on the hardware of the computer 1200. The input / output chip 1240 may also connect various input / output units to the input / output controller 1220 via a parallel port, serial port, keyboard port, mouse port, and the like.

プログラムが、ＤＶＤ－ＲＯＭ１２０１またはＩＣカードのようなコンピュータ可読媒体によって提供される。プログラムは、コンピュータ可読媒体から読み取られ、コンピュータ可読媒体の例でもあるハードディスクドライブ１２２４、ＲＡＭ１２１４、またはＲＯＭ１２３０にインストールされ、ＣＰＵ１２１２によって実行される。これらのプログラム内に記述される情報処理は、コンピュータ１２００に読み取られ、プログラムと、上記様々なタイプのハードウェアリソースとの間の連携をもたらす。装置または方法が、コンピュータ１２００の使用に従い情報の操作または処理を実現することによって構成されてよい。 The program is provided by a computer readable medium such as a DVD-ROM 1201 or an IC card. The program is read from a computer readable medium, installed in a hard disk drive 1224, RAM 1214, or ROM 1230, which is also an example of a computer readable medium, and executed by the CPU 1212. The information processing described in these programs is read by the computer 1200 and provides a link between the program and the various types of hardware resources described above. The device or method may be configured to implement the manipulation or processing of information in accordance with the use of the computer 1200.

For example, when communication is performed between the computer 1200 and an external device, the CPU 1212 may execute a communication program loaded in the RAM 1214 and, based on the processing described in the communication program, instruct the communication interface 1222 to perform communication processing. Under the control of the CPU 1212, the communication interface 1222 reads transmission data stored in a transmission buffer area provided in a recording medium such as the RAM 1214, the hard disk drive 1224, the DVD-ROM 1201, or an IC card, and transmits the read transmission data to the network, or writes reception data received from the network to a reception buffer area or the like provided on the recording medium.

The CPU 1212 may also cause all or a necessary portion of a file or database stored in an external recording medium, such as the hard disk drive 1224, the DVD-ROM drive 1226 (DVD-ROM 1201), or an IC card, to be read into the RAM 1214, and may perform various types of processing on the data in the RAM 1214. The CPU 1212 then writes the processed data back to the external recording medium.

Various types of information, such as various types of programs, data, tables, and databases, may be stored in the recording medium and subjected to information processing. On data read from the RAM 1214, the CPU 1212 may perform various types of processing that are described throughout the present disclosure and specified by the instruction sequence of a program, including various types of operations, information processing, condition judgment, conditional branching, unconditional branching, and information search/replacement, and writes the results back to the RAM 1214. The CPU 1212 may also search for information in a file, a database, or the like in the recording medium. For example, when a plurality of entries, each having an attribute value of a first attribute associated with an attribute value of a second attribute, are stored in the recording medium, the CPU 1212 may search the plurality of entries for an entry whose attribute value of the first attribute matches a specified condition, read the attribute value of the second attribute stored in that entry, and thereby obtain the attribute value of the second attribute associated with the first attribute that satisfies the predetermined condition.
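The entry search described above can be illustrated with a minimal Python sketch. The entry layout (dictionaries with the keys `first` and `second`), the function name `lookup_second_attribute`, and the sample values are assumptions made for illustration only and do not appear in the specification.

```python
from typing import Any, Callable, Dict, List


def lookup_second_attribute(
    entries: List[Dict[str, Any]],
    condition: Callable[[Any], bool],
) -> List[Any]:
    """Return the second-attribute value of every entry whose
    first-attribute value satisfies the given condition."""
    return [entry["second"] for entry in entries if condition(entry["first"])]


# Hypothetical entries: first attribute = vendor key, second = payment method.
entries = [
    {"first": "vendor_a", "second": "transfer"},
    {"first": "vendor_b", "second": "card"},
    {"first": "vendor_a", "second": "cash"},
]

# Condition: the first attribute equals "vendor_a".
matches = lookup_second_attribute(entries, lambda value: value == "vendor_a")
```

Passing the condition as a callable keeps the sketch independent of whether the condition is an exact match, a range, or some other predicate.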

The programs or software modules described above may be stored in a computer-readable medium on or near the computer 1200. A recording medium such as a hard disk or RAM provided in a server system connected to a dedicated communication network or the Internet can also be used as the computer-readable medium, thereby providing the program to the computer 1200 via the network.

Although the present invention has been described above using embodiments, the technical scope of the present invention is not limited to the scope described in those embodiments. It will be apparent to those skilled in the art that various changes or improvements can be made to the above embodiments. It is clear from the claims that embodiments with such changes or improvements can also be included in the technical scope of the present invention.

It should be noted that the processes, such as operations, procedures, steps, and stages, in the devices, systems, programs, and methods shown in the claims, the specification, and the drawings may be executed in any order, unless the order is explicitly indicated by expressions such as "before" or "prior to", and unless the output of an earlier process is used in a later process. Even if the operation flows in the claims, the specification, and the drawings are described using "first", "next", and the like for convenience, this does not mean that they must be carried out in that order.

10 invoice image, 12 invoice image, 20 invoice information, 22 invoice information, 30 feature amount, 40 label, 100 form information generation device, 110 storage unit, 111 trained extraction unit, 112 selection data, 113 reading conditions, 114 correspondence, 120 extraction unit, 130 calculation unit, 140 selection unit, 150 reading unit, 160 generation unit, 200 learning device, 210 storage unit, 211 learning data, 212 learning parameters, 220 extraction unit, 230 similarity calculation unit, 240 loss function calculation unit

Claims

A form information generation device comprising:
an acquisition unit that acquires past feature amounts extracted from past form images in association with past form information based on character information described in the past form images;
an extraction unit that extracts a target feature amount from a target form image;
a calculation unit that calculates similarities between the target feature amount and the past feature amounts;
a selection unit that selects past form information based on the similarities;
a reading unit that reads character information of a predetermined item from the target form image; and
a generation unit that generates and displays new form information associated with the target form image from the past form information selected by the selection unit and the character information read by the reading unit.
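The calculation and selection recited in this claim can be sketched roughly as follows. The claim does not fix a similarity measure, so cosine similarity is used here purely as an assumption; the record layout (`feature`, `info`) and the function names are likewise hypothetical.

```python
import math
from typing import Any, Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_past_form(
    target_feature: Sequence[float],
    past_records: List[Dict[str, Any]],
) -> Tuple[float, Dict[str, Any]]:
    """Return (similarity, record) for the past record most similar to the target.

    Raises ValueError if past_records is empty.
    """
    scored = ((cosine_similarity(target_feature, r["feature"]), r) for r in past_records)
    return max(scored, key=lambda pair: pair[0])


# Hypothetical past feature amounts stored with their form information.
past_records = [
    {"feature": [1.0, 0.0], "info": {"vendor": "Vendor A"}},
    {"feature": [0.0, 1.0], "info": {"vendor": "Vendor B"}},
]

similarity, best = select_past_form([0.9, 0.1], past_records)
```

In practice the feature amounts would come from the extraction unit; the two-dimensional vectors above only stand in for them.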

The form information generation device according to claim 1, wherein the extraction unit corresponds to one obtained by deep metric learning.

The form information generation device according to claim 1 or 2, wherein the extraction unit corresponds to one obtained by learning using a function that takes as input the labels and the similarity of a pair of form images, and whose value decreases as the similarity increases when the labels are the same and decreases as the similarity decreases when the labels are different.
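One function with the behavior recited in this claim is a margin-based pairwise loss, sketched below. The claim does not name a specific function; the linear form, the `margin` parameter, and the name `pairwise_loss` are assumptions chosen for illustration, with similarity taken to lie in the range -1 to 1.

```python
def pairwise_loss(same_label: bool, similarity: float, margin: float = 0.5) -> float:
    """Loss for one pair of form images.

    Same labels: the loss shrinks as the similarity grows.
    Different labels: the loss shrinks as the similarity falls,
    reaching zero once the similarity is at or below the margin.
    """
    if same_label:
        return 1.0 - similarity
    return max(0.0, similarity - margin)


# Same-label pair: a higher similarity gives a lower loss.
loss_same_close = pairwise_loss(True, 0.9)
loss_same_far = pairwise_loss(True, 0.2)

# Different-label pair: a lower similarity gives a lower loss.
loss_diff_close = pairwise_loss(False, 0.9)
loss_diff_far = pairwise_loss(False, 0.1)
```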

The form information generation device according to claim 1 or 2, wherein the past feature amounts and the extraction unit correspond to those obtained by learning using a function that takes as input a first label of one form image and the similarities between the feature amount obtained from that form image and feature amounts corresponding to a plurality of preset second labels, and whose value decreases as the similarity increases when the first label and the second label are the same label and decreases as the similarity decreases when the first label and the second label are different labels.
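A softmax cross-entropy over the similarities to the per-label feature amounts is one function that behaves as this claim describes. It is used below only as an assumed stand-in, since the claim does not name the function; the `scale` parameter, the dictionary input, and the name `proxy_loss` are hypothetical.

```python
import math
from typing import Dict


def proxy_loss(first_label: str, similarities: Dict[str, float], scale: float = 10.0) -> float:
    """Softmax cross-entropy over similarities to per-label feature amounts.

    similarities maps each preset second label to the similarity between
    the image's feature amount and that label's feature amount.
    """
    logits = {label: scale * s for label, s in similarities.items()}
    peak = max(logits.values())  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in logits.values()))
    return log_sum - logits[first_label]


base = proxy_loss("a", {"a": 0.9, "b": 0.1})
# Lower similarity to the matching label raises the loss.
weaker_match = proxy_loss("a", {"a": 0.5, "b": 0.1})
# Higher similarity to a non-matching label also raises the loss.
stronger_mismatch = proxy_loss("a", {"a": 0.9, "b": 0.6})
```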

The form information generation device according to any one of claims 1 to 4, wherein the generation unit generates the new form information by rewriting character information of the selected past form information with the read character information.

The form information generation device according to claim 5, wherein, for character information that has not been rewritten, the generation unit uses the character information of the selected past form information in the new form information.

The form information generation device according to any one of claims 1 to 6, wherein the generation unit distinguishably displays a portion of the new form information that is the same as or different from the past form information.
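The rewriting, reuse, and distinguishable display recited in the three preceding claims can be sketched as a dictionary merge that also reports which items changed. The field names, the sample values, and the function name `generate_new_form_info` are assumptions for illustration.

```python
from typing import Dict, Set, Tuple


def generate_new_form_info(
    past_info: Dict[str, str],
    read_fields: Dict[str, str],
) -> Tuple[Dict[str, str], Set[str]]:
    """Overwrite past form information with newly read items.

    Items that were not read keep their past values. The returned set
    names the items whose values differ from the past form information,
    so that a display layer could highlight them.
    """
    new_info = dict(past_info)      # start from the selected past form information
    new_info.update(read_fields)    # rewrite with the read character information
    changed = {k for k, v in read_fields.items() if past_info.get(k) != v}
    return new_info, changed


# Hypothetical past form information and OCR result.
past_info = {"vendor": "Vendor A", "account": "123", "amount": "1000", "date": "2020-05-31"}
read_fields = {"amount": "1200", "date": "2020-06-30"}

new_info, changed = generate_new_form_info(past_info, read_fields)
```

Copying `past_info` before updating leaves the stored past form information unmodified.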

The form information generation device according to any one of claims 1 to 7, wherein the generation unit displays the new form information according to the reading accuracy of the character information read by the reading unit.

The form information generation device according to any one of claims 1 to 8, wherein the selection unit issues a warning when there is no past feature amount having a similarity equal to or higher than a predetermined threshold value.
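The threshold check in this claim might look like the following sketch. The threshold value, the use of Python's `warnings` module, and returning `None` when nothing qualifies are all assumptions; the claim only states that a warning is issued.

```python
import warnings
from typing import Dict, Optional


def select_with_threshold(similarities: Dict[str, float], threshold: float = 0.8) -> Optional[str]:
    """Return the id of the most similar past form, or None with a warning
    when no similarity reaches the threshold.

    similarities maps a past form id to its similarity with the target.
    """
    best_id = max(similarities, key=similarities.get, default=None)
    if best_id is None or similarities[best_id] < threshold:
        warnings.warn("no past form with similarity at or above the threshold")
        return None
    return best_id


# Hypothetical similarities; "form_b" clears the threshold.
selected = select_with_threshold({"form_a": 0.3, "form_b": 0.95}, threshold=0.8)
```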

A device for generating billing information to be read by an accounting system, wherein
the billing information and the invoice image corresponding to the billing information each include first information and second information, the device comprising:
an acquisition unit that stores past invoice information in association with past invoice images;
a first information acquisition unit that refers to the acquisition unit to select a past invoice image similar to a new invoice image and extracts the first information from the selected past invoice information;
a second information acquisition unit that reads the second information from the new invoice image with an optical character recognition device (OCR); and
a generation unit that generates new billing information from the first information extracted by the first information acquisition unit and the second information read by the second information acquisition unit.

A method comprising:
a step of acquiring past feature amounts extracted from past form images in association with past form information based on character information described in the past form images;
a step of extracting a target feature amount from a target form image;
a step of calculating similarities between the target feature amount and the past feature amounts;
a step of selecting past form information based on the similarities;
a step of reading character information of a predetermined item from the target form image; and
a step of generating and displaying new form information associated with the target form image from the past form information selected in the selecting step and the character information read in the reading step.

A program that realizes the method according to claim 11 on a computer.

A method for generating billing information to be read by an accounting system, wherein
the billing information and the invoice image corresponding to the billing information each include first information and second information, the method comprising:
a step of acquiring past invoice information in association with past invoice images;
a step of selecting a past invoice image similar to a new invoice image by referring to the acquired past invoice images and extracting the first information from the selected past invoice information;
a step of reading the second information from the new invoice image with an optical character recognition device (OCR); and
a step of generating new billing information from the first information and the second information.

A program that realizes the method according to claim 13 on a computer.