JP2022139564A

JP2022139564A - Information processing device, information processing device control method, and program

Info

Publication number: JP2022139564A
Application number: JP2021040008A
Authority: JP
Inventors: 浩太郎松田; Kotaro Matsuda
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-03-12
Filing date: 2021-03-12
Publication date: 2022-09-26

Abstract

To simplify and facilitate a document processing flow and reduce the user's work effort. SOLUTION: A business form template is learned in advance in order to determine the type of a business form. It is then determined whether a scanned document matches the learned business form template. When a match with the learned business form template is confirmed, a recognition confidence factor for the document is calculated based on the confidence factor of the business form determination and the confidence factor of the character recognition results. The recognition confidence factor is classified as "very high," "high," or "medium-low." When the document's recognition confidence factor is "very high," the document is registered without the user confirming that the OCR area image matches the recognition result character string. SELECTED DRAWING: Figure 3

Description

The present invention relates to an information processing device, a control method for an information processing device, and a program.

Conventionally, techniques are known for extracting character strings from scanned or photographed images using optical character recognition (OCR). One example is an application that classifies scanned documents by form type, such as purchase orders and invoices, and fills in the business data required for each form type from the character strings recognized by OCR processing. In such applications, there is a use case in which the recognized character strings extracted from a scanned document are stored and managed as metadata and used as business data.
In image recognition by an OCR engine, the image quality of scanned documents and the types of forms vary, so it is difficult to achieve 100% accuracy in either form type determination or OCR character recognition. If, instead, a value called a confidence, which indicates the certainty of recognition, is calculated for each scanned document and used effectively to automate part or all of the scan job processing on behalf of the user, work efficiency can be expected to improve.
Patent Document 1 discloses a technique that uses feature amounts of each document to improve the accuracy of processing that sorts data into folders.

Patent Document 1: JP 2014-149743 A

However, the technique disclosed in Patent Document 1 cannot reduce the effort the user spends on matching confirmation, so a sufficient improvement in work efficiency cannot be expected.

In view of the problems described above, an object of the present invention is to simplify and facilitate the document processing flow and to reduce the user's work effort.

An information processing apparatus according to the present invention comprises: extraction means for extracting a character string by performing character recognition on a scanned document containing the character string; determination means for determining whether the scanned document matches a learned form stored in advance; calculation means for calculating a confidence of the scanned document; and control means for registering the character string extracted by the extraction means as metadata when the determination means determines that the scanned document matches a learned form and the confidence calculated by the calculation means is equal to or greater than a first threshold.

According to the present invention, the document processing flow can be simplified and facilitated, and the user's work effort can be reduced.

FIG. 1 is a diagram showing an example of the system configuration and network configuration of the embodiment.
FIG. 2 is a block diagram showing an example of the hardware configuration of the embodiment.
FIG. 3 is a block diagram showing an example of the software and hardware configuration of the embodiment.
FIG. 4 is a diagram for explaining the UI of the client application.
FIG. 5 is a diagram for explaining a UI for associating an OCR area with metadata.
FIG. 6 is a diagram for explaining a UI for saving a form template.
FIG. 7 is a diagram for explaining the scan job list UI of the client application.
FIG. 8 is a diagram for explaining a message notifying the arrival of a new job.
FIG. 9 is a diagram for explaining a method of displaying new job notifications and automatic transmission job notifications.
FIG. 10 is a diagram for explaining the notification center UI.
FIG. 11 is a flowchart showing an example of a scanned document processing procedure.
FIG. 12 is a flowchart showing an example of a processing procedure for an expired scanned document.

The best mode for carrying out the present invention is described below with reference to the drawings.
First, the system configuration and network configuration of an information processing system 1 according to the present embodiment will be described with reference to FIG. 1.
The information processing system 1 according to this embodiment has a scanned document processing server 111, a client terminal 121, and a business server 131. The scanned document processing server 111, the client terminal 121, and the business server 131 are communicably connected via a network 101 such as the Internet or an intranet.
The scanned document processing server 111 is a server for processing scanned documents.
The client terminal 121 is, for example, a personal computer, a laptop computer, a tablet computer, or a smartphone.
The business server 131 is a server that receives data from the scanned document processing server 111 and performs various kinds of processing.

Next, a hardware configuration example of the scanned document processing server 111, the client terminal 121, and the business server 131 will be described with reference to FIG. 2.
The scanned document processing server 111, the client terminal 121, and the business server 131 each have a user I/F 201, a network I/F 202, a CPU 203, a ROM 204, a RAM 205, a secondary storage device 206, and an internal bus 207. The units are connected to one another via the internal bus 207.
The network I/F 202 communicates with other computers and network devices via the network 101. The communication method may be wired or wireless.

The ROM 204 stores embedded programs and data. The RAM 205 is a temporary memory area. The secondary storage device 206 is a storage device such as an HDD or a flash memory.
The CPU 203 executes programs read from the ROM 204, the RAM 205, the secondary storage device 206, and the like. The user I/F 201 inputs and outputs information and signals via a display, keyboard, mouse, buttons, touch panel, and the like.
A computer that lacks this hardware may be connected to and operated from another computer via remote desktop, remote shell, or the like.

Next, a software configuration example of the information processing system 1 according to this embodiment will be described with reference to FIG. 3. The software installed on each piece of hardware is executed by the respective CPU 203, and the software components can communicate with one another, as indicated by the network connection arrows in the figure.

First, the scanned document processing server 111 has a scanned document processing application 311, a data store 321, and a backend application 331.
The scanned document processing application 311 is an application installed on the scanned document processing server 111. In this embodiment it is described as operating as a web application server, but it may be implemented in another form.
The scanned document processing application 311 has an API (Application Programming Interface) 312 and a Web UI (User Interface) 313.

The data store 321 stores data used by the scanned document processing application 311 or by the backend application 331 described later. The data store 321 has a document storage unit 322, a job queue 323, a management unit 324, and a result storage unit 325.
The document storage unit 322 holds the file of the scanned document itself as an image file such as JPEG or a document file such as PDF (Portable Document Format).
The job queue 323 holds a queue that manages jobs waiting for the metadata input processing described later.

The management unit 324 holds, for each scanned document, the list of metadata that must be added, the name of each metadata item, the value format (character string, number, and so on), the color information described later, and the like.
The result storage unit 325 stores OCR processing results and form determination results. The result storage unit 325 also holds, for each scanned document, the OCR area information associated with each metadata item, the entered metadata values, and the like.

The backend application 331 handles processing that can be executed sequentially in the background, as described below. The backend application 331 has an OCR processing unit 332, a form processing unit 333, a communication unit 334, and a calculation unit 335.
The OCR processing unit 332 acquires a scanned document from the document storage unit 322 and performs OCR. In the OCR processing, the OCR processing unit 332 extracts the start point coordinates, width, and height of each area recognized as a character string, together with the character string recognized in that area.
The form processing unit 333 determines the form type using information such as the OCR-processed scanned document, the pattern of areas in the OCR result, and the recognition result character strings. Any technique, such as pattern recognition or machine learning, may be used for the determination.

The calculation unit 335 calculates the recognition confidence for a scanned document based on the confidence of the form determination by the form processing unit 333 and the confidence of the recognition result character strings from the OCR processing by the OCR processing unit 332. Here, the confidence of the form determination is, for example, a similarity score (0% to 100%) between the scanned document and the learned form template used for form type determination. The confidence of a recognition result character string is, for example, the probability (0% to 100%) that the character string recognized from an OCR area matches the correct character string.
The recognition confidence may be a percentage score with an upper limit of 100%, a level derived from the score such as high, medium, or low, or any other representation. The recognition confidence may be calculated as the average of the form determination confidence and the confidences of the recognition result character strings of all OCR areas, or by another method. For example, when the average would yield a high score but some OCR areas have a markedly low confidence, a calculation method that lowers the overall score may be used. In any case, the point of this embodiment is the processing that uses the recognition confidence, so any calculation method may be adopted.
The communication unit 334 transmits the scanned document and its processing result to the external business server 131. The communication unit 334 may be omitted if the scanned document and its processing result do not need to be transmitted externally.
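The calculation described above can be illustrated with a minimal sketch. It averages the form determination confidence with the per-area OCR confidences and, as one possible variant, scales the result down when any single score is markedly low. The function name, the `floor` and `penalty` parameters, and their default values are assumptions made for illustration; the embodiment leaves the calculation method open.

```python
from statistics import mean


def recognition_confidence(form_confidence, ocr_confidences, floor=0.5, penalty=0.5):
    """Combine a form-matching score with per-area OCR scores (all in 0.0-1.0).

    `floor` and `penalty` are hypothetical tuning values, not taken from the source.
    """
    scores = [form_confidence, *ocr_confidences]
    score = mean(scores)
    # If any single score is markedly low, lower the overall score
    # even though the plain average might still look acceptable.
    if min(scores) < floor:
        score *= penalty
    return score


uniform = recognition_confidence(0.9, [0.9, 0.9])
one_bad_area = recognition_confidence(0.98, [0.99, 0.2])
```

With these placeholder values, a document whose scores are all 0.9 keeps a score of about 0.9, while a document with one area at 0.2 ends up well below its plain average.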

Next, the client terminal 121 has a client application 351. In this embodiment, the client application 351 runs the web application of the scanned document processing application 311. One form of providing the client application 351 is to display the Web UI 313 in a browser and run the web application by exchanging the necessary data with the API 312. Alternatively, the client application 351 may be a computer or smartphone application created to exchange the necessary data with the API 312.

Next, the business server 131 has a business application 361 and a storage 362.
The business application 361 is an application executed by the business server 131. The storage 362 stores data used by the business application 361. The business application 361 may be any business application, such as file management, document management, order receiving, or accounting. The business application 361 is needed when the results processed by the scanned document processing server 111 are to be received, processed, and stored. Otherwise, it may be omitted.

Next, the process of learning a form template used in the form type determination process will be described with reference to FIGS. 4 to 6. Specifically, in the form type determination process described later, the form processing unit 333 determines whether a scanned document matches a learned form template. The processing below registers a new learned form template when the scanned document does not match any learned form template, that is, when it is an unlearned form.
First, the UI of the client application 351 will be described with reference to FIG. 4.
A preview pane 401 contains a preview page image 402. In the preview pane 401, the preview page image 402 can be scrolled or zoomed to display any position. A metadata pane 411 is a pane for displaying and editing the list of metadata to be assigned for the form type determined for the scanned document. The example of FIG. 4 shows a state in which the scanned document has been determined to be an unlearned form and three metadata items need to be entered.

An area 421 illustrates the orthogonal coordinate system that relates the preview page image 402 to the OCR areas. In the example shown in FIG. 4, there are three OCR areas resulting from OCR processing: areas 422, 423, and 424. The preview page image 402 contains three or more OCR areas, but the others are omitted from this description. Each OCR area is identified by its start point coordinates, width, and height, as indicated by the shaded rectangles. For example, the area 424 is expressed in pixels as start point coordinates (1200, 700), width 720, and height 120.
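As an illustration of this representation, the sketch below models an OCR area by its start point, width, and height, and converts it to a (left, top, right, bottom) box. The class name `OcrRegion` and its methods are hypothetical; only the example values for area 424 come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrRegion:
    """An OCR area given by start point, width, and height, in pixels."""
    x: int
    y: int
    width: int
    height: int
    text: str = ""

    def box(self):
        """Return (left, top, right, bottom), a common form for cropping an image."""
        return (self.x, self.y, self.x + self.width, self.y + self.height)

    def contains(self, px, py):
        """True if pixel (px, py) lies inside the area (right and bottom edges excluded)."""
        return self.x <= px < self.x + self.width and self.y <= py < self.y + self.height


# Area 424 from the example: start point (1200, 700), width 720, height 120.
region_424 = OcrRegion(x=1200, y=700, width=720, height=120)
```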

FIG. 5 is a diagram for explaining a UI for associating an OCR area of the preview page image with metadata. An area 502 is the OCR area associated with a metadata item. In the area 502, a colored rectangle is displayed translucently so that the user can easily identify it as an OCR area. Any other means of coloring, such as a colored border, may be used instead.
An area 512 displays the color assigned to the metadata name 511, "CustomerName". An area 513 displays a cropped image of the area 502. If no OCR area has been associated yet, a "+" button is displayed in the area 513, as shown in FIG. 4. A control 514 displays the character string extracted from the area 502 and allows it to be edited.
When all required metadata has been entered, a registration button 515 is enabled. When the user selects the registration button 515, the screen transitions to the final confirmation dialog shown in FIG. 6.

FIG. 6 is a diagram for explaining a UI for naming and saving a form template used for form type determination. A dialog 601 is the final confirmation dialog. A control 602 is a text control for entering a form template name. A control 603 is a selection control for specifying the destination to which the scanned document is sent. A save/send button 604 saves the form template name and sends the scan job to the destination.

Once the form template has been learned as described above, the next time a scanned document of the same form type arrives, the form type determination process by the form processing unit 333 determines that the scanned document matches the learned form template. Character strings are then entered into each metadata item from the OCR areas defined in the learned form template, and the same destination is selected, so the scan job can be sent simply and easily.

Next, the flow of scanned document processing that uses the document recognition confidence in the scanned document processing server 111 will be described with reference to the flowchart of FIG. 11.
This processing starts when the user uploads a scanned document file to the document storage unit 322 from the client terminal 121 or the like. The file may be uploaded from the client application 351 via the API 312, or directly from a device such as a scanner.
A file may contain only a single page image, or it may contain multiple page images. In the latter case, the file may be split page by page into multiple scanned documents. Alternatively, barcodes, separator sheets, or the like may be detected to split the file into multiple scanned documents of arbitrary page counts. In either case, the processing flow below is described on the premise that N scanned documents are processed.
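The two splitting strategies mentioned above can be sketched as follows. The function `split_pages`, the `is_separator` callback, and the choice to drop separator pages from the output are assumptions for illustration; how a barcode or separator sheet is actually detected is outside this sketch.

```python
def split_pages(pages, is_separator=None, pages_per_document=1):
    """Split a list of page images into scanned documents (lists of pages).

    Without `is_separator`, pages are grouped in fixed-size chunks.
    With it, a new document starts after every page the callback flags,
    and the flagged page itself is discarded (an assumption of this sketch).
    """
    if is_separator is None:
        return [pages[i:i + pages_per_document]
                for i in range(0, len(pages), pages_per_document)]

    documents, current = [], []
    for page in pages:
        if is_separator(page):
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents


per_page = split_pages(["p1", "p2", "p3"])
by_separator = split_pages(["p1", "SEP", "p2", "p3", "SEP"],
                           is_separator=lambda page: page == "SEP")
```

Either way, the result is a list of N scanned documents that the loop in the flowchart can iterate over.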

First, in step S1101, the backend application 331 splits the single file uploaded to the document storage unit 322 into N scanned documents, as described above. Steps S1102 to S1114 are then executed as a loop from n = 1 to N.
Next, in step S1102, the backend application 331 stores the scan job for document [n] in the job queue 323, and the job is processed by the OCR processing unit 332, the form processing unit 333, and the calculation unit 335.
Specifically, the OCR processing unit 332 first performs OCR processing on document [n] and stores the recognition result character strings in the result storage unit 325. Next, the form processing unit 333 determines the form type of document [n] based on the result of the OCR processing and searches for a learned form template that matches that form type. If a learned form template for the same form exists, the form processing unit 333 stores that learned form template in the result storage unit 325. If no such learned form template exists, the form processing unit 333 stores a form recognition result indicating that no match was found in the result storage unit 325.
When document [n] matches a learned form template, the calculation unit 335 calculates the recognition confidence of document [n] and likewise stores it in the result storage unit 325.
In step S1103, the backend application 331 determines whether the form recognition result of step S1102 indicates a match with a learned form template. If the backend application 331 determines that there was a match, the process proceeds to step S1104. If it determines that there was no match, the process proceeds to step S1111.

Next, in step S1104, the backend application 331 acquires the recognition confidence of document [n] from the result storage unit 325.
Next, in step S1105, the backend application 331 evaluates the recognition confidence of document [n]. In this embodiment, the recognition confidence is classified into three levels, for example: "very high", "high", and "medium-low". That is, the level is "very high" if the recognition confidence is equal to or greater than a first threshold, "high" if it is less than the first threshold and equal to or greater than a second threshold, and "medium-low" if it is less than the second threshold.
If the backend application 331 determines that the recognition confidence is "very high", it treats the user's review of the form recognition result and the recognition result character strings as unnecessary, and the process proceeds to step S1106.
If the backend application 331 determines that the recognition confidence is "high" or "medium-low", it treats the user's review of the form recognition result and the recognition result character strings as necessary, and the process proceeds to step S1110. When the recognition confidence is determined to be "high", the processing is arranged so that the user's operation is as simple and easy as possible compared with the "medium-low" case.
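The three-level classification of step S1105 can be sketched as below. The threshold values 0.95 and 0.80 are placeholders chosen for illustration, since the description does not give concrete values; the names `ConfidenceLevel`, `classify`, and `requires_review` are likewise hypothetical.

```python
from enum import Enum


class ConfidenceLevel(Enum):
    VERY_HIGH = "very high"
    HIGH = "high"
    MEDIUM_LOW = "medium-low"


def classify(confidence, first_threshold=0.95, second_threshold=0.80):
    """Map a recognition confidence (0.0-1.0) to one of three levels.

    The default thresholds are illustrative assumptions only.
    """
    if confidence >= first_threshold:
        return ConfidenceLevel.VERY_HIGH
    if confidence >= second_threshold:
        return ConfidenceLevel.HIGH
    return ConfidenceLevel.MEDIUM_LOW


def requires_review(level):
    """Only "very high" documents skip the user's review."""
    return level is not ConfidenceLevel.VERY_HIGH
```

Note that a value exactly equal to a threshold falls into the higher level, matching the "equal to or greater than" wording of the description.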

Next, in step S1106, the backend application 331 uses the OCR area coordinates and sizes defined in the learned form template that matched document [n] to extract the recognition result character strings of the OCR processing as metadata, and registers them.
In step S1107, the backend application 331 transmits document [n] and the metadata, via the communication unit 334, to the destination defined in the learned form template.
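One way to picture the extraction in step S1106 is shown below: for each metadata field in the template, pick the OCR result whose rectangle overlaps the template-defined rectangle the most. Matching by intersection-over-union and the `min_overlap` cutoff are assumptions of this sketch, not the method stated in the description; regions are (x, y, width, height) tuples.

```python
def iou(a, b):
    """Intersection-over-union of two (x, y, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    return inter / (aw * ah + bw * bh - inter)


def extract_metadata(template_fields, ocr_results, min_overlap=0.5):
    """Fill each template field from the best-overlapping OCR result.

    template_fields: {field name: region}; ocr_results: [(region, text), ...].
    Fields with no sufficiently overlapping result are set to None.
    """
    metadata = {}
    for name, region in template_fields.items():
        best = max(ocr_results, key=lambda r: iou(region, r[0]), default=None)
        if best is not None and iou(region, best[0]) >= min_overlap:
            metadata[name] = best[1]
        else:
            metadata[name] = None
    return metadata


template = {"CustomerName": (1200, 700, 720, 120)}
results = [((1205, 702, 710, 118), "Example Corp"), ((100, 100, 300, 50), "Invoice")]
extracted = extract_metadata(template, results)
```

The overlap tolerance allows for small shifts between the learned template and a newly scanned page; a stricter or looser cutoff would be a tuning decision.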

Next, in step S1108, the backend application 331 dequeues document [n] from the job queue 323. Then, in step S1109, the backend application 331 generates email notification content reporting that document [n] has been sent.

Meanwhile, in step S1110, the backend application 331 extracts the recognition result character strings of the OCR processing as metadata, as in step S1106. However, because these character strings are not necessarily correct, they are extracted as provisional metadata. Next, in step S1111, the backend application 331 updates the status of the job in the job queue 323 to recognition completed. If the recognition confidence was determined to be "high", the process proceeds to step S1112. If the recognition confidence was determined to be "medium-low", or if it was determined in step S1103 that the document does not match any learned form template, the process proceeds to step S1113.
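The branching described for steps S1103 through S1113 can be summarized as a small routing function that returns the sequence of step labels a document passes through. The function name `route` and the use of plain strings for the levels are assumptions; step S1113 appears only as a label because its content is not described in this part of the text.

```python
def route(matched_template, level=None):
    """Return the step labels visited after S1102 for one scanned document.

    matched_template: result of the S1103 check.
    level: "very high", "high", or "medium-low" (ignored when there is no match).
    """
    if not matched_template:
        # No learned template matched: skip confidence handling entirely.
        return ["S1103", "S1111", "S1113"]
    if level == "very high":
        # Registered and sent without user review.
        return ["S1103", "S1104", "S1105", "S1106", "S1107", "S1108", "S1109"]
    steps = ["S1103", "S1104", "S1105", "S1110", "S1111"]
    steps.append("S1112" if level == "high" else "S1113")
    return steps


auto_path = route(True, "very high")
review_path = route(True, "high")
unlearned_path = route(False)
```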

Next, in step S1112, the backend application 331 generates email notification content reporting that the job is ready for review. The email notification content generated in step S1112 will now be described with reference to FIG. 8.

FIG. 8 shows an example of message content 801 of an email notifying the user that a new scan job has arrived. One notification message is created per scan job, and once the email notification in step S1114 (described later) has been sent, the user can check the content shown in FIG. 8. A thumbnail 810 is a thumbnail of the scanned image. A template name 811 is the name of the learned form template that matched the scanned document.
A confidence 812 is the recognition confidence of the scanned document. A destination 813 is the destination to which the scanned document is sent. A file name 814 is the file name of the scanned document. An item name 815 is the name of a metadata item. An image 816 is the image of the OCR area. A character string 817 is the recognition result character string of the OCR processing. By displaying the thumbnail 810 through the character string 817 within the message content 801 on the client terminal 121, the user can perform matching confirmation of the OCR recognition result on the spot.
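The items 810 through 817 listed above can be pictured as a simple data structure, sketched below with a plain-text rendering. All class, field, and method names are hypothetical, the sample values are invented for illustration, and a real message would presumably embed images rather than print paths.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRow:
    item_name: str        # item name 815
    region_image: str     # image 816, here a hypothetical path to the cropped OCR area
    recognized_text: str  # character string 817


@dataclass
class ReviewNotification:
    thumbnail: str        # thumbnail 810
    template_name: str    # template name 811
    confidence: str       # confidence 812
    destination: str      # destination 813
    file_name: str        # file name 814
    rows: list = field(default_factory=list)

    def to_text(self):
        """Render the notification as plain text, one metadata row per line."""
        lines = [
            f"Template: {self.template_name}",
            f"Confidence: {self.confidence}",
            f"Destination: {self.destination}",
            f"File: {self.file_name}",
        ]
        for row in self.rows:
            lines.append(f"{row.item_name}: {row.recognized_text} [{row.region_image}]")
        return "\n".join(lines)


note = ReviewNotification(
    thumbnail="thumb.png", template_name="Invoice A", confidence="high",
    destination="Accounting", file_name="scan_0001.pdf",
    rows=[MetadataRow("CustomerName", "crop_1.png", "Example Corp")],
)
```

Placing each OCR area image next to its recognized string is what lets the reader compare the two at a glance.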

Registration button 821 is a URL link that triggers the same processing as the transmission processing of the save/send button 604 described above. If the user judges from the matching check that the result is correct, the user selects the registration button 821. When the registration button 821 is selected, a browser is launched and a request is sent to the scanned document processing application 311 indicated by the URL. The scanned document processing application 311 processes the scan job in the job queue 323 and registers the scanned document and metadata in the business application 361 via the communication unit 334. The metadata registered here is the confirmed character string 817.
Proofreading button 822 is a link button to the proofreading screen. If the user finds a misrecognized character when checking the image 816 against the character string 817 and selects the proofreading button 822, the client application 351 is launched and the scan job list UI is displayed in the browser. The user corrects the metadata character strings on the scan job list UI as needed.

The scan job list UI of the client application 351 is described below with reference to FIG. 7. The scan job list UI 701 can display the scan jobs assigned to the user in list form, as one or more rows.
Column 711 displays thumbnails of the scan jobs. Column 712 displays the file names of the scan jobs. Column 713 displays the creation dates and times of the scan jobs. Column 714 displays the name of the learned form template that matched as a result of the form type determination of the scanned document. Column 715 displays the transmission destinations of the scan jobs. Update button 716 is a button for refreshing the scan job list.

Metadata pane 721 displays the list of metadata required for each form type. Field 722 is a text field that displays the name of a metadata item. Image 723 is an image of the OCR area associated with the metadata.
Control 724 is a text edit control for entering and correcting a metadata value. Area 725, like area 512 in FIG. 5, is an area that displays a colored rectangle for each metadata item so that the user can easily tell the items apart.

Registration button 726 is a button for registering the scanned document at the transmission destination.
When the proofreading button 822 is selected on the screen of FIG. 8, the row of the scan job to be processed is first selected in the scan job list of the scan job list UI 701. Then, based on the learned form template that matched the scanned document, the metadata pane 721 on the right side displays, for each metadata item, the image 723 of the defined OCR area and the control 724 containing the recognition result character string of the OCR processing. The user can correct the metadata value in the control 724. When the correction is complete and the user selects the registration button 726, the scanned document is registered at the destination displayed in column 715.

Meanwhile, in step S1113, the backend application 331 generates email notification content informing the user that recognition is complete and that the user can start processing document [n]. This content includes only the proofreading button 822 of FIG. 8, and once the email is sent in step S1114 (described later), the user can view the content. If the recognition confidence was determined to be "medium-low" in step S1105, selecting the proofreading button 822 opens the scan job list UI of the client application 351 shown in FIG. 7. If, on the other hand, it was determined in step S1103 that the scanned document does not match a learned form template, selecting the proofreading button 822 opens the screen of FIG. 4.
Then, in step S1114, the backend application 331 sends the email notification generated in step S1112 or step S1113.
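The difference between the two kinds of mail content (steps S1112 and S1113) can be illustrated as follows. This is a minimal sketch under stated assumptions: the host name, the URL paths, and the names `MailContent` and `build_mail_content` are hypothetical and do not appear in the source; only the choice of links per case follows the description.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

BASE_URL = "https://scan.example.invalid"  # hypothetical host


@dataclass
class MailContent:
    """Simplified stand-in for message content 801."""
    body_fields: Dict[str, str] = field(default_factory=dict)
    links: Dict[str, str] = field(default_factory=dict)


def build_mail_content(job_id: str, template_matched: bool,
                       confidence: Optional[str],
                       metadata: Dict[str, str]) -> MailContent:
    content = MailContent()
    if template_matched and confidence == "high":
        # Step S1112: show the recognized strings plus both buttons.
        content.body_fields = dict(metadata)
        content.links["register"] = f"{BASE_URL}/jobs/{job_id}/register"
        content.links["proofread"] = f"{BASE_URL}/jobs/{job_id}"
    elif template_matched:
        # Step S1113, "medium-low": proofreading link to the job list UI only.
        content.links["proofread"] = f"{BASE_URL}/jobs/{job_id}"
    else:
        # Step S1113, no template match: link to the screen of FIG. 4
        # (the path is an assumption made for this sketch).
        content.links["proofread"] = f"{BASE_URL}/fig4-screen/{job_id}"
    return content
```

Only the "high" case carries a register link, which mirrors the statement that the step S1113 content includes the proofreading button alone.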

After the processing of steps S1102 to S1114 has been performed on the N scanned documents, notifications about the N scanned documents are sent to the client terminal 121 in the subsequent steps.
First, in step S1115, the backend application 331 notifies the scan job list UI, as new jobs, of the scanned documents whose recognition confidence was determined to be "high" or "medium-low" in step S1105, or for which the determination in step S1103 was "No".
In step S1116, the backend application 331 notifies the scan job list UI 901, as automatic transmission jobs, of the scanned documents whose recognition confidence was determined to be "very high" in step S1105.

The method of displaying new job notifications and automatic transmission job notifications on the scan job list UI is described below with reference to FIG. 9. The screen of the scan job list UI 901 shown in FIG. 9 is displayed on the client terminal 121 when the client application 351 is launched by a user operation.
First, the number of new jobs notified in step S1115 is displayed in the update button 911. When the user selects the update button 911, the client application 351 sends a request to the scanned document processing application 311 to acquire the latest scan job list. On receiving the response, the client application 351 updates the scan job list UI 901 to the latest state. In the example of FIG. 9, two new jobs have arrived in the job queue 323, and an additional message (two new jobs) is shown in the update button 911. The client application 351 checks the number of newly arrived jobs with the scanned document processing application 311 in the background. If the number of newly arrived jobs has changed, the client application 351 updates the message in the update button 911.
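The background check described above amounts to polling a count and refreshing the button message only when the count changes. The following sketch illustrates that idea. The class `UpdateButton`, the function `poll_once`, and the label wording are hypothetical, and the callable `fetch_new_job_count` stands in for the request to the scanned document processing application 311.

```python
from typing import Callable


class UpdateButton:
    """Simplified stand-in for update button 911."""

    def __init__(self) -> None:
        self.new_job_count = 0
        self.label = "Update"

    def apply_count(self, count: int) -> bool:
        """Refresh the label only if the count changed; return True if it did."""
        if count == self.new_job_count:
            return False
        self.new_job_count = count
        if count == 0:
            self.label = "Update"
        else:
            noun = "job" if count == 1 else "jobs"
            self.label = f"Update ({count} new {noun})"
        return True


def poll_once(fetch_new_job_count: Callable[[], int],
              button: UpdateButton) -> bool:
    """One background polling cycle (the scheduling itself is omitted)."""
    return button.apply_count(fetch_new_job_count())
```

In a real client the call to `poll_once` would run on a timer or background task; that scheduling is left out here because the source does not specify it.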

Message 912 displays the number of automatic transmission jobs notified in step S1116. The client application 351 checks the number of automatic transmission jobs with the scanned document processing application 311 in the background. If the number of automatic transmission jobs has changed, the client application 351 updates the content of message 912.
Link 913 is a link to the history of automatic transmission jobs. If the recognition confidence of a scanned document is sufficiently high, the document is sent to the transmission destination without a matching check by the user. In that case, the scan job that the user submitted does not appear in the list of the scan job list UI 901, so the user can confirm it through message 912 instead. If necessary, the user can also check the processing completion status of the scan job through the history link 913.
Button 914 is a button for dismissing the message. After checking the number and completion status of the automatic transmission jobs through message 912 or link 913, the user selects button 914 to dismiss the automatic transmission job notification message.

Next, in step S1117, the backend application 331 notifies the notification center provided by the OS (operating system) or the like of the client terminal 121, as new jobs, of the scanned documents whose recognition confidence was determined to be "high" or "medium-low" in step S1105, or for which the determination in step S1103 was "No".
Next, in step S1118, the backend application 331 notifies the notification center, as automatic transmission jobs, of the scanned documents whose recognition confidence was determined to be "very high" in step S1105.

The method of displaying new job notifications and automatic transmission job notifications in the notification center provided by the OS or the like of the client terminal 121 is described below with reference to FIG. 10.
In FIG. 10, the notification center UI 1000 contains a new job notification 1001 and an automatic transmission job notification 1002. When the backend application 331 notifies the notification center of new jobs and automatic transmission jobs in steps S1117 and S1118, the new job notification 1001 and the automatic transmission job notification 1002 are displayed.
The notification messages described above with reference to FIG. 9 can be seen only inside the client application 351. By also displaying notification messages in the notification center provided by the OS or the like, a so-called toast notification can be shown to draw the user's attention. For example, after submitting five scan jobs, the user can clearly see that three new jobs have arrived and that two jobs were sent automatically, which improves the visibility of automatic transmission jobs.
The new job notification 1001 and the automatic transmission job notification 1002 can be closed by selecting the close button after the user has checked them. When the user selects the automatic transmission job notification 1002, it acts as a link to the history of automatic transmission jobs, in the same way as link 913 in FIG. 9.
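The split between new job notifications and automatic transmission job notifications can be sketched as a simple count over the recognition results. The function name `summarize_notifications` and the tuple layout are assumptions made for illustration; the grouping rule follows steps S1115 to S1118.

```python
from typing import Dict, Iterable, Optional, Tuple

# Each result is (template_matched, confidence), where confidence is
# "very high", "high", "medium-low", or None when no template matched.
Result = Tuple[bool, Optional[str]]


def summarize_notifications(results: Iterable[Result]) -> Dict[str, int]:
    """Count documents per notification type (hypothetical helper)."""
    summary = {"new": 0, "auto_sent": 0}
    for template_matched, confidence in results:
        if template_matched and confidence == "very high":
            # Notified as an automatic transmission job (S1116, S1118).
            summary["auto_sent"] += 1
        else:
            # "high", "medium-low", or no match: new job (S1115, S1117).
            summary["new"] += 1
    return summary


# The five-job example from the text: three new jobs, two sent automatically.
jobs = [
    (True, "very high"),
    (True, "high"),
    (True, "medium-low"),
    (False, None),
    (True, "very high"),
]
print(summarize_notifications(jobs))  # {'new': 3, 'auto_sent': 2}
```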

Next, the processing performed when the retention period of a scan job has passed is described with reference to the flowchart of FIG. 12. This processing is performed by the backend application 331 at a predetermined interval.
First, in step S1201, the backend application 331 acquires the creation date of the scan job of an unprocessed document [n] stored in the job queue 323.
Next, in step S1202, the backend application 331 compares the creation date acquired in step S1201 with the current date and time and determines whether the job retention period (for example, 7 days) has passed. If the backend application 331 determines that the retention period has passed, the process proceeds to step S1203. If it determines that the retention period has not passed, this processing ends.

Next, in step S1203, the backend application 331 acquires the recognition confidence of document [n] from the result storage unit 325.
Next, in step S1204, the backend application 331 evaluates the recognition confidence. If the recognition confidence is "high", the process proceeds to step S1205. If the recognition confidence is "medium-low", the process proceeds to step S1206.
In step S1205, the backend application 331 sends document [n] and its metadata, via the communication unit 334, to the destination defined in the learned form template. Because the provisional metadata was already extracted in step S1110, the provisional metadata is registered as the official metadata, and document [n] and the metadata are sent to the destination.
In step S1206, the backend application 331 deletes document [n] from the document storage unit 322.
In step S1207, the backend application 331 dequeues document [n] from the job queue 323 and ends this processing.
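The retention handling of FIG. 12 can be sketched as below. This is a rough illustration, not the patented implementation: the `Job` record, the in-memory `queue`, `store`, and `sent` containers, and the function name are hypothetical, and the sketch assumes that both branches finish with the dequeue of step S1207.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

RETENTION = timedelta(days=7)  # example retention period from the text


@dataclass
class Job:
    doc_id: str
    created: datetime
    confidence: str  # "high" or "medium-low"
    provisional_metadata: Dict[str, str]


def process_expired(job: Job, now: datetime, queue: List[Job],
                    store: Dict[str, bytes],
                    sent: List[Tuple[str, Dict[str, str]]]) -> bool:
    """Handle one unprocessed job; return True if it was expired and handled."""
    # S1201-S1202: compare the creation date with the current date and time.
    if now - job.created < RETENTION:
        return False
    # S1203-S1204: branch on the stored recognition confidence.
    if job.confidence == "high":
        # S1205: promote the provisional metadata and send it with the document.
        sent.append((job.doc_id, dict(job.provisional_metadata)))
    else:
        # S1206: delete the document from the document store.
        store.pop(job.doc_id, None)
    # S1207: remove the job from the queue.
    queue.remove(job)
    return True
```

With a 7-day retention period, a "high" job created 19 days earlier would be sent with its provisional metadata, a "medium-low" job of the same age would be deleted, and a one-day-old job would be left in the queue.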

In this embodiment, message notification by email has been described as one example of a message notification means, but notifications may instead be sent to other applications such as a business chat tool or an in-house SNS. The OCR result review and matching check UI can be provided in a markup language, such as HTML or Markdown, that the business chat tool or in-house SNS supports. Accordingly, any application, protocol, or markup language may be used as the message notification means.

As described above, the information processing system of this embodiment branches the document processing flow according to the recognition confidence of the OCR and form recognition, which makes user operations simpler and easier and improves work efficiency.

(Other embodiments)
The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiment is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

331 backend application, 332 OCR processing unit, 333 form processing unit, 335 calculation unit

Claims

an extracting means for extracting a character string by performing character recognition on a scanned document containing the character string;
determining means for determining whether or not the scanned document matches a pre-stored learned form;
a calculating means for calculating a certainty factor of the scanned document;
a control means for registering the character string extracted by the extracting means as metadata when the determining means determines that the scanned document matches the learned form and the certainty factor calculated by the calculating means is equal to or greater than a first threshold;
An information processing device comprising:

When the determination means determines that the scanned document matches the learned form and the certainty factor calculated by the calculation means is smaller than the first threshold and equal to or greater than a second threshold, the control means notifies a client terminal of information about the result of the character recognition;
The information processing apparatus according to claim 1, characterized by:

When the determination means determines that the scanned document matches the learned form and the certainty factor calculated by the calculation means is smaller than the first threshold and equal to or greater than the second threshold, the information about the result of the character recognition notified by the control means is notified by e-mail;
The information processing apparatus according to claim 2, characterized by:

When the determination means determines that the scanned document matches the learned form and the certainty factor calculated by the calculation means is smaller than the first threshold and equal to or greater than the second threshold, the information about the result of the character recognition notified by the control means is notified by a business chat or an in-house SNS;
The information processing apparatus according to claim 2, characterized by:

The notified information about the result of the character recognition includes information on the character string extracted by the extracting means, a link for registering the character string extracted by the extracting means as metadata, and a link for proofreading the character string extracted by the extracting means;
The information processing apparatus according to any one of claims 2 to 4, characterized by:

notifying the client terminal of a link for proofreading the character string extracted by the extraction means when the confidence calculated by the calculation means is smaller than the second threshold;
The information processing apparatus according to any one of claims 2 to 5, characterized by:

When the certainty factor calculated by the calculating means is smaller than the first threshold and equal to or greater than the second threshold, and a predetermined period of time has passed while the scanned document is held, the control means registers the character string extracted by the extracting means as metadata;
The information processing apparatus according to any one of claims 2 to 6, characterized by:

The calculating means calculates the certainty factor of the scanned document based on the certainty factor of the result of the character recognition by the extracting means;
The information processing apparatus according to any one of claims 1 to 7, characterized by:

The calculating means calculates the certainty factor of the scanned document based on the certainty factor of the result of the character recognition by the extracting means and on the similarity to the learned form determined by the determining means;
The information processing apparatus according to any one of claims 1 to 7, characterized by:

notifying a client terminal of information for learning the scanned document as a form when the determination means determines that the scanned document does not match a pre-stored learned form;
The information processing apparatus according to any one of claims 1 to 9, characterized by:

the control means notifying the client terminal of information about the scanned document whose certainty is equal to or greater than the first threshold;
The information processing apparatus according to any one of claims 1 to 10, characterized by:

the control means notifying the client terminal of information for displaying a link to information on the scanned document whose certainty is equal to or greater than the first threshold;
The information processing apparatus according to any one of claims 1 to 10, characterized by:

the control means notifying a notification center of the client terminal of information for displaying, on the client terminal, information relating to the scanned document whose certainty is equal to or greater than the first threshold;
The information processing apparatus according to any one of claims 1 to 10, characterized by:

the information about the scanned document includes a link to the information about the scanned document;
The information processing apparatus according to claim 13, characterized by:

When it is determined that the scanned document matches the learned form and the certainty factor calculated by the calculation means is equal to or greater than the first threshold, the control means registers the character string as metadata and transmits the scanned document and the metadata to a destination registered in the learned form;
The information processing apparatus according to any one of claims 1 to 14, characterized by:

an extracting means for extracting a character string by performing character recognition on a scanned document containing the character string;
a notification means for performing, by at least one of e-mail, business chat, and in-house SNS, a notification that includes information on the character string extracted by the extraction means, a link for registering the character string extracted by the extraction means as metadata, and a link for proofreading the character string extracted by the extraction means;
The information processing apparatus according to claim 1, further comprising:

an extraction step of extracting the character string by performing character recognition on the scanned document containing the character string;
a determination step of determining whether or not the scanned document matches a pre-stored learned form;
a calculation step of calculating a certainty factor of the scanned document;
a control step of registering the character string extracted in the extraction step as metadata when the determination step determines that the scanned document matches the learned form and the certainty factor calculated in the calculation step is equal to or greater than a first threshold;
A control method for an information processing device, comprising:

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 16.