JP2021086188A

JP2021086188A - Image processing system, image processing method, and program

Info

Publication number: JP2021086188A
Application number: JP2019212299A
Authority: JP
Inventors: 秀和世渡; Hidekazu Seto
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2019-11-25
Filing date: 2019-11-25
Publication date: 2021-06-03

Abstract

To reduce the possibility of erroneously determining that a character string in image data has been altered. SOLUTION: An image processing system includes: generation means for generating a learning model by performing machine learning processing based on an altered character image, a character image before alteration, and an image representing the difference between the altered character image and the character image before alteration; input means for inputting image data; and estimation means for estimating, by using the learning model generated by the generation means, whether the image data input by the input means includes an altered character. SELECTED DRAWING: Figure 1

Description

The present disclosure relates to an image processing system, an image processing method, and a program.

Techniques for detecting tampering in image data are conventionally known.

Patent Document 1 discloses an image processing device that divides image data into a plurality of groups of pixels having similar luminance values and determines that the image data has been tampered with if the result of executing character recognition processing for each group differs from the result of executing character recognition processing without dividing the image data into groups.

Japanese Unexamined Patent Publication No. 2009-200794

In Patent Document 1, the presence or absence of tampering is determined from luminance alone. Because the luminance of a character can vary with faint ink, writing pressure, and the like, a document that has not been tampered with may be erroneously determined to have been tampered with.

The present invention has been made in view of the above problem, and an object of the present invention is to reduce the possibility of erroneous determination of tampering with a character string in image data.

An image processing system of the present invention comprises: generation means for generating a learning model by executing machine learning processing based on a tampered image, an image before tampering, and an image showing the difference between the tampered image and the image before tampering;
input means for inputting image data; and
estimation means for estimating, by using the learning model generated by the generation means, whether or not the image data input by the input means includes a tampered image.

According to the present invention, the probability of an erroneous determination when detecting tampering with a character string in image data can be reduced.

A schematic diagram showing an example of the configuration of an image processing system according to an embodiment.
(A) A block diagram showing an example of the configuration of the image processing device. (B) A block diagram showing an example of the configuration of the learning device. (C) A block diagram showing an example of the configuration of the tampering detection server.
(A) A sequence diagram showing an example of a schematic flow of processing in the learning stage in the image processing system according to an embodiment. (B) A sequence diagram showing an example of a schematic flow of processing in the tampering detection stage in the image processing system according to an embodiment.
A schematic diagram showing an example of a blank learning document.
A schematic diagram showing an example of a GUI for accepting a document reading instruction.
(A) A schematic diagram showing an example of an original learning image. (B) A schematic diagram showing an example of a tampered learning image. (C) A schematic diagram showing an example of learning data generated from the original learning image and the tampered learning image.
(A) A schematic diagram showing an example of a processing target image. (B) A schematic diagram showing an example of a bitmap of a tampering detection result. (C) A schematic diagram showing an example of an emphasized image in which the tampered portion is emphasized.
A flowchart showing an example of a specific flow of processing executed by the image processing device in the tampering detection stage.
A flowchart showing an example of a specific flow of processing executed by the tampering detection server in the tampering detection stage.
A schematic diagram showing an example of a GUI for configuring and instructing tampering detection.
(A) A schematic diagram showing an example of a GUI for displaying a list of tampering detection results. (B) A schematic diagram showing an example of a GUI for displaying details of a tampering detection result.
A schematic diagram showing an example of a GUI that allows a user to correct a tampering detection result.
(A) A schematic diagram showing a first example of a GUI in which an emphasized image and a comparison image are arranged side by side for comparison. (B) A schematic diagram showing a second example of a GUI in which an emphasized image and a comparison image are arranged side by side for comparison.
(A) A schematic diagram showing an example of a comparison image in a first comparison mode. (B) A schematic diagram showing an example of a comparison image in a second comparison mode. (C) A schematic diagram showing an example of a comparison image in a third comparison mode. (D) A schematic diagram showing an example of a comparison image in a fourth comparison mode.
A schematic diagram showing another example of a GUI that allows a user to correct a tampering detection result.
A flowchart showing an example of a specific flow of display control processing executed by the image processing device when tampering with a target document is detected.
A schematic diagram showing an example of a GUI in which an emphasized image and a comparison image are arranged side by side for comparison in a modification.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings. The following embodiments do not limit the invention according to the claims. Although a plurality of features are described in the embodiments, not all of these features are essential to the invention, and the features may be combined arbitrarily. Further, in the accompanying drawings, identical or similar components are given the same reference numerals, and duplicate descriptions are omitted.

<< 1. System overview >>
FIG. 1 is a schematic diagram showing an example of the configuration of an image processing system 100 according to an embodiment. The image processing system 100 includes an image processing device 101, a learning device 102, a tampering detection server 103, and an OCR server 104. The image processing device 101, the learning device 102, the tampering detection server 103, and the OCR server 104 are connected to one another via a network 105.

The image processing device 101 may be, for example, a multifunction peripheral (MFP) having both printing and image reading functions, or may be a digital scanner dedicated to image reading. The image processing device 101 includes a reading unit 111 and a display control unit 112. The reading unit 111 reads a document 11 and generates a scanned image; that is, the reading unit 111 acquires a scanned image of the document 11. Typically, the document 11 contains character strings, and the scanned image therefore contains character images.

As an example, the image processing device 101 can support the generation of learning data in the learning stage. Specifically, an operator first writes characters by hand on a blank learning document prepared in advance and sets the filled-in learning document on the image processing device 101. The learning document may be, for example, a form-type document having entry fields at one or more known positions. The learning document may carry visual identification information (for example, a printed number, a barcode, or a two-dimensional code) that uniquely identifies each learning document. The image processing device 101 may be capable of printing the blank learning document. The reading unit 111 reads the set learning document and generates a scanned image 12. The scanned image 12 is treated as the image of the original of the learning document and is also referred to herein as an original learning image. The operator or another person then tampers with the filled-in learning document (that is, the original), for example by adding strokes with a pen, and sets the tampered learning document on the image processing device 101. The reading unit 111 reads the tampered learning document and generates a scanned image 13, which is also referred to herein as a tampered learning image. By repeating this cycle of writing characters on a learning document, reading the original with the image processing device 101, intentionally tampering with the learning document, and reading the tampered version with the image processing device 101, a plurality of pairs of an original learning image 12 and a tampered learning image 13 are generated. The image processing device 101 transmits these pairs to the learning device 102 via the network 105. As described later, the learning device 102 executes machine learning using learning data generated from these pairs. Notwithstanding the above, the original learning image 12 and the tampered learning image 13 may be generated by a device other than the image processing device 101.
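The pairing of an original learning image with a tampered learning image can be illustrated with a minimal sketch. It assumes, purely for illustration, that both scans are already aligned grayscale images of identical size and that tampering only adds darker strokes; the names `difference_mask` and `make_training_sample` and the threshold value are hypothetical and do not come from the disclosure.

```python
from typing import Dict, List

Image = List[List[int]]  # grayscale rows; 0 = black ink, 255 = white paper


def difference_mask(original: Image, tampered: Image, threshold: int = 64) -> Image:
    """Mark pixels that became noticeably darker in the tampered scan.

    Assumption: the two scans are pixel-aligned and have the same size.
    """
    if len(original) != len(tampered) or any(
        len(row_o) != len(row_t) for row_o, row_t in zip(original, tampered)
    ):
        raise ValueError("images must have identical dimensions")
    return [
        [1 if (o - t) > threshold else 0 for o, t in zip(row_o, row_t)]
        for row_o, row_t in zip(original, tampered)
    ]


def make_training_sample(original: Image, tampered: Image) -> Dict[str, Image]:
    """Bundle one pair and its difference image (hypothetical sample layout)."""
    return {
        "original": original,
        "tampered": tampered,
        "difference": difference_mask(original, tampered),
    }
```

In practice, scans of the same sheet taken at different times would likely need registration and noise handling before such a difference is meaningful; the sketch omits both.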

In the tampering detection stage, the image processing device 101 reads a target document that may contain handwritten characters and generates a scanned image 21, which is also referred to herein as a processing target image. The image processing device 101 transmits the generated processing target image 21 to the tampering detection server 103 via the network 105. The display control unit 112 of the image processing device 101 receives, from the tampering detection server 103, detection result data 32 indicating the result of the tampering detection performed using the processing target image 21. The display control unit 112 then controls the on-screen display of the tampering detection result based on the detection result data 32. Various examples of this display control are described in detail later.

The learning device 102 may be an information processing device, such as a computer or a workstation, that executes supervised learning processing. The learning device 102 includes a data processing unit 121, a learning unit 122, and a storage unit 123. The data processing unit 121, for example, accumulates in the storage unit 123 the pairs of the original learning image 12 and the tampered learning image 13 generated by the image processing device 101 (or another device), and generates learning data from the accumulated pairs. The learning unit 122 generates or updates a trained model (learning model) 41 for tampering detection through machine learning processing that uses learning data generated from learning images, which are scanned images of learning documents (that is, the pairs of the original learning image 12 and the tampered learning image 13). The learning unit 122 stores the generated or updated trained model 41 in the storage unit 123. For example, when a neural network model is used as the machine learning model, the trained model 41 is a data set containing parameters such as the weight and bias of each node of the neural network. Deep learning using a multilayer neural network may be used, for example, as the machine learning technique for generating or updating the neural network model. Several examples of generating learning data and generating or updating the trained model in the learning device 102 are described in detail later. The learning unit 122 provides the trained model 41 to the tampering detection server 103 in response to a request from the tampering detection server 103, which is described later.
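To make concrete the idea that a trained model is a data set of per-node weights and biases, the following sketch stores such parameters in a plain dictionary, serializes them, and evaluates a small fully connected network. The dictionary layout, the sigmoid activation, and the use of JSON are assumptions chosen for illustration; the disclosure does not specify a storage format or a network architecture.

```python
import json
import math
from typing import Dict, List


def sigmoid(x: float) -> float:
    """Logistic activation (an illustrative choice, not from the disclosure)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(model: Dict[str, list], inputs: List[float]) -> List[float]:
    """Evaluate a fully connected network described only by weights and biases."""
    activations = list(inputs)
    for layer in model["layers"]:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in zip(layer["weights"], layer["biases"])
        ]
    return activations


def save_model(model: Dict[str, list]) -> str:
    """Serialize the parameter set so it can be stored or sent to another device."""
    return json.dumps(model)


def load_model(text: str) -> Dict[str, list]:
    """Restore a parameter set produced by save_model."""
    return json.loads(text)
```

A parameter set of this kind is all that needs to travel from the learning device to the device that performs estimation, provided both sides agree on the network structure.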

The tampering detection server 103 may be an information processing device, such as a computer or a workstation, that detects a tampered portion contained in a target document using the processing target image 21 received from the image processing device 101. The tampering detection server 103 includes an image acquisition unit 131 and a detection unit 132. The image acquisition unit 131 acquires the processing target image 21, which is a scanned image of the target document. The detection unit 132 detects a tampered portion contained in the target document using the processing target image 21. In the present embodiment, the detection unit 132 uses the above-described trained model 41 provided by the learning device 102 for tampering detection. In particular, in the present embodiment, the detection unit 132 uses the trained model 41 to estimate whether each of a plurality of pixels in the processing target image 21 belongs to a tampered portion (that is, pixel-by-pixel tampering detection). A modification in which the detection unit 132 determines whether each of one or more characters in the processing target image 21 contains a tampered portion (that is, character-by-character tampering detection) is also described later. As the result of tampering detection, the detection unit 132 generates detection result data 32 indicating which portion of the processing target image 21 has been determined to be tampered with, and provides the generated detection result data 32 to the image processing device 101. For example, the detection result data 32 may include a bitmap indicating whether each pixel of the processing target image 21 belongs to a tampered portion. The tampering detection result indicated by the detection result data 32 is presented to the user by the image processing device 101, and the user verifies its validity. To assist the user in this verification, the image processing device 101 or the tampering detection server 103 generates an emphasized image that highlights the pixels of the processing target image 21 determined to belong to a tampered portion. When the tampering detection server 103 generates the emphasized image, it transmits the generated emphasized image to the image processing device 101 together with the bitmap.
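The relationship between the per-pixel bitmap and the emphasized image can be sketched as follows. This is a minimal illustration assuming a grayscale processing target image and a bitmap of the same size in which 1 marks a pixel judged to belong to a tampered portion; the function name `emphasize` and the choice of solid red as the highlight are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[int]]
Bitmap = List[List[int]]
Color = Tuple[int, int, int]


def emphasize(image: Gray, bitmap: Bitmap, color: Color = (255, 0, 0)) -> List[List[Color]]:
    """Return an RGB image in which flagged pixels are painted in `color`.

    Unflagged pixels keep their gray level, expanded to an (r, g, b) triple.
    """
    if len(image) != len(bitmap) or any(
        len(row_i) != len(row_b) for row_i, row_b in zip(image, bitmap)
    ):
        raise ValueError("image and bitmap must have identical dimensions")
    return [
        [color if flag else (value, value, value) for value, flag in zip(row_i, row_b)]
        for row_i, row_b in zip(image, bitmap)
    ]
```

Because the function depends only on the image and the bitmap, it could run on either the server side or the client side, which matches the two options described above.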

In the present embodiment, the processing target image 21 may be applied to the trained model 41 character by character. To this end, the tampering detection server 103 transmits the processing target image 21 to the OCR server 104 and requests the OCR server 104 to recognize the characters contained in the processing target image 21. The OCR server 104 may be an information processing device, such as a computer or a workstation, that executes optical character recognition (OCR) in response to a request from the tampering detection server 103. The OCR server 104 includes a character recognition unit 141. The character recognition unit 141 executes OCR on the processing target image 21 by a known technique to recognize the characters in the processing target image 21 and the positions of the character regions. The character recognition unit 141 then transmits recognition result data 31 indicating the recognition result to the tampering detection server 103.
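The character-by-character application described above can be sketched as a crop, estimate, and paste loop. The sketch assumes that the recognition result supplies an axis-aligned rectangle per character and that the estimator returns a mask the same size as its input; `CharRegion`, `crop`, and `detect_per_character` are hypothetical names, and the `estimate` callable is a stand-in for the trained model rather than an actual implementation of it.

```python
from typing import Callable, List, NamedTuple

Image = List[List[int]]


class CharRegion(NamedTuple):
    """One recognized character and its bounding box (hypothetical layout)."""
    char: str
    x: int
    y: int
    width: int
    height: int


def crop(image: Image, region: CharRegion) -> Image:
    """Cut the character region out of the page image."""
    rows = image[region.y:region.y + region.height]
    return [row[region.x:region.x + region.width] for row in rows]


def detect_per_character(
    image: Image,
    regions: List[CharRegion],
    estimate: Callable[[Image], Image],
) -> Image:
    """Run the estimator on each character crop and merge the masks into a page bitmap."""
    height = len(image)
    width = len(image[0]) if image else 0
    bitmap = [[0] * width for _ in range(height)]
    for region in regions:
        mask = estimate(crop(image, region))
        for dy, mask_row in enumerate(mask):
            for dx, flag in enumerate(mask_row):
                if flag:
                    bitmap[region.y + dy][region.x + dx] = 1
    return bitmap
```

Pixels outside every character region stay 0 in the merged bitmap, so background noise between characters is never submitted to the estimator.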

<< 2. Device configuration >>
FIG. 2(A) is a block diagram showing an example of the configuration of the image processing device 101. FIG. 2(B) is a block diagram showing an example of the configuration of the learning device 102. FIG. 2(C) is a block diagram showing an example of the configuration of the tampering detection server 103.

(1) Image processing device
The image processing device 101 includes a CPU 201, a ROM 202, a RAM 204, a printer device 205, a scanner device 206, a transport device 207, a storage 208, an input device 209, a display device 210, and an external I/F 211. A data bus 203 is a communication line that interconnects these devices within the image processing device 101.

The CPU (Central Processing Unit) 201 is a controller that controls the image processing device 101 as a whole. The CPU 201 executes a boot program stored in advance in the ROM 202, which is a non-volatile memory, to start the OS (Operating System) of the image processing device 101. On this OS, the CPU 201 executes a controller program stored in the storage 208. The controller program is a program for controlling each of the devices of the image processing device 101. The RAM 204 is used as the main storage device of the CPU 201 and provides the CPU 201 with a temporary storage area (that is, a work area).

The printer device 205 is a device that prints an image on paper (also referred to as a recording material or a sheet). The printer device 205 may adopt any printing method, such as an electrophotographic method using a photosensitive drum or a photosensitive belt, or an inkjet method that ejects ink from an array of minute nozzles to print an image directly on the paper. The scanner device 206 includes an optical reading device, such as a CCD (Charge Coupled Device), that optically scans a document, and converts the electrical signal from the optical reading device into image data of a scanned image. The transport device 207 may include a so-called ADF (Auto Document Feeder) and conveys the documents set in the ADF one sheet at a time to the scanner device 206. In addition to reading documents conveyed from the transport device 207, the scanner device 206 may be capable of reading a document placed on a document table (not shown) of the image processing device 101.

The storage 208 may be a writable and readable auxiliary storage device including non-volatile memory, such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive). The storage 208 stores various data such as the above-described controller program, setting data, and image data. The input device 209 includes, for example, a touch panel or hard keys, and accepts user input such as operation instructions or information input from the user. The input device 209 transmits an input signal representing the content of the accepted user input to the CPU 201. The display device 210 includes, for example, an LCD (Liquid Crystal Display) or a CRT (Cathode Ray Tube), and displays an image generated by the CPU 201 (for example, a user interface image) on its screen. For example, the CPU 201 may determine what operation the user has performed based on the pointing position indicated by the input signal received from the input device 209 and the layout of the user interface displayed by the display device 210. In accordance with the result of this determination, the CPU 201 controls the operation of the corresponding device or changes the content displayed by the display device 210.
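Determining the user's operation from a pointing position and the user interface layout amounts to a hit test. The following minimal sketch assumes rectangular widgets and a layout list ordered from bottom to top; the names `Widget` and `resolve_operation`, and the widget names used in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Widget:
    """A rectangular on-screen control (hypothetical layout entry)."""
    name: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        """True if the point lies inside the widget (right and bottom edges excluded)."""
        return self.x <= px < self.x + self.width and self.y <= py < self.y + self.height


def resolve_operation(layout: List[Widget], px: int, py: int) -> Optional[str]:
    """Return the name of the topmost widget under the pointing position, if any."""
    for widget in reversed(layout):  # later entries are assumed to be drawn on top
        if widget.contains(px, py):
            return widget.name
    return None
```

For example, with a layout containing a `scan_button` at (10, 10) and a `cancel_button` at (120, 10), each 100 by 40 pixels, a touch at (15, 20) resolves to `scan_button`, and a touch in the gap between them resolves to no operation.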

The external interface (I/F) 211 transmits and receives various data, including image data, to and from external devices via the network 105. The network 105 may be any type of network, such as a LAN (Local Area Network), a telephone line, or a short-range wireless (for example, infrared) network. The external I/F 211 can receive, from an external device such as the learning device 102 or a PC (not shown), PDL (Page Description Language) data describing the content to be rendered for printing. The CPU 201 interprets the PDL data received by the external I/F 211 and generates image data. The image data may be sent to the printer device 205 for printing or to the storage 208 for storage. The external I/F 211 can also transmit the image data of a scanned image acquired by the scanner device 206 to the tampering detection server 103 for tampering detection and receive the detection result data from the tampering detection server 103.

(2) Learning device
The learning device 102 includes a CPU 231, a ROM 232, a RAM 234, a storage 235, an input device 236, a display device 237, an external I/F 238, and a GPU 239. A data bus 233 is a communication line that interconnects these devices within the learning device 102.

The CPU 231 is a controller that controls the learning device 102 as a whole. The CPU 231 executes a boot program stored in advance in the ROM 232, which is a non-volatile memory, to start the OS of the learning device 102. On this OS, the CPU 231 executes a learning data generation program and a learning program stored in the storage 235. The learning data generation program is a program for generating learning data from pairs of the original learning image 12 and the tampered learning image 13. The learning program is a program for generating or updating, through machine learning, a trained model (for example, a neural network model) for tampering detection. The RAM 234 is used as the main storage device of the CPU 231 and provides the CPU 231 with a temporary storage area (that is, a work area).

The storage 235 may be a writable and readable auxiliary storage device including non-volatile memory, such as an HDD or an SSD. The storage 235 stores various data such as the above-described programs, learning images, learning data, and model data. The input device 236 includes, for example, a mouse and a keyboard, and accepts user input such as operation instructions or information input from the user. The display device 237 includes, for example, an LCD or a CRT, and displays an image generated by the CPU 231 on its screen. The external I/F 238 transmits and receives data related to the learning processing to and from external devices via the network 105. The external I/F 238 can, for example, receive pairs of the original learning image 12 and the tampered learning image 13 from the image processing device 101. The external I/F 238 can also transmit the trained model 41 generated or updated through machine learning to the tampering detection server 103. The GPU 239 is a processor with a high degree of parallel processing capability and, in cooperation with the CPU 231, accelerates the learning processing for generating or updating the trained model.

(3) Tampering detection server
The tampering detection server 103 includes a CPU 261, a ROM 262, a RAM 264, a storage 265, an input device 266, a display device 267, and an external I/F 268. A data bus 263 is a communication line that interconnects these devices within the tampering detection server 103.

The CPU 261 is a controller that controls the tampering detection server 103 as a whole. The CPU 261 starts the OS of the tampering detection server 103 by executing a boot program stored in advance in the ROM 262, which is a non-volatile memory. On this OS, the CPU 261 executes a tampering detection program stored in the storage 265. The tampering detection program is a program for detecting a tampered portion contained in a target document using a scanned image of the target document (that is, a processing target image) acquired from a client device (for example, the image processing device 101). The RAM 264 is used as the main storage device of the CPU 261 and provides the CPU 261 with a temporary storage area (that is, a work area).

The storage 265 may be a writable and readable auxiliary storage device that includes non-volatile memory, such as an HDD or an SSD. The storage 265 stores various data such as the above-described program, image data, and detection result data. The input device 266 includes, for example, a mouse and a keyboard, and accepts user input such as operation instructions or information input from a user. The display device 267 includes, for example, an LCD or a CRT, and displays images generated by the CPU 261 on its screen. The external I/F 268 transmits and receives data related to tampering detection to and from external devices via the network 105. For example, the external I/F 268 may receive the processing target image 21 from the image processing device 101 and transmit the detection result data 32 to the image processing device 101. The external I/F 268 may also transmit a request for the trained model 41 to the learning device 102 and receive the trained model 41 from the learning device 102. In addition, the external I/F 268 may transmit a request for OCR execution to the OCR server 104 and receive recognition result data 31 indicating the OCR result from the OCR server 104.

Although not shown in FIG. 2, the OCR server 104 may have the same configuration as the tampering detection server 103.

<< 3. Process flow >>
FIG. 3(A) is a sequence diagram showing an example of the schematic flow of processing in the learning stage in the image processing system 100. FIG. 3(B) is a sequence diagram showing an example of the schematic flow of processing in the tampering detection stage in the image processing system 100. In the following description, a processing step is abbreviated as S.

<3-1. Learning stage>
In the learning stage, first, in S301, an operator sets a learning document with handwritten characters on the image processing device 101 and instructs the image processing device 101 to read the document. At this time, the operator inputs to the image processing device 101, via the input device 209, that the set learning document is an original (that is, not tampered with). In S302, the reading unit 111 of the image processing device 101 reads the set learning document in response to the operator's instruction and generates a scanned image 12. The reading unit 111 also attaches to the scanned image 12 a flag indicating that it is a learning original image. Next, in S303, the operator sets a tampered version of the learning document on the image processing device 101 and instructs the device to read it. At this time, the operator inputs via the input device 209 that the set learning document contains a tampered portion. In S304, the reading unit 111 reads the set learning document in response to the operator's instruction and generates a scanned image 13. The reading unit 111 also attaches to the scanned image 13 a flag indicating that it is a learning tampered image. By reading the identification information carried by the learning document in S302 and S304, the reading unit 111 recognizes that the learning original image 12 and the learning tampered image 13 are a pair consisting of the original and the tampered version of the same learning document. The reading unit 111 then associates a document ID identifying the recognized learning document with the learning original image 12 and the learning tampered image 13. The reading unit 111 further associates a dataset ID, which identifies the unit in which a trained model is generated or updated, with the learning original image 12 and the learning tampered image 13.

As an example, when one trained model is generated or updated per image processing device, the dataset ID may be an identifier that uniquely identifies the image processing device 101. As another example, when one trained model is generated or updated per user, the dataset ID may be an identifier that uniquely identifies each user. As yet another example, when one trained model is generated or updated per user group, the dataset ID may be an identifier that uniquely identifies each user group. In S305, the reading unit 111 transmits the learning images generated in this way and the related data (that is, the learning original image 12, the learning tampered image 13, the flags, the document ID, and the dataset ID) to the learning device 102.
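The data attached to each learning image in S302 to S305 can be pictured as a small record. The following is a minimal sketch only: the names `ImageKind`, `LearningImage`, and `pair_by_document` are hypothetical and do not appear in the disclosure, and the image itself is reduced to an opaque `bytes` placeholder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class ImageKind(Enum):
    """Flag attached by the reading unit (hypothetical representation)."""
    ORIGINAL = "original"   # learning original image, S302
    TAMPERED = "tampered"   # learning tampered image, S304


@dataclass(frozen=True)
class LearningImage:
    pixels: bytes        # placeholder for the scanned image data
    kind: ImageKind      # original / tampered flag
    document_id: str     # read from the identification information on the sheet
    dataset_id: str      # device, user, or user-group identifier


PairKey = Tuple[str, str]  # (dataset_id, document_id)


def pair_by_document(
    images: List[LearningImage],
) -> Dict[PairKey, Dict[ImageKind, LearningImage]]:
    """Group images so that the original and tampered versions of the
    same learning document, within the same dataset, end up together."""
    pairs: Dict[PairKey, Dict[ImageKind, LearningImage]] = {}
    for image in images:
        key = (image.dataset_id, image.document_id)
        pairs.setdefault(key, {})[image.kind] = image
    return pairs
```

Keying on both identifiers reflects the description: the document ID ties an original to its tampered version, while the dataset ID determines which trained model the pair contributes to.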

FIG. 4 is a schematic diagram showing an example of a blank learning document. In the example of FIG. 4, the learning document 401 is a form-type document with eight character entry fields 402. The operator can create the original of a learning document by writing arbitrary characters in each character entry field 402. The learning document 401 has identification information 403 in its upper right corner. The identification information (also called embedded information) 403 visually represents, as a two-dimensional code, a document ID that uniquely identifies the learning document 401. For the learning document identified by each document ID, the positions and sizes of its character entry fields are defined in advance and shared with the learning device 102. The document ID may be, for example, a UUID (Universally Unique Identifier).
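As an illustration of how a predefined layout keyed by document ID could be used to locate the character entry fields, here is a small sketch. The registry `FORM_LAYOUTS`, the `(left, top, width, height)` box convention, and the function names are assumptions made for this example; the disclosure only states that field positions and sizes are defined in advance and shared with the learning device.

```python
import uuid
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, width, height) in pixels (assumed convention)
Image = List[List[int]]           # grayscale image as rows of pixel values

# Hypothetical registry: document ID -> boxes of the character entry fields.
FORM_LAYOUTS: Dict[str, List[Box]] = {}


def register_form(boxes: List[Box]) -> str:
    """Issue a UUID-based document ID and record the form layout under it."""
    document_id = str(uuid.uuid4())
    FORM_LAYOUTS[document_id] = list(boxes)
    return document_id


def crop(image: Image, box: Box) -> Image:
    """Cut the partial image covered by one entry field out of a scanned image."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def crop_fields(image: Image, document_id: str) -> List[Image]:
    """Cut out every entry field defined for the given document ID."""
    return [crop(image, box) for box in FORM_LAYOUTS[document_id]]
```

Decoding the two-dimensional code itself is outside the scope of this sketch; `crop_fields` assumes the document ID has already been read from the scan.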

FIG. 5 is a schematic diagram showing an example of a GUI (Graphical User Interface) through which the image processing device 101 accepts document reading instructions. The operation window 500 shown in FIG. 5 may be displayed, for example, on the screen of the display device 210. The operation window 500 has a preview area 501, type designation buttons 502 and 503, a scan button 505, and a transmission start button 506. The type designation buttons 502 and 503 let the operator specify the type (also called the attribute) of the document that is about to be read or has been read, namely whether it is an original or a tampered version. When the type designation button 502 is operated (for example, tapped), the reading unit 111 attaches to the scanned image a flag indicating that it is a learning tampered image. When the type designation button 503 is operated, the reading unit 111 attaches to the scanned image a flag indicating that it is a learning original image. In the operation window 500, the button corresponding to the currently specified type (original or tampered) may be highlighted.

The scan button 505 triggers reading of the document set on the image processing device 101. When the scan button 505 is operated and the scan completes, a preview of the scanned image is displayed in the preview area 501. Before data transmission starts, the operator may set another document on the image processing device 101 and operate the scan button 505 again, so that the image processing device 101 holds multiple scanned images and their related data together. Once at least one learning document has been scanned and its type has been specified, the transmission start button 506 becomes operable. The transmission start button 506 triggers transmission of the scanned images and related data. When the transmission start button 506 is operated, the reading unit 111 transmits the learning images and related data to the learning device 102 (see S305 in FIG. 3).

In S305 of FIG. 3(A), the data processing unit 121 of the learning device 102 receives the learning images and related data from the image processing device 101. In S306, the data processing unit 121 stores the received learning images and related data in the storage unit 123. Once a sufficient amount of data for learning has accumulated, the data processing unit 121 and the learning unit 122 start processing for machine learning. First, in S307, the data processing unit 121 reads a pair of a learning original image 12 and a learning tampered image 13 from the storage unit 123 and generates learning data from the images read.

FIG. 6 is an explanatory diagram illustrating how learning data is generated from learning images. FIGS. 6(A) and 6(B) show examples of the learning original image 12 and the learning tampered image 13, respectively. As described above, the learning original image 12 and the learning tampered image 13 that form a pair have character entry fields in common. In the example of FIG. 6, the numeral "1" is written in the character entry field 402a of the learning original image 12. In contrast, the content of the character entry field 402b of the learning tampered image 13 has been tampered with: strokes have been added so that the field appears to show the numeral "4". The data processing unit 121 cuts out the partial image of each character entry field region from the learning original image 12 and the learning tampered image 13, and calculates a difference image between the partial images cut out from the same character entry field. The data processing unit 121 can determine the position of each character entry field to cut out from the document ID obtained by reading the identification information 403.

In this difference image, pixels whose absolute value is zero or no greater than a predetermined threshold are assumed not to belong to a tampered portion, and pixels whose absolute value exceeds the threshold are assumed to belong to a tampered portion. The character region image 611 shown in FIG. 6(C) is the partial image cut out from the character entry field 402b of the learning tampered image 13. The character region image 611 serves as the learning input image that the learning unit 122 feeds to the machine learning model. The binary image 612 shown in FIG. 6(C) is generated by subtracting, from the character region image 611, the partial image cut out from the character entry field 402a of the learning original image 12 (after alignment where necessary), and then binarizing the result with the above threshold. In the binary image 612, pixels indicating true (white in the figure) belong to the tampered portion, and pixels indicating false (black in the figure) do not. The binary image 612 serves as the teacher image that the learning unit 122 treats as teacher data for machine learning.
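The difference-and-threshold step that produces the teacher image can be sketched as follows. This is an illustrative sketch under stated assumptions: both partial images are grayscale, already aligned, and the same size; the function name `make_teacher_image` and the threshold value used in the usage note are hypothetical.

```python
from typing import List

Gray = List[List[int]]      # grayscale partial image, rows of 0-255 values
Mask = List[List[bool]]     # teacher image: True = tampered pixel


def make_teacher_image(tampered: Gray, original: Gray, threshold: int) -> Mask:
    """Binarize the absolute per-pixel difference between the tampered and
    original partial images. Pixels whose difference exceeds the threshold
    are marked True (assumed to belong to a tampered portion)."""
    same_height = len(tampered) == len(original)
    same_widths = all(len(t) == len(o) for t, o in zip(tampered, original))
    if not (same_height and same_widths):
        raise ValueError("partial images must have the same size")
    return [
        [abs(t - o) > threshold for t, o in zip(t_row, o_row)]
        for t_row, o_row in zip(tampered, original)
    ]
```

For instance, with paper at 255 and ink at 0, a stroke added at the top-left pixel is marked while small scanner noise (a difference of 5 against a threshold of 16) is ignored. A non-zero threshold is what keeps slight scan-to-scan variation from being labeled as tampering.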

Note that the learning images may include a learning original image 12 that has no corresponding learning tampered image 13. In this case, the data processing unit 121 may generate, as the learning input image, the partial image cut out from the character entry field 402a of the learning original image 12, and, as the teacher image, a binary image of the same size in which all pixels are false (that is, indicating that there is no tampered portion anywhere).

The data processing unit 121 generates a plurality of such input images and corresponding teacher images based on a plurality of pairs of learning original images 12 and learning tampered images 13 associated with the same dataset ID. Next, in S308, the learning unit 122 iteratively executes a learning process using these input images and teacher images within the scope of the same dataset ID, and generates or updates the trained model 41 for tampering detection. The trained model 41 may be, for example and without limitation, a fully convolutional network (FCN) model. One iteration of the learning process may include, for example, feeding an input image to the model, calculating the error of the output data computed by the model (with its provisional parameter values) relative to the teacher data, and adjusting the parameter values to reduce the error. Cross entropy, for example, may be used as the error measure, and backpropagation, for example, may be used to adjust the parameter values. The learning unit 122 may repeat the learning process until learning is judged to have converged or until the number of iterations reaches an upper limit.

The learning unit 122 then stores the generated or updated trained model 41 (that is, the set of model parameters constituting the trained model 41) in the storage unit 123 in association with the corresponding dataset ID. The learning unit 122 may generate or update separate trained models 41 for two or more different dataset IDs. The learning unit 122 may also update a previously generated or updated trained model 41 through additional learning using newly acquired learning images. The learning unit 122 may select the learning data to feed to the learning process by any of online learning, batch learning, or mini-batch learning.
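The shape of one learning iteration (forward pass, cross-entropy error, gradient-based parameter adjustment, and the two stopping conditions) can be illustrated with a deliberately tiny stand-in model. The sketch below is not an FCN: it is a per-pixel logistic model with two parameters, chosen only so that the loop fits in a few lines of standard-library Python. The function names, learning rate, tolerance, and the idea of a single scalar feature per pixel are all assumptions for illustration.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, int]    # (pixel feature in [0, 1], teacher label 0 or 1)
Params = Tuple[float, float]  # (weight, bias) of the stand-in model


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(params: Params, samples: List[Sample]) -> float:
    """Mean binary cross entropy between model output and teacher labels."""
    w, b = params
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in samples:
        p = sigmoid(w * x + b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(samples)


def train(samples: List[Sample], lr: float = 0.5,
          max_iters: int = 500, tol: float = 1e-6) -> Tuple[Params, float]:
    """Repeat forward pass, error calculation, and parameter adjustment until
    the loss stops improving (convergence) or max_iters is reached."""
    w, b = 0.0, 0.0
    loss = cross_entropy((w, b), samples)
    for _ in range(max_iters):
        # Gradient of the loss w.r.t. w and b; for this one-layer model the
        # chain rule (backpropagation) reduces to (p - y) * x and (p - y).
        errors = [(sigmoid(w * x + b) - y, x) for x, y in samples]
        grad_w = sum(e * x for e, x in errors) / len(samples)
        grad_b = sum(e for e, _ in errors) / len(samples)
        w -= lr * grad_w
        b -= lr * grad_b
        new_loss = cross_entropy((w, b), samples)
        converged = abs(loss - new_loss) < tol
        loss = new_loss
        if converged:
            break
    return (w, b), loss
```

In a real system the model would be a convolutional network trained with a deep learning framework, and the samples would be whole character region images with their binary teacher images; the control flow of the loop is the part this sketch is meant to show.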

<3-2. Tampering detection stage>
(1) Schematic flow
In the tampering detection stage, first, in S351, a user sets a target document on the image processing device 101 and instructs the device to read it. The user here may be the same person as the operator involved in the learning stage, or a different person. In S352, the reading unit 111 of the image processing device 101 reads the set target document in response to the user's instruction and generates a scanned image 21. In S353, the user instructs the image processing device 101 to detect tampering in the target document. In response to this instruction, in S354, the reading unit 111 attaches to the scanned image 21 a flag indicating that it is a processing target image, and acquires setting data related to tampering detection (for example, from memory). The setting data acquired here may include, for example, a dataset ID (for example, an identifier of the image processing device 101, a user, or a user group) that specifies the trained model to be used for tampering detection. Then, in S355, the reading unit 111 transmits the processing target image 21 and related data to the tampering detection server 103 together with a tampering detection request.

In S355, the image acquisition unit 131 of the tampering detection server 103 receives, together with the tampering detection request, the processing target image 21 (the scanned image of the target document) and related data from the image processing device 101. The image acquisition unit 131 outputs the received image and data to the detection unit 132. In S356, the detection unit 132 requests the learning device 102 to provide the latest trained model 41. The model request sent to the learning device 102 here may include, for example, the dataset ID. In S357, in response to the model request, the learning unit 122 of the learning device 102 reads the latest trained model 41 (specified, for example, by the dataset ID) from the storage unit 123 and transmits it to the detection unit 132.

Next, in S358, the detection unit 132 transmits the processing target image 21 to the OCR server 104 and requests it to recognize the characters in the image. In S359, in response to the OCR request, the character recognition unit 141 of the OCR server 104 executes OCR on the processing target image 21 and recognizes the characters in the image and the positions of their character regions. In S360, the character recognition unit 141 transmits recognition result data 31 indicating the recognition result to the detection unit 132. In S361, the detection unit 132 detects tampered portions in the target document by applying the processing target image 21 to the trained model 41 provided by the learning device 102. As described above, the trained model 41 is a model generated or updated through machine learning using learning images, which are scanned images of learning documents. In this tampering detection process, the detection unit 132 applies, for example, the partial image of the processing target image 21 for each character region recognized by OCR to the trained model 41. As a result, for each character region recognized in the processing target image 21, a tampering detection result is generated in the form of a bitmap indicating whether each pixel belongs to a tampered portion. Next, in S362, the detection unit 132 transmits detection result data 32 to the image processing device 101. The detection result data 32 includes an integrated bitmap of the same size as the processing target image 21 (indicating whether each pixel belongs to a tampered portion), generated by integrating the bitmaps obtained for the individual character regions. This integrated bitmap is referred to below as the detection result image. The detection unit 132 may additionally generate an emphasized image that highlights the pixels of the processing target image 21 determined to belong to a tampered portion (hereinafter, tampered pixels), and include the emphasized image in the detection result data 32.
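The integration of per-region bitmaps into a single detection result image of the same size as the processing target image can be sketched as below. The function name `merge_region_results` and the `(left, top, bitmap)` tuple layout are assumptions for this example; the disclosure does not specify how region positions are represented or how overlapping regions are handled (this sketch combines them with a logical OR).

```python
from typing import List, Tuple

Bitmap = List[List[bool]]                # True = pixel judged to be tampered
RegionResult = Tuple[int, int, Bitmap]   # (left, top, per-region bitmap)


def merge_region_results(width: int, height: int,
                         regions: List[RegionResult]) -> Bitmap:
    """Paste each character region's bitmap into a full-size bitmap that is
    False everywhere else. Overlapping regions are OR-ed together."""
    result = [[False] * width for _ in range(height)]
    for left, top, bitmap in regions:
        for dy, row in enumerate(bitmap):
            for dx, flagged in enumerate(row):
                y, x = top + dy, left + dx
                # Ignore any part of a region that falls outside the image.
                if flagged and 0 <= y < height and 0 <= x < width:
                    result[y][x] = True
    return result
```

Pixels outside every recognized character region stay False, which matches the description in that only character regions are passed to the trained model.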

FIG. 7(A) shows a processing target image 21a as an example. FIG. 7(B) shows a detection result image 32a representing the tampering detection result for the processing target image 21a. FIG. 7(C) shows an emphasized image 32b in which the tampered portion of the processing target image 21a is highlighted. The processing target image 21a is an image generated by reading a contract as the target document. It has a plurality of character regions, one of which, character region 701a, appears to show the numeral "4". These character regions can be recognized as a result of the OCR executed by the character recognition unit 141 of the OCR server 104. The detection unit 132 of the tampering detection server 103 cuts out the character region image of each character region from the processing target image 21 and applies the trained model 41 to each character region image. As a result, it is determined whether each pixel of each character region belongs to a tampered portion. The detection result image 32a is a binary image that integrates the tampering detection results over the whole image.

In the detection result image 32a, pixels determined to belong to a tampered portion indicate true (black in the figure), and pixels determined not to belong to a tampered portion indicate false (white in the figure). For example, the pixels forming some of the strokes of the numeral "4" in the character region 701a indicate true, which means that those strokes may have been added afterward in order to tamper with the document. The emphasized image 32b is generated by changing, in the processing target image 21a, the color of the pixels for which the detection result image 32a indicates true to a specific color. The specific color may be, for example and without limitation, red (RGB = (255, 0, 0)). In the example of FIG. 7(C), the color of some of the strokes of the numeral "4" in the character region 701a of the emphasized image 32b has been changed. The technique for emphasizing pixels determined to belong to a tampered portion is not limited to such a color change, and may be any technique such as thickening lines or blinking.
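The color-change form of emphasis described above amounts to a masked recoloring. A minimal sketch, assuming the image is a grid of RGB tuples and the detection result is a same-sized grid of booleans (the function name `emphasize` is hypothetical):

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]
ColorImage = List[List[RGB]]
Bitmap = List[List[bool]]


def emphasize(image: ColorImage, detection: Bitmap,
              color: RGB = (255, 0, 0)) -> ColorImage:
    """Return a copy of the image in which every pixel flagged True in the
    detection result is replaced by the emphasis color (red by default)."""
    return [
        [color if flagged else pixel for pixel, flagged in zip(img_row, det_row)]
        for img_row, det_row in zip(image, detection)
    ]
```

The input image is left unmodified, so it remains available unchanged for any later side-by-side display.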

In S362 of FIG. 3, the display control unit 112 of the image processing device 101 receives the above-described detection result data 32 from the tampering detection server 103. In S363, the display control unit 112 displays the tampering detection result on the screen based on the detection result data 32. In particular, in the present embodiment, when the detection result data 32 indicates that a tampered portion was detected in the target document, the display control unit 112 displays an emphasized image and a comparison image of the region containing the tampered portion on the screen for comparison. The emphasized image here is an image that highlights the tampered portion of the processing target image 21, and the comparison image is an image that shows the tampered portion as it appears in the processing target image 21. This allows the user to learn from the emphasized image which part of the image is suspected of tampering, and then to examine the color or shading of that part in the comparison image to judge whether tampering actually took place. Examples of such comparative display, and several variations of the comparison image, are described further later.

(2) Specific flow of the tampering detection stage (image processing device)
FIG. 8 is a flowchart showing an example of the specific flow of processing executed by the image processing device 101 in the tampering detection stage. The processing shown in FIG. 8 is executed in the image processing device 101 under the control of the CPU 201, which executes a controller program loaded from the storage 208 into the RAM 204. The processing may start when a predetermined user operation is detected via the input device 209 of the image processing device 101.

First, in S801, the reading unit 111 reads the target document set in the transport device 207 with the scanner device 206 and generates a processing target image. The processing target image may be, for example, a full-color (three-channel RGB) image. In S802, the reading unit 111 accepts a tampering detection instruction input by the user via the input device 209. Next, in S803, the reading unit 111 transmits the processing target image and related data (for example, a data set ID), together with a tampering detection request, to the tampering detection server 103 via the external I/F 211. The display control unit 112 then waits, in S804, to receive detection result data from the tampering detection server 103. When the detection result data is received from the tampering detection server 103 via the external I/F 211, the process proceeds to S805. In S805, the display control unit 112 determines whether the detection result data indicates that the target document includes a tampered portion. If the target document is determined to include a tampered portion, the process proceeds to S806. If the target document is determined not to include a tampered portion, the process proceeds to S810. In S806, the display control unit 112 determines, based on user input, whether the emphasized image and the comparison image should be displayed in contrast with each other as the tampering detection result, or whether only the emphasized image should be displayed. In the former case, the process proceeds to S807; in the latter case, the process proceeds to S808. In S807, the display control unit 112 causes the screen of the display device 210 to display, in contrast with each other, an emphasized image that emphasizes the tampered portion of the processing target image and a comparison image that shows the tampered portion as it appears in the processing target image. Examples of this contrast display are described later with reference to FIGS. 13(A), 13(B), and 15. In S808, the display control unit 112 causes the screen of the display device 210 to display an emphasized image that emphasizes the tampered portion of the processing target image. Examples of this display are described later with reference to FIGS. 11(B) and 12. In S810, the display control unit 112 causes the screen of the display device 210 to display that no tampering was detected in the processing target image. In addition to displaying the tampering detection result in this way, the image processing device 101 may store the processing target image, the emphasized image, and the comparison image in the storage 208 as image data, or may transmit them to another device via the external I/F 211. One or more of these images may also be printed by the printer device 205.
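As a non-limiting illustration, the branching of S805 and S806 could be sketched in Python as follows. The function name and the use of step labels as return values are assumptions made for this sketch and do not appear in the disclosure.

```python
def step_after_detection(has_tampered_portion: bool, contrast_requested: bool) -> str:
    """Return the label of the display step chosen in S805 and S806."""
    if not has_tampered_portion:
        return "S810"  # report that no tampering was detected
    if contrast_requested:
        return "S807"  # emphasized image and comparison image in contrast
    return "S808"      # emphasized image only


print(step_after_detection(True, False))
```

The user's choice in S806 matters only when a tampered portion was reported, which is why the first test precedes it.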

(3) Specific flow of the tampering detection stage (tampering detection server)
FIG. 9 is a flowchart showing an example of a specific flow of the processing executed by the tampering detection server 103 in the tampering detection stage. The processing shown in FIG. 9 is executed in the tampering detection server 103 under the control of the CPU 261, which executes a controller program loaded from the storage 265 into the RAM 264. The processing may be started when a tampering detection request is received from the image processing device 101 via the external I/F 268 (waiting for a tampering detection request may begin when the tampering detection server 103 is powered on).

First, in S901, the image acquisition unit 131 receives the processing target image and related data (for example, a data set ID), together with the tampering detection request, from the image processing device 101 via the external I/F 268. Next, in S902, the detection unit 132 transmits a request for a trained model to the learning device 102 via the external I/F 268 and acquires the trained model from the learning device 102. Here, the detection unit 132 acquires, for example, the trained model identified by the data set ID received with the tampering detection request. The detection unit 132, for example, constructs a neural network model in the RAM 264 and applies the model parameter values received from the learning device 102 to the constructed model. In S903, the detection unit 132 transmits a request to execute OCR on the processing target image, together with the processing target image, to the OCR server 104 via the external I/F 268, and receives recognition result data indicating the OCR result from the OCR server 104. Next, in S904, the detection unit 132 cuts out from the processing target image the character area image of one of the characters recognized in the processing target image, and applies the cut-out character area image to the trained model acquired in S902. The detection unit 132 thereby determines, for each of the pixels in the character area image, whether the pixel belongs to a tampered portion. The character area image may be converted to grayscale before it is applied to the trained model. The result of this determination is a bitmap similar to the binary image 612 shown in FIG. 6(C). Next, the detection unit 132 determines whether any character area remains in the processing target image. If a character area remains, the detection unit 132 repeats the determination of S904 for the next character area. If no unprocessed character area remains in the processing target image, the process proceeds to S906. In S906, the detection unit 132 integrates the detection results obtained for each character area through the repetition of S904 into a single bitmap to generate a detection result image. In S907, the detection unit 132 generates an emphasized image in which the tampered portion in the processing target image is emphasized. Then, in S908, the detection unit 132 transmits detection result data including the detection result image and the emphasized image to the image processing device 101 via the external I/F 268.
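The loop of S904 to S906 and the emphasis of S907 could be sketched, purely for illustration, as follows. The sketch assumes grayscale images held as nested lists, character areas given as (left, top, width, height) boxes, and a model that returns a mask of the same size as its input. The function names, the red emphasis color, and `dark_pixel_model` (a trivial stand-in for the trained model) are all hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale pixel rows, values 0..255
Mask = List[List[int]]           # 1 marks a pixel judged to be tampered
Box = Tuple[int, int, int, int]  # (left, top, width, height)


def crop(image: Image, box: Box) -> Image:
    """Cut out the character area image for one box (S904)."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def detect(image: Image, boxes: List[Box], model: Callable[[Image], Mask]) -> Mask:
    """Apply the model to each character area and merge the results (S904 to S906)."""
    result = [[0] * len(image[0]) for _ in image]
    for left, top, width, height in boxes:
        region_mask = model(crop(image, (left, top, width, height)))
        for dy, mask_row in enumerate(region_mask):
            for dx, flag in enumerate(mask_row):
                if flag:
                    result[top + dy][left + dx] = 1
    return result


def emphasize(image: Image, mask: Mask) -> List[List[Tuple[int, int, int]]]:
    """Build an RGB image in which tampered pixels are painted red (S907)."""
    return [
        [(255, 0, 0) if flag else (value, value, value)
         for value, flag in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


def dark_pixel_model(region: Image) -> Mask:
    """Stand-in for the trained model (assumption): flags dark pixels only."""
    return [[1 if value < 50 else 0 for value in row] for row in region]


page = [
    [200, 10, 200, 200],
    [200, 200, 200, 30],
    [200, 200, 200, 200],
]
boxes = [(0, 0, 2, 2), (2, 0, 2, 2)]
detection_result = detect(page, boxes, dark_pixel_model)
emphasized = emphasize(page, detection_result)
```

Pixels outside every character area stay at 0 in the merged bitmap, so the detection result image has the same size as the processing target image.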

Although an example has been described here in which the detection unit 132 cuts out character area images from the processing target image based on the OCR result, OCR does not necessarily have to be used. For example, when the target document is a form with a known format, the detection unit 132 can cut out from the processing target image, as character area images, the images of (for example, rectangular) partial areas at positions predefined by the known format.
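One way to picture this alternative is a lookup table of predefined rectangles per form type, with OCR results used only as a fallback. The following is a minimal sketch under that assumption; `FORM_TEMPLATES`, the form identifier, and the coordinates are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)

# Hypothetical table of predefined character areas for known form formats.
FORM_TEMPLATES: Dict[str, List[Box]] = {
    "invoice-v1": [(40, 120, 32, 32), (72, 120, 32, 32)],
}


def character_boxes(form_id: Optional[str],
                    ocr_boxes: Optional[List[Box]] = None) -> List[Box]:
    """Return predefined areas for a known form, else fall back to OCR boxes."""
    if form_id in FORM_TEMPLATES:
        return list(FORM_TEMPLATES[form_id])
    if ocr_boxes is None:
        raise ValueError("unknown form format and no OCR result available")
    return list(ocr_boxes)


print(character_boxes("invoice-v1"))
```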

<< 4. Details of display control >>
FIG. 10 is a schematic diagram showing an example of a GUI for configuring and instructing tampering detection, which may be displayed on the screen of the display device 210 under the control of the display control unit 112 of the image processing device 101. The setting window 1000 shown in FIG. 10 has a preview area 1001, an emphasis setting button 1002, a file name change setting button 1003, a folder sorting setting button 1004, a scan button 1005, and a transmission start button 1006. The emphasis setting button 1002 is a button for setting whether to enable emphasis of the tampered portion when the tampering detection result is displayed. The file name change setting button 1003 is a button for setting whether to enable automatic change of the file name of scanned image data based on the tampering detection result. The folder sorting setting button 1004 is a button for setting whether to enable automatic sorting of scanned image data into folders based on the tampering detection result. In the setting window 1000, the buttons corresponding to enabled settings may be displayed with emphasis. The scan button 1005 is a button for triggering the reading of a document set in the image processing device 101. When the scan button 1005 is operated and the scan is completed, a preview of the scanned image is displayed in the preview area 1001. Before data transmission starts, the user may set another document in the image processing device 101 and operate the scan button 1005 again, so that the image processing device 101 holds a plurality of scanned images and related data together. When the scan of at least one target document is completed, the transmission start button 1006 becomes operable. The transmission start button 1006 is a button for inputting a tampering detection instruction that triggers transmission of the processing target image and related data. When the transmission start button 1006 is operated, the reading unit 111 transmits the processing target image and related data, together with a tampering detection request, to the tampering detection server 103.

FIGS. 11(A) and 11(B) are schematic diagrams showing examples of GUIs for displaying a list of tampering detection results and the details of a result, respectively, which may be displayed on the screen of the display device 210 under the control of the display control unit 112. The list window 1100 shown in FIG. 11(A) has a list area 1101, a file name change button 1103, a folder sorting button 1104, and an OK button 1105. The list area 1101 lists three list items 1102a, 1102b, and 1102c, which correspond to the tampering detection results for three target documents. Each list item has, as data items, a "document ID" that identifies the target document (or the corresponding scanned image), a "date and time" that indicates when tampering detection was executed, and a "tampering" item that indicates whether tampering was detected in the document. A preview of the scanned image is displayed at the right end of each list item. In the example of FIG. 11(A), the two target documents with document IDs "Scan1" and "Scan3" have been determined to include a tampered portion as a result of tampering detection, while the target document with document ID "Scan2" has been determined not to include a tampered portion. The display control unit 112 may determine that a target document is "tampered" when the detection result image included in the detection result data 32 contains one or more pixels indicated as belonging to a tampered portion. When, for example, the list item 1102a is operated (for example, tapped) by the user, the display control unit 112 displays on the screen the detail window 1150 for the target document identified by the document ID "Scan1". The file name change button 1103 is a button for triggering the automatic change of the file names of scanned image data based on the tampering detection result. When the file name change button 1103 is operated by the user, the display control unit 112 automatically changes the file names of the scanned image data corresponding to the target documents whose check boxes are checked. For example, a predetermined prefix or suffix indicating that tampering is present (for example, "tampered") may be added to the file name of scanned image data determined to be "tampered". A predetermined prefix or suffix indicating that no tampering is present (for example, "not tampered") may likewise be added to the file name of scanned image data determined to be "not tampered". The folder sorting button 1104 is a button for triggering the automatic sorting of scanned image data into folders based on the tampering detection result. When the folder sorting button 1104 is operated by the user, the display control unit 112 automatically sorts the scanned image data corresponding to the target documents whose check boxes are checked into one of a plurality of folders. For example, "tampered" scanned image data may be stored in a first folder (for example, one with the folder name "tampered"), and "not tampered" scanned image data may be stored in a second folder (for example, one with the folder name "not tampered"). When the OK button 1105 is operated by the user, the display control unit 112 closes the list window 1100.
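The "one or more pixels" criterion and the resulting file naming and folder sorting could be sketched as follows. This is a minimal illustration only: the label strings, the underscore separator, and the `scans` root folder are assumptions, and the sketch only computes names and paths without touching the file system.

```python
from pathlib import PurePosixPath
from typing import List


def has_tampering(detection_result: List[List[int]]) -> bool:
    """A document counts as tampered if at least one pixel is flagged."""
    return any(any(row) for row in detection_result)


def labeled_name(file_name: str, tampered: bool, as_prefix: bool = True) -> str:
    """Add a prefix or suffix that states the detection result."""
    label = "tampered" if tampered else "not_tampered"
    path = PurePosixPath(file_name)
    if as_prefix:
        return f"{label}_{path.name}"
    return f"{path.stem}_{label}{path.suffix}"


def destination(file_name: str, tampered: bool, root: str = "scans") -> str:
    """Choose the first or second folder according to the detection result."""
    folder = "tampered" if tampered else "not_tampered"
    return str(PurePosixPath(root) / folder / PurePosixPath(file_name).name)


tampered = has_tampering([[0, 0], [0, 1]])
print(labeled_name("Scan1.pdf", tampered), destination("Scan1.pdf", tampered))
```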

The detail window 1150 shown in FIG. 11(B) has an image display area 1151, a correction button 1153, a contrast display button 1154, a transmission button 1155, a print button 1156, and an OK button 1157. The image display area 1151 is an area for displaying the emphasized image of the target document specified in the list window 1100. In the example of FIG. 11(B), the emphasized image 32b is displayed in the image display area 1151. When the correction button 1153 is operated by the user, the display control unit 112 displays on the screen a correction window 1200, described later, which allows the user to correct the tampering detection result. When the contrast display button 1154 is operated, the display control unit 112 displays on the screen a contrast display window 1300 or 1350, described later, in which the emphasized image and the comparison image are arranged in contrast with each other. The contrast display button 1154 may be operable only for target documents that have been determined to include a tampered portion as a result of tampering detection. When the transmission button 1155 is operated, the display control unit 112 transmits one or both of the processing target image and the emphasized image of the specified target document to another device. The images may be transmitted by any method, such as attachment to an e-mail, or transmission of an e-mail or other message containing a link to a file server. The destination may be one registered in advance in the image processing device 101, or may be specified by the user via a pop-up destination window (not shown). When the print button 1156 is operated, the display control unit 112 causes the printer device 205 to print one or both of the processing target image and the emphasized image of the specified target document. When the OK button 1157 is operated, the display control unit 112 closes the detail window 1150 and displays the list window 1100 on the screen again.

FIG. 12 is a schematic diagram showing an example of a GUI that allows the user to correct the tampering detection result, which may be displayed on the screen of the display device 210 under the control of the display control unit 112. The correction window 1200 shown in FIG. 12 has an image display area 1201, a tampered pixel designation button 1203, a tampered pixel release button 1204, and an OK button 1205. The image display area 1201 is an area for displaying the emphasized image of the specified target document. In the example of FIG. 12, the emphasized image 32b is displayed in the image display area 1201. When the tampered pixel designation button 1203 is operated by the user, the display control unit 112 changes the pixels specified by the user in the image display area 1201 to pixels that belong to a tampered portion in the tampering detection result. When the tampered pixel release button 1204 is operated by the user, the display control unit 112 changes the pixels specified by the user in the image display area 1201 to pixels that do not belong to a tampered portion in the tampering detection result. When the OK button 1205 is operated, the display control unit 112 reflects the user's corrections to the tampering detection result in the detection result data 32 and closes the correction window 1200. When the tampering detection result has been corrected by the user in this way, the automatic file name change or automatic folder sorting triggered by operating the button 1103 or 1104 in the list window 1100 of FIG. 11(A) may be executed based on the corrected tampering detection result. This makes it possible to carry out the manual or system-based document processing that follows verification of the presence or absence of tampering (using the folder name or the storage folder as a clue) more accurately, based on the corrected tampering detection result.
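The effect of the tampered pixel designation and release operations on the detection result can be pictured as edits to the detection bitmap that are committed when the OK button is operated. The following is a minimal sketch under that assumption; the function name and the representation of user-specified pixels as (x, y) coordinates are hypothetical.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int]  # (x, y)


def apply_corrections(detection_result: List[List[int]],
                      designated: Iterable[Pixel],
                      released: Iterable[Pixel]) -> List[List[int]]:
    """Return a corrected copy of the detection bitmap.

    Designated pixels are set to 1 (tampered); released pixels are set to 0.
    The input bitmap is left unchanged so the edit can still be cancelled.
    """
    corrected = [row[:] for row in detection_result]
    for x, y in designated:
        corrected[y][x] = 1
    for x, y in released:
        corrected[y][x] = 0
    return corrected


original = [[0, 1], [0, 0]]
corrected = apply_corrections(original, designated=[(0, 1)], released=[(1, 0)])
```

Because the "tampered" determination depends on whether any flagged pixel remains, releasing every flagged pixel would change a document from "tampered" to "not tampered" for the subsequent file name change and folder sorting.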

When the tampering detection result has been corrected by the user via the correction window 1200 described above, the display control unit 112 may cause the trained model to be updated based on the corrected tampering detection result. Specifically, the display control unit 112 transmits, for example, a pair consisting of the processing target image and the corrected detection result image to the learning device 102 together with a model update request. In response to receiving the model update request from the display control unit 112, the learning unit 122 of the learning device 102 may update the trained model using the character area image of each character area in the processing target image as an input image and the character area image of the same character area in the detection result image as a teacher image. Because relearning is then performed on the pixel patterns that the current trained model tends to misdetect, the tampering detection accuracy of the trained model can be improved effectively.
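The pairing of input images and teacher images described above could be sketched as follows, assuming images held as nested lists and character areas given as (left, top, width, height) boxes. The function names are hypothetical, and the actual model update performed by the learning unit 122 is outside the scope of this sketch.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, width, height)


def crop(image: Image, box: Box) -> Image:
    """Cut out one character area."""
    left, top, width, height = box
    return [row[left:left + width] for row in image[top:top + height]]


def training_pairs(target_image: Image, corrected_result: Image,
                   boxes: List[Box]) -> List[Tuple[Image, Image]]:
    """Pair each character area image with the same area of the corrected result."""
    return [(crop(target_image, box), crop(corrected_result, box)) for box in boxes]


page = [[200, 10, 200], [200, 200, 30]]
corrected = [[0, 1, 0], [0, 0, 1]]
pairs = training_pairs(page, corrected, [(0, 0, 2, 2), (2, 0, 1, 2)])
```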

FIG. 13(A) is a schematic diagram showing a first example of a GUI in which an emphasized image and a comparison image are arranged in contrast with each other, which may be displayed on the screen of the display device 210 under the control of the display control unit 112. In the first example, an emphasized image and a comparison image that each show the entire target document are displayed. The contrast display window 1300 shown in FIG. 13(A) has an emphasized image display area 1301, a comparison image display area 1302, a per-character display button 1303, a correction button 1305, and an OK button 1306. In the example of FIG. 13(A), the emphasized image 32b is displayed in the emphasized image display area 1301, and the comparison image 21a is displayed in the comparison image display area 1302. The emphasized image 32b is an image that emphasizes the pixels determined to belong to a tampered portion in the scanned image of the target document. The comparison image 21a is identical to the processing target image 21a and shows the pixels determined to belong to a tampered portion as they appear in the scanned image (without emphasis). When the per-character display button 1303 is operated by the user, the display control unit 112 switches the screen to a contrast display window 1350, described later, in which emphasized images and comparison images cut out for each character area are arranged in contrast with each other. When the correction button 1305 is operated, the display control unit 112 displays on the screen the correction window 1200 described above, which allows the user to correct the tampering detection result. When the OK button 1306 is operated, the display control unit 112 ends the contrast display.

FIG. 13(B) is a schematic diagram showing a second example of a GUI in which an emphasized image and a comparison image are arranged in contrast with each other, which may be displayed on the screen of the display device 210 under the control of the display control unit 112. In the second example, the emphasized images and comparison images displayed are per-character partial images (character area images). The contrast display window 1350 shown in FIG. 13(B) has a list area 1351, an overall display button 1353, a comparison mode switching button 1354, and an OK button 1356. In the list area 1351, the character areas of one or more characters recognized in the target document are listed in a vertically scrollable form. In the example of FIG. 13(B), two list items 1352a and 1352b are displayed in the list area 1351. Each list item can be uniquely identified by the combination of the document ID of the target document and the number (No.) assigned to each character area. The list area 1351 has an emphasized image display area 1361 in the center and a comparison image display area 1362 at the right end. Each emphasized image display area 1361 displays a partial image of one character area in the emphasized image, including the emphasized tampered portion (this partial image is also referred to here as an emphasized image). Each comparison image display area 1362 displays a partial image of one character area in the processing target image, including the tampered portion without emphasis (this partial image is also referred to here as a comparison image). However, the content of the comparison image displayed in the comparison image display area 1362 may vary depending on the comparison mode described later. When the emphasized image display area 1361 or the comparison image display area 1362 is operated (for example, tapped) by the user, the display control unit 112 displays on the screen a correction window 1500, described later, which allows the user to correct the tampering detection result for each character area. When the overall display button 1353 is operated by the user, the display control unit 112 switches the screen to the contrast display window 1300 described above. When the comparison mode switching button 1354 is operated, the display control unit 112 switches the comparison mode setting according to user input, as described later. When the OK button 1356 is operated, the display control unit 112 ends the contrast display.

The display control unit 112 may support only a single comparison mode, or may be able to switch the comparison mode used for the contrast display dynamically among a plurality of candidate comparison modes. In the former case, the contrast display window 1350 need not have the comparison mode switching button 1354. In the latter case, the candidate comparison modes may include, for example, two or more of the following:
- Comparison mode C1: The comparison image is an image of the character area including the tampered portion without emphasis.
- Comparison mode C2: The comparison image is an image that includes the character area including the tampered portion without emphasis and the area surrounding that character area.
- Comparison mode C3: The comparison image includes an image of another character area that shows the same character as the character shown in the emphasized image.
- Comparison mode C4: The comparison image is an image of the character area in which the tampered portion is suppressed or hidden.

The display control unit 112 may display a list of these candidate comparison modes on the screen and have the user specify the desired comparison mode. Alternatively, the comparison mode setting may be toggled (switched in turn) in a predetermined order each time the comparison mode switching button 1354 is operated. The display control unit 112 can switch the content of the comparison image displayed in the comparison image display area 1362 of the contrast display window 1350 according to the comparison mode specified by the user in this way.
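The toggling behavior could be sketched, as one possible illustration, with an enumeration of the candidate modes and a function that advances to the next supported mode in a fixed order. The class and function names, and the particular subset of supported modes, are assumptions for this sketch.

```python
from enum import Enum
from typing import List


class ComparisonMode(Enum):
    C1 = "character area as scanned"
    C2 = "character area with surrounding area"
    C3 = "other character areas showing the same character"
    C4 = "tampered portion suppressed or hidden"


def next_mode(current: ComparisonMode,
              supported: List[ComparisonMode]) -> ComparisonMode:
    """Advance to the next supported mode, wrapping around at the end."""
    index = supported.index(current)
    return supported[(index + 1) % len(supported)]


# A device that supports only three of the four candidates (assumption).
supported_modes = [ComparisonMode.C1, ComparisonMode.C2, ComparisonMode.C4]
mode = next_mode(ComparisonMode.C1, supported_modes)
```

Keeping the supported modes in a list makes the "two or more of the following" wording straightforward to honor: a device that supports a single mode simply has a one-element list and no switching button.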

FIG. 14(A) shows an example of a comparison image in comparison mode C1. The comparison image 1401 shown in FIG. 14(A) is an image cut out from the processing target image at the same position and with the same size as the character area of the character shown by the corresponding emphasized image. In the comparison image 1401, the tampered portion is shown without emphasis, as it appears in the scanned image generated by reading the target document. In comparison mode C1, the user can see at a glance how the parts that make up the character correspond between the emphasized image and the comparison image.

FIG. 14(B) shows an example of a comparison image in comparison mode C2. The comparison image 1402 shown in FIG. 14(B) is an image cut out from the processing target image so as to include the area surrounding the comparison image 1401. In the comparison image 1402 as well, the tampered portion is shown without emphasis, as it appears in the scanned image generated by reading the target document. In comparison mode C2, the user can evaluate factors such as color, density, and stroke habits across the characters before and after the character suspected of tampering, and thereby judge the validity of the tampering detection result.

FIG. 14(C) shows an example of comparison images in comparison mode C3. In comparison mode C3, for example, the comparison image 1401 and a comparison image 1403 may be displayed in the comparison image display area 1362 of the contrast display window 1350. The comparison image 1403 is an image of another character area that shows the same character as the character shown in the character area of the corresponding emphasized image. The display control unit 112 can determine, for example based on the OCR result, whether two characters written at different positions are the same character. In the comparison image 1403 as well, the tampered portion is shown without emphasis, as it appears in the scanned image generated by reading the target document. In comparison mode C3, the user can judge the validity of the tampering detection result by referring to factors such as the color, density, and stroke habits of other instances of the same character as the character suspected of tampering.

FIG. 14(D) shows an example of a comparison image in comparison mode C4. The comparison image 1404 shown in FIG. 14(D) is an image cut out from the processing target image at the same position and with the same size as the character area of the character shown by the corresponding emphasized image. Unlike the comparison image 1401, however, the tampered portion in the comparison image 1404 is suppressed or hidden. In comparison mode C4, the user can view the content of the document without the strokes that may have been added by tampering, and can judge the validity of the tampering detection result while paying attention to the pixels determined not to belong to a tampered portion.
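The four comparison modes of FIGS. 14(A) to 14(D) could be sketched as follows. This is a minimal illustration that assumes grayscale images held as nested lists, a detection bitmap of the same size, and OCR results given as (character, box) pairs. The function names, the margin size used for C2, and the choice of painting hidden pixels with a white background value for C4 are assumptions.

```python
from typing import List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (left, top, width, height)


def comparison_image(image: Image, detection_result: Image, box: Box,
                     mode: str, margin: int = 1, background: int = 255) -> Image:
    """Build the comparison image for modes C1, C2, and C4."""
    left, top, width, height = box
    x0, y0, x1, y1 = left, top, left + width, top + height
    if mode == "C2":
        # Widen the cut-out to include the surrounding area, clamped to the page.
        x0, y0 = max(0, x0 - margin), max(0, y0 - margin)
        x1, y1 = min(len(image[0]), x1 + margin), min(len(image), y1 + margin)
    cut = [row[x0:x1] for row in image[y0:y1]]
    if mode == "C4":
        # Hide pixels judged to be tampered by painting them with the background.
        for dy, row in enumerate(cut):
            for dx in range(len(row)):
                if detection_result[y0 + dy][x0 + dx]:
                    row[dx] = background
    return cut


def same_character_boxes(ocr: List[Tuple[str, Box]], index: int) -> List[Box]:
    """For mode C3: boxes of other areas recognized as the same character."""
    character = ocr[index][0]
    return [box for i, (c, box) in enumerate(ocr) if c == character and i != index]


page = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
flags = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
area = (1, 1, 2, 2)
c1 = comparison_image(page, flags, area, "C1")
c2 = comparison_image(page, flags, area, "C2")
c4 = comparison_image(page, flags, area, "C4")
```

In this sketch C1 and C4 share the same cut-out geometry and differ only in whether flagged pixels are hidden, which mirrors the relationship between the comparison images 1401 and 1404.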

FIG. 15 is a schematic diagram showing another example of a GUI that can be displayed on the screen of the display device 210 under the control of the display control unit 112 and that allows the user to correct the tampering detection result. The correction window 1500 shown in FIG. 15 has an emphasized image display area 1501, a comparison image display area 1502, a comparison mode switching button 1503, a tampered pixel designation button 1504, a tampered pixel release button 1505, and an OK button 1506. The emphasized image display area 1501 displays the emphasized image of a designated character area of the target document; in the example of FIG. 15, it displays the emphasized image 1511. The comparison image display area 1502 displays the comparison image of the designated character area of the target document; in the example of FIG. 15, it displays the comparison image 1512. Here, the comparison mode C1 described above is assumed to be set.

When the comparison mode switching button 1503 is operated, the display control unit 112 switches the comparison mode setting among the two or more candidate comparison modes described above and displays the comparison image corresponding to the newly set comparison mode in the comparison image display area 1502. When the user operates the tampered pixel designation button 1504, the display control unit 112 changes a pixel designated by the user in the emphasized image display area 1501 to a pixel that belongs to the tampered part in the tampering detection result. Likewise, when the user operates the tampered pixel release button 1505, the display control unit 112 changes a pixel designated by the user in the emphasized image display area 1501 to a pixel that does not belong to the tampered part in the tampering detection result. When the OK button 1506 is operated, the display control unit 112 reflects the user's corrections in the detection result data 32 and closes the correction window 1500.
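
The correction flow of FIG. 15 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the tampering detection result is held as a nested list of booleans (one per pixel); the class name `CorrectionSession` and its methods are hypothetical and do not appear in the disclosure. Edits are made on a working copy and written back only when the OK button is operated.

```python
from copy import deepcopy


class CorrectionSession:
    """Hypothetical helper: edits a per-pixel tamper mask on a working copy."""

    def __init__(self, detection_mask):
        # detection_mask[y][x] is True when the pixel was judged to belong
        # to the tampered part (an assumed representation).
        self._committed = detection_mask
        self._working = deepcopy(detection_mask)

    def designate(self, x, y):
        """Tampered pixel designation: mark the pixel as tampered."""
        self._working[y][x] = True

    def release(self, x, y):
        """Tampered pixel release: mark the pixel as not tampered."""
        self._working[y][x] = False

    def confirm(self):
        """OK button: reflect the user's corrections in the detection result."""
        self._committed[:] = deepcopy(self._working)
        return self._committed
```

Keeping a working copy means that closing the window without confirming would leave the stored detection result untouched; whether the actual device behaves this way is not stated in the text.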

FIG. 16 is a flowchart showing an example of the specific flow of the display control processing executed by the image processing device 101 when tampering with the target document is detected. The processing shown in FIG. 16 is executed in the image processing device 101 under the control of the CPU 201, which executes a controller program loaded from the storage 208 into the RAM 204. This processing may correspond to S807 in FIG. 8.

First, in S1601, the display control unit 112 determines which contrast display mode has been designated: the character-specific contrast display or the overall contrast display. For example, when the contrast display button 1154 of the detail window 1150 in FIG. 11(B) or the overall display button 1353 of the contrast display window 1350 in FIG. 13(B) is operated, the display control unit 112 may determine that the overall contrast display has been designated. When the character-specific display button 1303 of the contrast display window 1300 in FIG. 13(A) is operated, the display control unit 112 may determine that the character-specific contrast display has been designated. If the character-specific contrast display is determined to have been designated, the processing proceeds to S1602. If the overall contrast display is determined to have been designated, the processing proceeds to S1620.

In S1602, the display control unit 112 acquires character area data indicating the position and size of one or more character areas in the image. For example, the display control unit 112 may receive, from the tampering detection server 103 together with the detection result data 32, the recognition result data 31 indicating the result of character recognition by the OCR server 104, and use it as the character area data. Alternatively, when the target document is a form with a known format, the display control unit 112 may acquire from the storage 208 character area data that predefines the position and size of the character areas included in that format. The subsequent steps S1603 to S1612 are repeated for each character area that, among the character areas indicated by the character area data, contains a pixel determined by the detection result data 32 to belong to the tampered part. Taking the processing target image 21a and the detection result image 32a of FIG. 7(A) as an example, the character area 701a and the rectangular character areas of the two characters to its right are the character areas containing pixels determined to belong to the tampered part. S1603 to S1612 may therefore be repeated for these three character areas.
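
The selection of character areas to be processed in S1603 to S1612 can be sketched as follows. This is only an illustration under assumptions: the character area data is modeled as rectangles given by a hypothetical `CharArea` tuple, and the detection result is modeled as a nested list of booleans per pixel. Neither representation is specified in the disclosure.

```python
from typing import NamedTuple


class CharArea(NamedTuple):
    """Hypothetical character area: top-left position and size in pixels."""
    x: int
    y: int
    width: int
    height: int


def areas_with_tampered_pixels(areas, mask):
    """Return the character areas containing at least one tampered pixel.

    mask[y][x] is True when the pixel was judged to belong to the
    tampered part (an assumed representation of the detection result).
    """
    selected = []
    for area in areas:
        rows = mask[area.y:area.y + area.height]
        if any(any(row[area.x:area.x + area.width]) for row in rows):
            selected.append(area)
    return selected
```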

In S1603, the display control unit 112 selects one of the character areas containing a tampered part (the selected character area is hereinafter called the selected area). Next, in S1604, the display control unit 112 cuts out a partial image of the selected area from the emphasized image according to the position and size indicated by the character area data. The subsequent processing branches in S1605 depending on whether the currently set comparison mode is comparison mode C2. If comparison mode C2 is set, the processing proceeds to S1606. If comparison mode C1, C3, or C4 is set, the processing proceeds to S1607.

In comparison mode C2, the display control unit 112 cuts out, in S1606, a partial image containing the selected area and a peripheral area outside it from the scanned image. As one example, the size of the partial image including the peripheral area may be W times the width and H times the height of the selected area (W and H are preset magnifications greater than 1; for example, W = 4 and H = 2). As another example, the peripheral area may be set dynamically as an area containing N characters near the selected area (N is a preset integer). In comparison mode C2, the partial image cut out here becomes the comparison image.

In comparison mode C1, C3, or C4, the display control unit 112 cuts out, in S1607, a partial image of the selected area from the scanned image. If comparison mode C1 is set, the partial image cut out in S1607 becomes the comparison image.

If comparison mode C3 is set, the processing proceeds from the branch in S1608 to S1609. In S1609, the display control unit 112 cuts out from the scanned image partial images of one or more other character areas that show the same character as the character shown in the selected area. These other character areas are desirably areas that contain no pixel determined to belong to the tampered part. Here, the display control unit 112 may request OCR from the OCR server 104 and identify the other character areas showing the same character based on the character recognition result returned from the OCR server 104. If no such other character area exists in the target document, S1609 may be skipped. In comparison mode C3, the combination of the partial images cut out in S1607 and S1609 (for example, an image in which the two partial images are placed side by side) becomes the comparison image.

If comparison mode C4 is set, the processing proceeds to S1611 via the branches in S1608 and S1610. In S1611, the display control unit 112 suppresses the values of the pixels determined to belong to the tampered part in the partial image cut out in S1607 (for example, by changing each pixel value to the background color, such as white, or to a color close to it). In comparison mode C4, the partial image processed in this way becomes the comparison image.

In S1612, the display control unit 112 determines whether any unprocessed character area containing a pixel determined to belong to the tampered part remains. If such a character area remains, the processing returns to S1603, and S1603 to S1611 described above are repeated for the next character area. Otherwise, the processing proceeds to S1613.
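
The branching of S1605 to S1611 can be summarized in a short sketch. It is not the implementation of the disclosure: images are modeled as nested lists of grayscale values, the function names `crop` and `comparison_image` are hypothetical, the enlarged region for comparison mode C2 is assumed to be centered on the selected area and clamped to the image bounds, and the partial images combined in comparison mode C3 are assumed to have the same height.

```python
WHITE = 255  # assumed background color for suppression in mode C4


def crop(image, x, y, w, h):
    """Cut out a w-by-h window, clamped to the image bounds."""
    height, width = len(image), len(image[0])
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(width, x + w), min(height, y + h)
    return [row[x0:x1] for row in image[y0:y1]]


def comparison_image(mode, scanned, mask, area, same_char_areas=(),
                     w_scale=4, h_scale=2):
    """Build the comparison image for one selected area (x, y, w, h)."""
    x, y, w, h = area
    if mode == "C2":
        # S1606: selected area plus surrounding region, W times as wide
        # and H times as tall (centering is an assumption of this sketch).
        big_w, big_h = w * w_scale, h * h_scale
        return crop(scanned, x - (big_w - w) // 2, y - (big_h - h) // 2,
                    big_w, big_h)
    part = crop(scanned, x, y, w, h)  # S1607
    if mode == "C1":
        return part
    if mode == "C3":
        # S1609: append other areas showing the same character, side by side.
        # With no such area, the loop is skipped and `part` is returned alone.
        for ox, oy, ow, oh in same_char_areas:
            other = crop(scanned, ox, oy, ow, oh)
            part = [left + right for left, right in zip(part, other)]
        return part
    if mode == "C4":
        # S1611: replace pixels judged tampered with the background color.
        part_mask = crop(mask, x, y, w, h)
        return [[WHITE if m else p for p, m in zip(prow, mrow)]
                for prow, mrow in zip(part, part_mask)]
    raise ValueError(f"unknown comparison mode: {mode}")
```

A caller would loop over the character areas that contain tampered pixels, call `comparison_image` once per area with the currently set mode, and pair each result with the corresponding cut-out of the emphasized image.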

In S1613, the display control unit 112 displays one or more pairs of an emphasized image and a comparison image, one pair per character area, in contrast in the contrast display window. With this character-specific contrast display, the user can learn from the emphasized image which part of each character is suspected of tampering, and then look at the tint or shading of that part in the comparison image to judge whether each character was actually tampered with.

In S1620, the display control unit 112 displays the emphasized image and the comparison image (scanned image), each showing the entire target document, in contrast in the contrast display window. In this way, the present embodiment allows smooth switching between the contrast display showing the entire document and the character-specific contrast display described above. The user can thus verify the tampering detection results across the target document efficiently, moving back and forth between getting an overview, in the overall display, of where in the document tampering is suspected and examining individual characters in the character-specific display.

Although FIG. 16 describes an example in which the display control unit 112 uses the emphasized image received from the tampering detection server 103, the display control unit 112 may instead generate the emphasized image itself from the processing target image (scanned image) based on the tampering detection result.

<<5. Modifications>>
The embodiment above mainly described a method of determining whether each pixel in a character area belongs to a tampered part by applying the character area image of that character area to a trained model. However, the method of tampering detection is not limited to this example. For example, as a first modification, tampering detection may be performed per character rather than per pixel. Per-character tampering detection may use, for example, a trained model generated or updated using, as teacher data, a flag indicating whether each character area contains a tampered part, instead of a teacher image indicating whether each pixel belongs to a tampered part (for example, the binary image 612 in FIG. 6(C)). The trained model may be any discriminative model that, for one character area image (input image), outputs a bit indicating whether that image contains a tampered part; for example, VGG, a type of neural network model, may be used.

FIG. 17 shows an example of a GUI in the first modification in which the emphasized image and the comparison image are arranged in contrast based on the result of per-character tampering detection. The contrast display window 1700 shown in FIG. 17 has an emphasized image display area 1701, a comparison image display area 1702, a character-specific display button 1703, a correction button 1305, and an OK button 1306. In the example of FIG. 17, the emphasized image 1732 is displayed in the emphasized image display area 1701 and the comparison image 1721 in the comparison image display area 1702. The emphasized image 1732 emphasizes the characters of the character areas determined to contain a tampered part in the scanned image of the target document. The comparison image 1721 is identical to the processing target image 21a and shows the characters of those character areas as they appear in the scanned image, without emphasis. When the user operates the character-specific display button 1703, the display control unit 112 may switch the screen to a contrast display window (not shown) in which the emphasized image and the comparison image cut out for each character area are arranged in contrast.

As a second modification, tampering detection may use, instead of a discriminative model, an autoencoder-type model (for example, a VAE (Variational Autoencoder)) that encodes a character area image and extracts the feature amount of the character. VAE training includes encoder processing, which computes a dimension-reduced feature amount from the input image, and decoder processing, which attempts to reconstruct the input image from the computed feature amount. The encoder has a neural network structure, and the decoder has the reverse structure of the encoder. The model error is evaluated as the difference between the input image and the reconstructed image (for example, cross entropy), and the model parameter values are adjusted, for example by backpropagation, so that the error becomes smaller. In this case, tampered images for learning and teacher data are unnecessary.

In the learning stage, the learning device 102 generates or updates, through learning processing using a plurality of original images for learning, a trained model for extracting the feature amount of untampered characters from character area images. Because it is typically not practical to generate or update a single model that can extract appropriate feature amounts for every character, the learning device 102 may learn different model parameter values for each character type. For example, characters such as "1", "2", "3", "二", and "に" may be treated as different character types.

In the tampering detection stage, the tampering detection server 103 applies each character area image to the encoder of the trained model corresponding to, for example, the character type recognized by OCR, and extracts a feature amount from the character area image. The tampering detection server 103 also acquires a reference feature amount extracted in advance from a known (untampered) character area image of the same character type. If the difference between the two feature amounts satisfies a predetermined condition (for example, the Manhattan distance is at or below a threshold), the tampering detection server 103 may determine that the character area image contains no tampered part. Conversely, if the Manhattan distance between the feature amount extracted from the character area image and the reference feature amount exceeds the threshold, the character area image may be determined to contain a tampered part. In this modification, as in the first modification, it is determined whether each character recognized in the processing target image contains a tampered part (that is, per-character tampering detection). The contrast display of the emphasized image and the comparison image in this modification may be performed in the same way as in the example described with reference to FIG. 17.
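
The threshold decision of the second modification can be written compactly. The sketch below assumes that feature amounts are plain lists of numbers of equal length; the function names are hypothetical, and the encoder that produces the feature amounts is outside the scope of the sketch.

```python
def manhattan_distance(feature, reference):
    """Sum of absolute element-wise differences between two feature vectors."""
    if len(feature) != len(reference):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(p - q) for p, q in zip(feature, reference))


def contains_tampering(feature, reference, threshold):
    """Judge a character area image from its extracted feature amount.

    Returns False (no tampered part) when the distance to the reference
    feature amount is at or below the threshold, and True when it exceeds it.
    """
    return manhattan_distance(feature, reference) > threshold
```

In use, `reference` would be looked up by the character type recognized by OCR, so that a "1" is compared only with the reference feature amount for "1".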

The embodiment above mainly described an example in which the emphasized image and the comparison image are displayed side by side horizontally in the contrast display window. However, the emphasized image and the comparison image may be arranged in any direction other than horizontal. They may also be displayed in separate windows. Furthermore, as a third modification, the emphasized image and the comparison image may be displayed one after the other in time instead of being arranged in spatial contrast. For example, in a single image display area, display of the emphasized image for X seconds and display of the comparison image for X seconds may alternate repeatedly. The term "contrastive display" in this specification encompasses all of these display forms.
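
The temporal alternation of the third modification amounts to choosing an image from the elapsed time. A minimal sketch follows; the function name `image_to_show` is hypothetical, and the choice to start with the emphasized image is an assumption, since the text does not say which image is shown first.

```python
def image_to_show(elapsed_seconds, x_seconds):
    """Pick which image occupies the single display area at a given time.

    The emphasized image and the comparison image alternate every
    x_seconds, starting (by assumption) with the emphasized image.
    """
    if x_seconds <= 0:
        raise ValueError("x_seconds must be positive")
    phase = int(elapsed_seconds // x_seconds)
    return "emphasized" if phase % 2 == 0 else "comparison"
```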

<<6. Summary>>
Embodiments of the present disclosure have been described in detail above with reference to FIGS. 1 to 17. In the embodiment described above, a tampered part in the target document is detected using the scanned image of the target document, and an emphasized image, which shows the tampered part in the scanned image with emphasis, and a comparison image, which shows the tampered part as it appears in the scanned image, are displayed in contrast on the screen. With this configuration, when the user is asked to verify the tampering detection result, the emphasized image clearly shows the user which part of the image is suspected of tampering, while the comparison image presents the original tint and shading of that part. The user can therefore grasp the position of the tampered part and then easily judge the validity of the detection result from the tint or shading of that part.

Also, in the embodiment described above, the emphasized image and the comparison image may be displayed in contrast according to a comparison mode designated by the user from among two or more comparison modes that differ in the content of the comparison image. In the first comparison mode, the user can see at a glance the correspondence between the parts making up a character in the emphasized image and in the comparison image. In the second comparison mode, the user can evaluate factors such as tint, shading, and stroke habits across the characters before and after the character suspected of tampering, and judge the validity of the tampering detection result. In the third comparison mode, the user can judge the validity of the tampering detection result by referring to the tint, shading, and stroke habits of the same character in other character areas. In the fourth comparison mode, the user can view the image without the strokes that may have been added by tampering, and judge the validity of the tampering detection result while focusing on the pixels determined not to belong to the tampered part. Allowing flexible switching among these comparison modes makes the user's work of verifying the tampering detection result more efficient.

Also, in the embodiment described above, the user may be provided with a user interface for correcting the tampering detection result, which indicates which part of the scanned image was determined to have been tampered with. With this configuration, when the user finds a detection error by looking at the contrastive display of the emphasized image and the comparison image, the user can correct it immediately. The tampering detection result, corrected by the user as appropriate, can therefore be passed smoothly to the human or system processing that follows the verification of whether tampering occurred.

In one example, it is determined whether each of a plurality of pixels in the scanned image belongs to a tampered part, and the pixels determined to belong to the tampered part may be emphasized in the emphasized image. In this case, a fine-grained tampering detection result can be presented visually to the user; for example, when tampering takes the form of a stroke added to an individual character, only that stroke is emphasized. In one modification, it is determined whether each of one or more character areas in the scanned image contains a tampered part, and the characters of the character areas determined to contain a tampered part may be emphasized in the emphasized image. In this case, because the computational load of tampering detection is small, the tampering detection result can be presented to the user quickly even when the scanned image contains a large amount of data.

<<7. Other Embodiments>>
The embodiment above can also be realized by processing in which a program that implements one or more functions is supplied to a system or device via a network or storage medium, and one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

The embodiment above may be realized in a system consisting of a plurality of devices or in a single device as a whole. Functions described above as being included in a single device may also be distributed across a plurality of devices. For example, the data processing unit 121 and the learning unit 122 of the learning device 102 may be implemented in physically separate devices. A function equivalent to the detection unit 132 of the tampering detection server 103 may be implemented in the image processing device 101. Similarly, a function equivalent to the character recognition unit 141 of the OCR server 104 may be implemented in the tampering detection server 103 or the image processing device 101. Any image mentioned in this specification may be a grayscale image or a full-color image.

The invention is not limited to the above embodiments, and various changes and modifications can be made without departing from the spirit and scope of the invention. The claims are therefore appended to make the scope of the invention public.

100 Image processing system
101 Image processing server
111 Reading unit
112 Display control unit
102 Learning device
122 Learning unit
103 Tampering detection server
131 Image acquisition unit
132 Detection unit
104 OCR server
11 Target document
21 Processing target image (scanned image)
32 Detection result data

Claims

An image processing system comprising:
generation means for generating a learning model by executing machine learning processing based on a falsified image, an image before falsification, and an image showing the difference between the falsified image and the image before falsification;
input means for inputting image data; and
estimation means for estimating, by using the learning model generated by the generation means, whether or not the image data input by the input means includes a falsified image.

The image processing system according to claim 1, further comprising display means,
wherein, when the estimation means estimates that the image data input by the input means includes a falsified image, the display means displays the falsified part of the falsified image with emphasis.

The image processing system according to claim 2, wherein the display means displays the image indicated by the image data input by the input means together with the image in which the falsified part is emphasized.

The image processing system according to claim 2 or 3, further comprising:
reception means;
first determining means for determining that a designated pixel is a falsified pixel when the reception means receives a designation of the pixel by the user in the falsified image displayed by the display means; and
second determining means for determining that a designated pixel is not a falsified pixel when the reception means receives a designation of the pixel by the user in the falsified image displayed by the display means.

The image processing system according to any one of claims 2 to 4, wherein the display means displays information indicating that no falsification has been detected when the image data input by the input means is estimated not to include a falsified image.

The image processing system according to any one of claims 1 to 5, wherein the learning means generates a learning model by executing machine learning processing based on a falsified image indicated by image data input by the input means, an image before falsification indicated by image data input by the input means, and an image showing the difference between the falsified image and the image before falsification.

The image processing system according to claim 6, further comprising third determination means for determining, based on designated identification information, the learning model to be updated by the machine learning processing based on the falsified image indicated by the image data input by the input means and the image before falsification indicated by the image data input by the input means.

The image processing system according to claim 7, further comprising storage means for storing the learning model and the identification information in association with each other,
wherein the third determination means determines the learning model corresponding to the designated identification information as the learning model to be updated by executing the machine learning processing based on the image before falsification input by the input means.

The image processing system according to any one of claims 1 to 8, further comprising:
determination means for determining whether the image indicated by the image data input by the input means is a falsified image or an image before falsification; and
storage means for storing, in association with each other, image data whose image is determined by the determination means to be a falsified image and image data whose image is determined by the determination means to be an image before falsification.

The image processing system according to claim 9, wherein, based on an identifier included in the image data input by the input means, the storage means stores, in association with each other, the image data determined by the determination means to indicate a falsified image and the image data determined by the determination means to indicate an image before falsification.

The image processing system according to claim 10, wherein the identifier is a two-dimensional code.
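The identifier-based association in the two preceding claims can be sketched as grouping scanned images by identifier. This is a minimal illustration under stated assumptions: the identifier is taken to be a string already decoded from the two-dimensional code (decoding is not shown), and `ScannedImage` and `pair_by_identifier` are hypothetical names.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class ScannedImage(NamedTuple):
    identifier: str     # assumed: string decoded from the two-dimensional code
    is_falsified: bool  # result of the determination means
    data: bytes         # the image data itself


def pair_by_identifier(
    scans: Iterable[ScannedImage],
) -> Dict[str, Tuple[bytes, bytes]]:
    """Map each identifier to (image before falsification, falsified image)."""
    before: Dict[str, bytes] = {}
    falsified: Dict[str, bytes] = {}
    for scan in scans:
        target = falsified if scan.is_falsified else before
        target[scan.identifier] = scan.data
    # Keep only identifiers for which both images have been seen.
    return {
        ident: (before[ident], falsified[ident])
        for ident in before
        if ident in falsified
    }


scans = [
    ScannedImage("doc-1", False, b"original-1"),
    ScannedImage("doc-2", False, b"original-2"),
    ScannedImage("doc-1", True, b"tampered-1"),
]
pairs = pair_by_identifier(scans)
print(pairs)  # {'doc-1': (b'original-1', b'tampered-1')}
```

Because the pairing depends only on the identifier, the two scans of a document can arrive in any order and still be associated.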

The image processing system according to any one of claims 1 to 11, wherein
the image processing system comprises at least an image processing device, a learning device, and an estimation device,
the image processing device has the input means,
the learning device has the learning means, and
the estimation device has the estimation means.

The image processing system according to claim 12, wherein
the image processing device further has a transmission means for transmitting the image data input by the input means to the learning device,
the learning device further has a receiving means for receiving the image data transmitted from the image processing device by the transmission means, and
the learning means executes the machine learning process based on the image data received by the receiving means.

The image processing system according to claim 12 or 13, wherein
the image processing system comprises at least the image processing device, the learning device, the estimation device, and a character recognition device,
the character recognition device has a character recognition means for executing character recognition processing, and
the estimation means of the estimation device estimates whether or not the image data input by the input means includes a falsified image, based on the image data input by the image processing device, the learning model generated by the learning device, and the result of the character recognition processing output by the character recognition device.

An image processing method comprising:
a generation step of generating a learning model by executing machine learning processing based on a falsified image, an image before falsification, and an image showing the difference between the falsified image and the image before falsification;
an input step of inputting image data; and
an estimation step of estimating, using the learning model generated in the generation step, whether or not the image data input in the input step includes a falsified image.

A program for causing a computer to execute the image processing method according to claim 15.