JP2023081269A

JP2023081269A - Image forming device, method for controlling image forming device, and program

Info

Publication number: JP2023081269A
Application number: JP2022090350A
Authority: JP
Inventors: 諭池田; Satoshi Ikeda
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-11-30
Filing date: 2022-06-02
Publication date: 2023-06-09

Abstract

To provide an image forming device that, when a document contains a graph or a table, can estimate the presence of the graph or table with high accuracy regardless of its type, together with a method for controlling the image forming device and a program. SOLUTION: An image forming device (image processing device) 101 is a device for processing an image. The image forming device 101 comprises a scanner unit (reading means) 10 for reading a document, and an estimation processing unit (estimation processing means) 405 for estimating a chart area in the document read by the scanner unit 10 by using a trained model that has undergone machine learning for estimating a chart area, the chart area being at least one of a graph area containing a graph in the document and a table area containing a table. SELECTED DRAWING: Figure 6

Description

The present invention relates to an image forming apparatus, a method for controlling an image forming apparatus, and a program.

Multifunction peripherals that combine a printer function, a scan function, a facsimile function, a copy function, and the like are known as image forming apparatuses that form images on a recording material such as paper. A multifunction peripheral can read a color-printed document (here called the "primary document") with its scan function. When the read primary document is then output with the printer function, it may be printed in monochrome to reduce printing cost (the output is here called the "secondary document"). For example, if the primary document contains two bar graphs, one red and one blue, the two bar graphs differ only in shade in the secondary document and are harder to tell apart than in color printing. To address this problem, the image forming apparatus described in Patent Document 1 can apply an easily distinguishable pattern (hatching) to each bar graph even in monochrome printing. When detecting a graph in the primary document, the image forming apparatus described in Patent Document 1 detects a colored (filled) area as a graph if the shape of that area is a rectangle or a sector.

Patent Document 1: JP 2016-165026 A

However, the image forming apparatus described in Patent Document 1 erroneously detects every colored rectangular or sector-shaped area in the primary document as a graph. In addition, if the primary document contains a graph whose shape is neither a rectangle nor a sector, such as a line graph or a 3D graph, that graph cannot be detected.

An object of the present invention is to provide an image forming apparatus, a method for controlling an image forming apparatus, and a program that, when a document contains a graph or a table, can estimate the presence of the graph or table with high accuracy regardless of its type.

To achieve the above object, an image forming apparatus of the present invention is an image forming apparatus that processes an image, comprising: reading means for reading a document; and estimation processing means for estimating a chart area in the document read by the reading means by using a trained model that has undergone machine learning for estimating a chart area, the chart area being at least one of a graph area containing a graph in the document and a table area containing a table.

According to the present invention, when a document contains a graph or a table, the presence of the graph or table can be estimated with high accuracy regardless of its type.

A diagram showing the overall configuration of the image forming system.
A block diagram showing the hardware configuration of the image forming apparatus.
A block diagram showing the hardware configuration of the machine learning server.
A block diagram showing the software configuration of the image forming system.
A conceptual diagram showing the input/output structure of the learning model (graph-related).
A conceptual diagram showing the input/output structure of the learning model (table-related).
A conceptual diagram showing the input/output structure of the learning model.
A diagram for explaining the operations (processing flow) performed in the image forming system.
A diagram showing a configuration example of a document read by the scanner unit and an estimation result (graph-related).
A diagram showing a configuration example of a document read by the scanner unit and an estimation result (table-related).
A table corresponding to FIG. 7A.
A table corresponding to FIG. 7B.
A flowchart showing the flow of a scan-related job.
A flowchart showing the flow of processing by the machine learning unit and the estimation processing unit.
A diagram showing a display example on the UI display unit.

Embodiments of the present invention are described in detail below with reference to the drawings. The configurations described in the following embodiments are merely examples, and the scope of the present invention is not limited by them.

<Overall Configuration of Image Forming System>
FIG. 1 is a diagram showing the overall configuration of the image forming system. As shown in FIG. 1, an image forming system (image processing system) 1000 has an image forming apparatus (image processing apparatus) 101, a machine learning server 102, a general-purpose computer 103, and a data server 105. The image forming apparatus 101, the machine learning server 102, the general-purpose computer 103, and the data server 105 are communicably connected to one another via a network 104 such as a wired LAN. The image forming apparatus 101 is, for example, a printer, a multifunction peripheral, or a facsimile machine, and can form an image on a recording material such as paper. The image forming apparatus 101 is equipped with an AI function and can receive a trained model from the machine learning server 102 at any time to implement the AI function. The machine learning server 102 receives the learning data needed to train the model that implements the AI function, mainly from the data server 105. The machine learning server 102 can also receive learning data from devices other than the data server 105. The data server 105 collects the learning data used for machine learning in the machine learning server 102 from external devices such as the image forming apparatus 101 and the general-purpose computer 103, and transmits it to the machine learning server 102. The general-purpose computer 103 is, for example, a personal computer, and performs operations such as transmitting print data to the image forming apparatus 101.

In the image forming system 1000, the data server 105 collects the data of documents read by the image forming apparatus 101, and the machine learning server 102 performs machine learning on that data to generate a learning model. This learning model, which is loaded from the machine learning server 102, is used to estimate whether a document contains a graph or a table. The image forming apparatus 101 has an AI function that makes use of the learning model. With this image forming system 1000, it is possible to determine whether a document read by the image forming apparatus 101 contains a graph or a table and to prompt the user to take subsequent action (for example, changing the document data).

<Hardware Configuration of Image Forming Apparatus (Controller Configuration of Image Forming Apparatus)>
FIG. 2 is a block diagram showing the hardware configuration of the image forming apparatus. As shown in FIG. 2, the image forming apparatus 101 has an operation unit (operation means) 140, a scanner unit (reading means) 10, and a printer unit (printing means) 20. The operation unit 140 has, for example, a liquid crystal display equipped with a multi-touch sensor, and accepts various operations from the user of the image forming apparatus 101. The scanner unit 10 operates in response to user operations on the operation unit 140 and can read an image (image information) formed on a document (reading step). The scanner unit 10 has a CPU that controls the scanner unit 10, an illumination lamp that emits light when a document is read, a scanning mirror that reflects the light, and the like. The printer unit 20 can print an image on paper, producing either a color-printed or a monochrome-printed document. The printer unit 20 has a CPU that controls the printer unit 20, a photosensitive drum that forms an image on paper, a fixing device that fixes the image on the paper, and the like.

The image forming apparatus 101 also has a controller 1200. The controller 1200 is communicably connected to the scanner unit 10, the printer unit 20, the network 104, a wireless LAN 106, and a public line (WAN) 3001, and controls the overall operation of the image forming apparatus 101. The controller 1200 has a raster image processor (RIP) 1260, a scanner image processing unit 1280, a printer image processing unit 1290, an image rotation unit 1230, an image compression unit 1240, a device I/F 1220, and an image bus 2008. The raster image processor (RIP) 1260 rasterizes the PDL code contained in a print job received from the general-purpose computer 103 via the network 104 into a bitmap image. The scanner image processing unit 1280 corrects, processes, and edits image data input from the scanner unit 10. The printer image processing unit 1290 performs correction, resolution conversion, and the like on image data to be output (printed) by the printer unit 20. The image rotation unit 1230 rotates image data. The image compression unit 1240 performs JPEG compression and decompression on multilevel image data, and JBIG, MMR, or MH compression and decompression on binary image data. The device I/F 1220 connects the scanner unit 10 and the printer unit 20 to the controller 1200 and converts image data between synchronous and asynchronous systems. The image bus 2008 connects these components to one another and transfers image data at high speed.
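The codec choice described for the image compression unit 1240 can be sketched as follows. This is a minimal illustration rather than the apparatus's actual firmware: the function name `select_codec`, the `bits_per_pixel` parameter, and the default binary codec are assumptions introduced here. The text states only that multilevel image data uses JPEG and binary image data uses JBIG, MMR, or MH.

```python
BINARY_CODECS = ("JBIG", "MMR", "MH")


def select_codec(bits_per_pixel, preferred_binary="JBIG"):
    """Return the codec name for an image with the given bit depth.

    Hypothetical helper: 1 bit per pixel is treated as binary image data,
    anything deeper as multilevel image data.
    """
    if bits_per_pixel < 1:
        raise ValueError("bits_per_pixel must be at least 1")
    if bits_per_pixel == 1:
        if preferred_binary not in BINARY_CODECS:
            raise ValueError(f"unsupported binary codec: {preferred_binary}")
        return preferred_binary
    return "JPEG"
```

How the apparatus picks among JBIG, MMR, and MH is not stated, so the sketch leaves that choice to the caller.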

The controller 1200 also has a CPU 1201, a RAM 1202, an operation unit I/F 1206, a network unit 1210, a modem unit 1211, and a wireless communication I/F 1270. The CPU 1201 is a control unit that controls the image forming apparatus 101 as a whole. The RAM 1202 is the system work memory in which the CPU 1201 operates, and it also serves as image memory for temporarily storing image data. The operation unit I/F 1206 outputs image data to be displayed on the operation unit 140 to the operation unit 140. The operation unit I/F 1206 also conveys information entered by the user on the operation unit 140 to the CPU 1201. The network unit 1210 is connected to the network 104 and communicates (transmits and receives) with the general-purpose computer 103 and other computer terminals (not shown) on the network 104. The modem unit 1211 is connected to the public line 3001 and exchanges data with an external facsimile machine (not shown). The wireless communication I/F 1270 connects to external terminals via the wireless LAN 106.

The controller 1200 also has a ROM 1203, an HDD (hard disk drive) 1204, a system bus 1207, an internal communication I/F 1208, an image bus 1212, and an ImageBus I/F 1205. The ROM 1203 stores programs, for example a program that causes the CPU 1201 (a computer) to execute the operations of the units and means of the image forming apparatus 101 (the method for controlling the image forming apparatus). The hard disk drive (HDD) 1204 stores system software, image data, software counter values, and the like. The controller 1200 records and manages the history of each print or copy job, including the user name, the number of copies, whether color printing was used, and other output attribute information, as job log information in the HDD 1204 or the RAM 1202. The internal communication I/F 1208 communicates with the scanner unit 10 and the printer unit 20. The ImageBus I/F 1205 connects the system bus 1207 and the image bus 1212 and functions as a bus bridge that converts data structures.
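The job log information described above could be modeled roughly as below. The field names and the `JobLog` class are hypothetical; the text names only the kinds of information recorded (user name, number of copies, color printing, and other output attributes) and does not describe a storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobLogEntry:
    """One history record for a print or copy job (hypothetical layout)."""
    user_name: str
    job_type: str   # e.g. "print" or "copy"
    copies: int
    color: bool     # True for color output, False for monochrome


class JobLog:
    """In-memory stand-in for the log kept in the HDD 1204 or RAM 1202."""

    def __init__(self):
        self._entries = []

    def record(self, entry):
        self._entries.append(entry)

    def entries_for(self, user_name):
        return [e for e in self._entries if e.user_name == user_name]


log = JobLog()
log.record(JobLogEntry("alice", "copy", copies=3, color=False))
log.record(JobLogEntry("bob", "print", copies=1, color=True))
```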

The controller 1200 also has a GPU 1291. The GPU 1291 can compute efficiently by processing large amounts of data in parallel. It is therefore effective to use the GPU 1291 when training is repeated many times with a learning model, as in deep learning. In this embodiment, the processing of a machine learning unit 414, described later, uses the GPU 1291 in addition to the CPU 1201. Specifically, when a learning program that includes a learning model is executed, the CPU 1201 and the GPU 1291 perform the training computations in cooperation. The processing of the machine learning unit 414 may instead be performed by the CPU 1201 alone or the GPU 1291 alone. As with the machine learning unit 414, the GPU 1291 may also be used for the processing of an estimation processing unit (estimation processing means) 405, described later.

<Hardware Configuration of Machine Learning Server>
FIG. 3 is a block diagram showing the hardware configuration of the machine learning server. As shown in FIG. 3, the machine learning server 102 has a CPU 1301, a RAM 1302, a ROM 1303, an HDD (hard disk drive) 1304, a network unit 1310, an IO unit 1305, and a GPU 1306. These are connected to one another via a system bus 1207. The CPU 1301 provides various functions by reading programs such as an OS (operating system) and application software from the HDD 1304 and executing them. The RAM 1302 is the system work memory used when the CPU 1301 executes programs. The ROM 1303 stores a BIOS (Basic Input Output System), programs for starting the OS, and setting files. The HDD 1304 is a hard disk drive that stores system software and the like. The network unit 1310 is connected to the network 104 and communicates (transmits and receives) with external devices such as the image forming apparatus 101. The IO unit 1305, together with an input/output device such as a liquid crystal display having a multi-touch sensor, constitutes an operation unit (not shown), and functions as an interface for inputting information to and outputting information from that operation unit. The operation unit renders predetermined information at a predetermined resolution, number of colors, and so on, based on screen information specified by a program. For example, a GUI (graphical user interface) screen is formed on the operation unit, and the windows and data needed for operation are displayed.

The GPU 1306 can compute efficiently by processing large amounts of data in parallel. It is therefore effective to use the GPU 1306 when training is repeated many times with a learning model, as in deep learning. In this embodiment, the processing of the machine learning unit 414 uses the GPU 1306 in addition to the CPU 1301. Specifically, when a learning program that includes a learning model is executed, the CPU 1301 and the GPU 1306 perform the training computations in cooperation. The processing of the machine learning unit 414 may instead be performed by the CPU 1301 alone or the GPU 1306 alone. As with the machine learning unit 414, the GPU 1306 may also be used for the processing of the estimation processing unit 405. The division of work between the GPU 1306 and the GPU 1291 of the image forming apparatus 101 is as follows. The image forming system 1000 is designed to use GPU computing resources effectively according to the load imposed by communication over the network 104 and by GPU processing, the power saving mode of the image forming apparatus 101, and the like. For example, when the image forming apparatus 101 shifts to the power saving mode, the GPU 1306 on the machine learning server 102 side is used preferentially.
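One way to picture this division of work is the small selection routine below. It is only a sketch under stated assumptions: the function `choose_gpu`, its parameters, the 0.8 load threshold, and the fallback to the local GPU when the network is unavailable are all hypothetical. The text specifies only that the server-side GPU is preferred when the image forming apparatus 101 is in the power saving mode, and that network and GPU load are taken into account.

```python
LOCAL_GPU = "GPU 1291 (image forming apparatus 101)"
SERVER_GPU = "GPU 1306 (machine learning server 102)"


def choose_gpu(power_saving, network_available, local_gpu_load,
               load_threshold=0.8):
    """Pick which GPU should run a learning or estimation task.

    All thresholds and the ordering of the checks are illustrative
    assumptions, not behavior stated in the source.
    """
    if not network_available:
        # Assumed fallback: the server cannot be reached, so stay local.
        return LOCAL_GPU
    if power_saving:
        # Stated in the text: prefer the server-side GPU in power saving mode.
        return SERVER_GPU
    if local_gpu_load >= load_threshold:
        # Assumed: offload when the local GPU is already busy.
        return SERVER_GPU
    return LOCAL_GPU
```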

<Software Configuration of Image Forming System>
FIG. 4 is a block diagram showing the software configuration of the image forming system. The software configuration 400 shown in FIG. 4 is implemented using the hardware of the image forming apparatus 101 (see FIG. 2) and programs. The programs that implement the software configuration 400 are stored in storage for each component, read into RAM, and executed by a CPU. In the image forming apparatus 101, for example, the programs are stored in the HDD 1204, read into the RAM 1202, and executed by the CPU 1201. The same applies to the machine learning server 102 and the data server 105. The software configuration 400 enables the image forming system 1000 to estimate the presence or absence of graphs and tables using learning data read by the image forming apparatus 101.

As shown in FIG. 4, the software of the image forming apparatus 101 consists of a UI display unit (display means) 401, a data storage unit (storage means) 402, a JOB control unit 403, an image reading unit 404, and the estimation processing unit 405. The data storage unit 402 records various data, such as image data, learning data, and learning models, in the RAM 1202 and the HDD 1204 of the image forming apparatus 101. The JOB control unit 403 plays the central role in executing the basic functions of the image forming apparatus 101, such as copying, faxing, and printing, based on user instructions given through the operation unit 140. In addition to executing the basic functions, the JOB control unit 403 also plays the central role in issuing instructions to, and exchanging data among, the other software components involved in executing them. The UI display unit 401 displays, on the screen of the operation unit 140, a notification screen for presenting messages to the user. Examples of what is presented include prompts for accepting operation settings from the user, the corresponding operation reception screen, and the results of estimation by the estimation processing unit 405. The image reading unit 404 causes the scanner unit 10 to optically read a document when copying or scanning is executed under instructions from the JOB control unit 403. The estimation processing unit 405 is executed by the CPU 1201 and the GPU 1291 of the image forming apparatus 101. It performs the estimation processing, classification processing, and the like that implement the AI function on data that the image forming apparatus 101 inputs and outputs. This estimation processing is performed under instructions from the JOB control unit 403. The result from the estimation processing unit 405 is sent to the JOB control unit 403 and displayed as a message on the UI display unit 401, so that the user can see the estimation result.
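The interaction among the JOB control unit 403, the image reading unit 404, the estimation processing unit 405, and the UI display unit 401 can be sketched as a single function that wires three callables together. The function name `run_scan_job`, the shape of the estimation result, and the message wording are assumptions made for illustration; the stand-in lambdas below replace the real units.

```python
def run_scan_job(read_document, estimate, show_message):
    """Hypothetical job-control flow: read, estimate, then notify the user."""
    image = read_document()      # role of the image reading unit 404
    result = estimate(image)     # role of the estimation processing unit 405
    found = [kind for kind in ("graph", "table") if result.get(kind)]
    if found:
        show_message("Detected in document: " + ", ".join(found))
    else:
        show_message("No graph or table detected")
    return result


# Stand-ins for the real units, used only to exercise the flow.
messages = []
result = run_scan_job(
    read_document=lambda: "scanned-page",
    estimate=lambda image: {"graph": True, "table": False},
    show_message=messages.append,  # role of the UI display unit 401
)
```

Passing the units in as callables keeps the control flow separate from the hardware, which mirrors the coordinating role the text assigns to the JOB control unit 403.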

The software of the data server 105 consists of a data collection/provision unit 410 and a data storage unit 412. The data collection/provision unit 410 collects and provides the learning data used for machine learning in the machine learning server 102. Specifically, in the image forming system 1000, the image forming apparatus 101 transmits learning data that includes information on operations performed on the image forming apparatus 101. The data collection/provision unit 410 receives this learning data from the image forming apparatus 101 and provides it to the machine learning server 102. Learning data may also be collected from image forming apparatuses other than the image forming apparatus 101, from the general-purpose computer 103, or from data servers other than the data server 105. This makes it possible to collect the data needed for the intended machine learning. The data storage unit 412 records and manages the learning data collected by the data collection/provision unit 410.

The software of the machine learning server 102 consists of a learning data generation unit 413, the machine learning unit 414, and a data storage unit 415. The learning data generation unit 413 receives various data, including learning data, from the data server 105. It then processes this data into a form that yields effective learning results, for example by removing unnecessary data that would act as noise in machine learning. In this way, the learning data generation unit 413 optimizes the learning data. The learning data generation unit 413 is executed by the CPU 1301 of the machine learning server 102. The data storage unit 415 temporarily records the data received from the data server 105, the learning data generated by the learning data generation unit 413, and the trained model from the machine learning unit 414 in the RAM 1302 and the HDD 1304 of the machine learning server 102. The machine learning unit 414 takes the learning data generated by the learning data generation unit 413 as input and performs machine learning using the GPU 1306 and the CPU 1301, which are hardware resources of the machine learning server 102, and the learning model W shown in FIG. 5A (and likewise in FIG. 5B).
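The clean-up performed by the learning data generation unit 413 might look like the following sketch. The text does not define what counts as noise, so the rules used here (dropping samples with a missing image, an unknown label, or an exact duplicate) and the `image`/`label` keys are assumptions chosen purely for illustration.

```python
VALID_LABELS = ("graph", "table", "none")


def prepare_learning_data(samples):
    """Return the samples that are usable for training.

    Each sample is a dict with hypothetical keys 'image' and 'label'.
    """
    seen = set()
    cleaned = []
    for sample in samples:
        image = sample.get("image")
        label = sample.get("label")
        if not image or label not in VALID_LABELS:
            continue  # missing image or unknown label: treated as noise
        key = (image, label)
        if key in seen:
            continue  # exact duplicate of an earlier sample
        seen.add(key)
        cleaned.append(sample)
    return cleaned


raw = [
    {"image": "page-1", "label": "graph"},
    {"image": "page-1", "label": "graph"},   # duplicate
    {"image": "", "label": "table"},         # empty image
    {"image": "page-2", "label": "photo"},   # label outside the task
    {"image": "page-3", "label": "table"},
]
cleaned = prepare_learning_data(raw)
```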

<Learning Model>
FIGS. 5A, 5B, and 5C are conceptual diagrams each showing the input/output structure of the learning model. As described above, a document is read by the scanner unit 10. The learning model W shown in FIG. 5A is adjusted through a learning step, described later, to become a trained model, that is, a model that has undergone machine learning for estimating a chart area in the document (document data A) read by the scanner unit 10. A chart area is at least one of a graph area containing a graph and a table area containing a table. The learning model W is, for example, a learning model that uses a neural network whose parameters are adjusted by error backpropagation or a similar method. The learning model W can thereby perform deep learning, in which it generates by itself the various parameters used for learning, such as feature values and weights (connection weighting coefficients). The machine learning is not limited to deep learning and may use any machine learning algorithm, such as a support vector machine, logistic regression, a decision tree, the nearest neighbor method, or the naive Bayes method. The learning model W may also be a model that does not use a neural network.
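As a concrete but deliberately tiny illustration of parameter adjustment, the sketch below trains a logistic regression classifier, one of the alternative algorithms the text lists, by plain gradient descent. It is not the learning model W: the two input features, the toy data, the learning rate, and the epoch count are all made up for the example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Estimated probability that feature vector x belongs to a graph area."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=2000, learning_rate=0.5):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(samples, labels):
            error = predict(weights, bias, x) - y  # dLoss/dz for this sample
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


# Made-up toy data: [has_axis, has_legend] -> 1 if the region is a graph.
samples = [[1, 1], [1, 0], [0, 0], [0, 1]]
labels = [1, 1, 0, 0]
weights, bias = train(samples, labels)
```

After training, `predict(weights, bias, [1, 1])` lies above 0.5 and `predict(weights, bias, [0, 0])` lies below it. A neural network trained by backpropagation applies the same kind of gradient update layer by layer.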

The learning model W (trained model) takes as input data the document in which a graph area or table area is to be estimated, that is, information about the graphs and tables in the document data A. In this embodiment, as shown in FIG. 5A, the input data for graphs consists of a data element X0, an axis X1, a figure number X2, a title X3, a legend X4, a scale line X5, and body text X6. As shown in FIG. 5B, the input data for tables consists of a data element X10, a table number X11, a cell X12, a ruled line X13, a row X14, a column X15, and body text X16.
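The two groups of inputs can be written down as simple records, as sketched below. Encoding each item as a present/absent flag and flattening it into a numeric vector are assumptions made here for illustration; the text names the items X0 to X6 and X10 to X16 but does not say how they are encoded for the model.

```python
from dataclasses import dataclass, fields


@dataclass
class GraphFeatures:
    data_element: bool   # X0: the graph itself
    axis: bool           # X1
    figure_number: bool  # X2
    title: bool          # X3
    legend: bool         # X4
    scale_line: bool     # X5
    body_text: bool      # X6: text containing the noun "graph"


@dataclass
class TableFeatures:
    data_element: bool   # X10: the table itself
    table_number: bool   # X11
    cell: bool           # X12
    ruled_line: bool     # X13
    row: bool            # X14
    column: bool         # X15
    body_text: bool      # X16: text containing the noun "table"


def to_vector(features):
    """Flatten a feature record into a list of 0.0/1.0 values (assumed encoding)."""
    return [1.0 if getattr(features, f.name) else 0.0 for f in fields(features)]


bar_chart = GraphFeatures(True, True, True, True, True, True, False)
vector = to_vector(bar_chart)
```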

データ要素Ｘ０は、グラフ自体であり、本実施形態では、グラフ領域には、複数のグラフが含まれている。例えば、図５Ａでは、２月分のグラフとして、３本の棒グラフがあり、３月分のグラフとして、３本の棒グラフがあり、４月分のグラフとして、３本の棒グラフがある。また、原稿（原稿データＡ）では、少なくともグラフがカラー印刷されている。この場合、各月の３本の棒グラフは、それぞれ、異なる色が施されている。また、各月の左側に位置する棒グラフ同士、中央に位置する棒グラフ同士、右側に位置する棒グラフ同士は、それぞれ、同じ色が施されている。例えば、各月の左側に位置する棒グラフ同士は、青色、中央に位置する棒グラフ同士は、赤色、右側に位置する棒グラフ同士は、グレー色が施されている。軸Ｘ１は、縦軸や横軸である。また、軸Ｘ１には、各軸が何を表すかのラベル、各軸の単位等も含まれていてもよい。図番号Ｘ２は、グラフの通し番号、グラフの名称である。タイトルＸ３は、グラフの名称である。凡例Ｘ４は、各グラフが何を表すグラフであるのかを説明する項目である。図５Ａでは、凡例Ｘ４は、各月の３本の棒グラフがそれぞれ何を表すグラフであるのかを説明しており、左側のグラフが「ａ」、中央のグラフが「ｂ」、右側のグラフが「ｃ」であることを表している。目盛り線Ｘ５は、グラフの量を示す目盛り線である。本文Ｘ６は、名詞「グラフ」が含まれる文章である。 Data element X0 is the graph itself; in this embodiment, the graph area contains multiple graphs. In FIG. 5A, for example, there are three bar graphs for February, three for March, and three for April. In the document (document data A), at least the graph is printed in color. In this case, the three bar graphs for each month are each colored differently, while the left-hand bar graphs of the months share one color, as do the center bar graphs and the right-hand bar graphs. For example, the left-hand bar graphs are blue, the center bar graphs are red, and the right-hand bar graphs are gray. Axis X1 is a vertical or horizontal axis, and may also include a label indicating what each axis represents, the unit of each axis, and the like. Figure number X2 is the serial number and name of the graph. Title X3 is the name of the graph. Legend X4 is an item explaining what each graph represents. In FIG. 5A, the legend X4 explains what each of the three bar graphs for each month represents: the left graph is "a", the center graph is "b", and the right graph is "c". Scale line X5 is a scale line indicating the quantity of the graph. Text X6 is a sentence that includes the noun "graph".

データ要素Ｘ１０は、表自体であり、表領域には、本実施形態では１つの表が含まれているが、これに限定されず、複数の表が含まれていてもよい。また、表は、グラフ同様にカラー印刷で表示されている。この場合、複数のセルＸ１２に色が施されている。セルＸ１２は、表のデータ要素を表す要素である。２月の「ｅ」、３月の「ｈ」および４月の「ｈ」の各セルＸ１２と、２月の「ｇ」、３月の「ｆ」および４月の「ｅ」の各セルＸ１２とは、互いに異なる色が施されている。例えば、２月の「ｅ」、３月の「ｈ」、４月の「ｈ」の各セルＸ１２は橙色が施され、２月の「ｇ」、３月の「ｆ」、４月の「ｅ」の各セルＸ１２は緑色が施されている。図５Ｂ中、２月の「ｅ」、３月の「ｈ」、４月の「ｈ」の各セルＸ１２には、それぞれの月の最小出荷数が示され、２月の「ｇ」、３月の「ｆ」、４月の「ｅ」の各セルＸ１２には、最大出荷数が示されている。表番号Ｘ１１は、表の通し番号および表の名称である。罫線Ｘ１３は、表の各セルＸ１２を構成する区切りを示す。行Ｘ１４、列Ｘ１５は、表のデータ要素を縦横に並べたものを示す。本文Ｘ１６は、名詞「表」が含まれる文章である。 The data element X10 is the table itself, and the table area includes one table in this embodiment, but is not limited to this and may include multiple tables. Also, the table is displayed in color printing in the same manner as the graph. In this case, a plurality of cells X12 are colored. Cell X12 is an element representing a data element of the table. Each cell X12 of "e" in February, "h" in March and "h" in April, and each cell X12 of "g" in February, "f" in March and "e" in April are colored differently from each other. For example, each cell X12 of "e" in February, "h" in March, and "h" in April is colored orange, and "g" in February, "f" in March, and "h" in April. Each cell X12 of "e" is colored green. In FIG. 5B, each cell X12 of "e" in February, "h" in March, and "h" in April indicates the minimum number of shipments for each month, and "g" in February, 3 The maximum number of shipments is shown in each cell X12 of "f" for month and "e" for April. The table number X11 is the serial number of the table and the name of the table. A ruled line X13 indicates a break that constitutes each cell X12 of the table. Row X14 and column X15 indicate the data elements of the table arranged vertically and horizontally. Text X16 is a sentence that includes the noun "table".

また、学習モデルＷ（学習済みモデル）は、図表領域としてのグラフ領域や表領域の有無を出力データとする。出力データとしては、例えば本実施形態では、グラフや表が有る場合には「出力データＹ１」とし、グラフや表が無い場合には「出力データＹ２」とする。このような構成の学習モデルＷにより、グラフ領域や表領域をより正確に推定することができる、すなわち、グラフ領域や表領域の推定正解率が向上する。そして、画像形成装置１０１の推定処理部４０５は、学習モデルＷ（学習済みモデル）を用いて、スキャナ部１０で読み取られた原稿内のグラフ領域や表領域を推定することができる（推定処理工程）。 Also, the learning model W (learned model) uses the presence/absence of graph areas and table areas as figure/table areas as output data. For example, in this embodiment, the output data is "output data Y1" when there is a graph or table, and "output data Y2" when there is no graph or table. The learning model W having such a configuration enables more accurate estimation of the graph area and the table area, that is, the estimation accuracy rate of the graph area and the table area is improved. Using the learning model W (learned model), the estimation processing unit 405 of the image forming apparatus 101 can estimate the graph area and the table area in the document read by the scanner unit 10 (estimation processing step ).
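The input and output structure described above can be pictured with a small sketch. It assumes, purely for illustration, that each graph-related input item (X0 to X6) is reduced to a present/absent flag and that the model returns a probability that a chart area exists; the text does not state how the inputs are actually encoded, and the names `GRAPH_FEATURES`, `encode`, and `to_output_label` are hypothetical.

```python
# Hypothetical encoding of the graph-related inputs X0..X6 as 0/1 flags.
GRAPH_FEATURES = ("X0", "X1", "X2", "X3", "X4", "X5", "X6")


def encode(present):
    """Return a 0/1 vector ordered as GRAPH_FEATURES for the detected items."""
    return [1.0 if name in present else 0.0 for name in GRAPH_FEATURES]


def to_output_label(prob_chart_present, threshold=0.5):
    """Map an estimated probability to Y1 (chart present) or Y2 (absent).

    The 0.5 threshold is an assumption, not a value taken from the text.
    """
    return "Y1" if prob_chart_present >= threshold else "Y2"
```

For example, a document in which only the data element and the legend are detected would be encoded as `[1, 0, 0, 0, 1, 0, 0]`.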

図５Ｃに示す構造では、学習モデルＷは、「正解値が既知の入力データ」と「正解値」とをセットにした（組にした）学習データが多数用意されている（Ｂ１参照）。この学習モデルＷは、誤差検出部と更新部とを有していてもよい。誤差検出部は、入力層に入力される入力データＸ（Ｂ２参照）に応じて、出力層から出力される出力データＹ（Ｂ３参照）を演算して、当該出力データＹ（Ｂ４参照）を出力する。そして、誤差検出部は、出力データＹと教師データＴとの誤差を得る。誤差検出部は、損失関数を用いて、出力データＹと教師データＴとの誤差を表す損失Ｌを計算する（Ｂ５参照）。更新部は、誤差検出部で得られた損失Ｌに基づいて、その損失が小さくなるように、ニューラルネットワーク（学習モデルＷ）のノード間の結合重み付け係数等を更新する（Ｂ６参照）。この更新部は、例えば、誤差逆伝播法を用いて、結合重み付け係数等を更新する。誤差逆伝播法は、上記の誤差が小さくなるように、ニューラルネットワークのノード間の結合重み付け係数等を調整する方法である。 In the structure shown in FIG. 5C, a large number of learning data sets, each pairing "input data with a known correct value" with its "correct value", are prepared for the learning model W (see B1). The learning model W may have an error detection unit and an update unit. The error detection unit computes the output data Y (see B3) that is output from the output layer in response to the input data X (see B2) supplied to the input layer, and outputs that output data Y (see B4). The error detection unit then obtains the error between the output data Y and the teacher data T: using a loss function, it calculates a loss L representing that error (see B5). Based on the loss L obtained by the error detection unit, the update unit updates the connection weighting coefficients and the like between the nodes of the neural network (learning model W) so that the loss becomes smaller (see B6). The update unit updates the connection weighting coefficients and the like using, for example, the error backpropagation method, which adjusts the connection weighting coefficients and the like between the nodes of the neural network so as to reduce the above error.

前述したように、学習モデルＷは、「正解値が既知の入力データ」と「正解値」をセットにした学習データが多数用意されている。そして、学習モデルＷでは、この正解値に対応する入力データＸを入力した場合の出力データＹが正解値に極力近づくように、学習モデルＷ内の重み付け係数が調整される。これにより、正解の精度の高い学習モデルＷが得られる。このような学習モデルＷを得る工程を「学習工程」と言い、学習工程を経て調整された学習モデルを「学習済みモデル」と言う。また、用意する教師データ、すなわち、「正解値が既知の入力データ」と「正解値」とのセットは、例えば、以下のようなものとすることができる。 As described above, the learning model W is prepared with a large number of learning data sets of "input data with known correct values" and "correct values". Then, in the learning model W, the weighting coefficients in the learning model W are adjusted so that the output data Y when inputting the input data X corresponding to this correct value approaches the correct value as much as possible. As a result, a learning model W with a high accuracy of correct answer is obtained. The process of obtaining such a learning model W is called a "learning process", and the learning model adjusted through the learning process is called a "learned model". Also, the training data to be prepared, that is, the set of "input data with known correct values" and "correct values" can be, for example, as follows.

正解値が既知の入力データＸ：グラフや表が存在する原稿のデータにおけるグラフや表に関する情報。この情報としては、特に限定されず、例えば、前述した軸Ｘ１、図番号Ｘ２、タイトルＸ３、凡例Ｘ４、目盛り線Ｘ５、本文Ｘ６等が挙げられる。
期待値（Ｔ）：Ａ（グラフや表あり）＝１、Ｂ（グラフや表なし）＝０
期待値（Ｔ）とは、「正解値が既知の入力データ」を入力した場合の「正解値」を示す出力データＹの値である。期待値（Ｔ）は、グラフや表が有る場合には「出力データＹ１」とし、グラフや表が無い場合には「出力データＹ２」とする。各教師データＴの入力データＸを入力し算出された出力データＹと、期待値（Ｔ）とを所定の損失関数に従って、損失Ｌを求める。本実施形態での損失関数は、「損失Ｌ＝１－出力データＹの推定確率」とする。この損失Ｌが０に近づくように、学習モデルＷの各層間の重み付けを調整する。そして、この調整を行った学習モデルＷを学習済みモデルとして、当該学習済みモデルを機械学習部４１４に実装する。 Input data X for which correct values are known: Information about graphs and tables in manuscript data in which graphs and tables exist. This information is not particularly limited, and includes, for example, the aforementioned axis X1, figure number X2, title X3, legend X4, scale line X5, text X6, and the like.
Expected value (T): A (graph or table present) = 1, B (no graph or table) = 0
The expected value (T) is the value of the output data Y indicating the "correct value" when "the input data whose correct value is known" is input. The expected value (T) is "output data Y1" when there is a graph or table, and "output data Y2" when there is no graph or table. A loss L is obtained from the output data Y calculated by inputting the input data X of each teacher data T and the expected value (T) according to a predetermined loss function. The loss function in this embodiment is assumed to be "loss L=1-estimated probability of output data Y". The weighting between each layer of the learning model W is adjusted so that the loss L approaches zero. Then, the adjusted learning model W is used as a learned model, and the learned model is implemented in the machine learning unit 414 .
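The loss and weight adjustment described here can be illustrated with a minimal sketch. It reads "estimated probability of output data Y" as the probability the model assigns to the correct output, and it stands in a single-layer logistic model for the neural network; both are assumptions made for illustration, as are the function names and the learning rate.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Estimated probability that a chart area is present (output Y1)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def loss(p_present, t):
    """L = 1 - estimated probability of the correct output (t is 1 or 0)."""
    p_correct = p_present if t == 1 else 1.0 - p_present
    return 1.0 - p_correct


def train_step(weights, bias, x, t, lr=0.5):
    """One gradient-descent update that moves the loss L toward 0."""
    p = predict(weights, bias, x)
    # L = 1 - p when t == 1 and L = p when t == 0, so dL/dp is -1 or +1;
    # the sigmoid contributes dp/dz = p * (1 - p).
    grad_z = (-1.0 if t == 1 else 1.0) * p * (1.0 - p)
    new_weights = [w - lr * grad_z * xi for w, xi in zip(weights, x)]
    new_bias = bias - lr * grad_z
    return new_weights, new_bias
```

Repeating `train_step` over many (input data X, expected value T) pairs plays the role of the learning process; a multi-layer network would propagate the same gradient back through each layer.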

＜画像形成システムで行われる動作（処理の流れ）＞
図６は、画像形成システムで行われる動作（処理の流れ）を説明するための図である。図６に示すように、画像形成システム１０００では、動作Ｉ～動作Ｖが順に実行される。
Ｉ：画像形成装置１０１は、操作部１４０を介してユーザから原稿に対するスキャン動作を受け付けると、そのスキャン動作後に、機械学習サーバ１０２に対して、スキャンデータを送信する。また、画像形成装置１０１は、この送信とともに、スキャンデータに対するグラフ領域や表領域（図表領域）推定を要求する。
II：動作Ｉでグラフ領域や表領域推定が要求された機械学習サーバ１０２は、学習モデルにより、スキャンデータ内のグラフ領域や表領域の有無を推定する。なお、動作IIでは、画像形成装置１０１が機械学習サーバ１０２から学習モデルを受信して、当該学習モデルにより、スキャンデータ内のグラフ領域や表領域の有無を推定してもよい。
III：機械学習サーバ１０２は、動作IIで得られたグラフ領域や表領域の有無の推定結果を画像形成装置１０１に送信する。
ＩＶ：画像形成装置１０１は、動作IIIで機械学習サーバ１０２から送信された推定結果を有する原稿の中のグラフや表について、モノクロ印刷した場合に、そのグラフや表の識別性（視認性）を維持可能な修正例（サンプル画像）を操作部１４０に表示する。これにより、ユーザに対し、修正例を提案することができる。
Ｖ：画像形成装置１０１は、グラフや表を修正例どおりに修正する場合には、その修正例のデータを保存する。 <Operations (Processing Flow) Performed by the Image Forming System>
FIG. 6 is a diagram for explaining the operation (process flow) performed in the image forming system. As shown in FIG. 6, in the image forming system 1000, operations I to V are executed in order.
I: When the image forming apparatus 101 receives, via the operation unit 140, a scan operation for a document from the user, it transmits the scan data to the machine learning server 102 after the scan operation. Along with this transmission, the image forming apparatus 101 requests estimation of the graph area and table area (chart area) for the scan data.
II: The machine learning server 102 that has been requested to estimate a graph area or a table area in operation I estimates the presence or absence of a graph area or table area in the scan data using a learning model. Note that in operation II, the image forming apparatus 101 may receive a learning model from the machine learning server 102 and use the learning model to estimate whether there is a graph area or a table area in the scan data.
III: The machine learning server 102 transmits to the image forming apparatus 101 the result of estimating the presence/absence of the graph area and table area obtained in operation II.
IV: For the graphs and tables in the document to which the estimation result transmitted from the machine learning server 102 in operation III applies, the image forming apparatus 101 displays on the operation unit 140 a correction example (sample image) that can maintain the identifiability (visibility) of those graphs and tables when they are printed in monochrome. This makes it possible to propose a correction example to the user.
V: When the image forming apparatus 101 corrects the graph or table according to the correction example, it saves the data of the correction example.

図６に示す動作Ｉ～動作Ｖが順に実行されることにより、スキャンデータ内のグラフや表の有無を推定して、その結果、グラフや表が存在し、かつ、モノクロ印刷時にグラフや表の識別性が悪化する場合には、修正例をユーザに提案することができる。そして、修正例に基づいて、スキャンデータを修正することが可能となる。 By executing operations I to V shown in FIG. 6 in order, the presence or absence of graphs and tables in the scan data is estimated; if, as a result, a graph or table exists and its identifiability would deteriorate in monochrome printing, a correction example can be proposed to the user. The scan data can then be corrected based on the correction example.
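The sequence of operations I to V might be sketched as follows. All four callables are hypothetical placeholders for the server-side estimation, the monochrome identifiability check, the sample-image generation, and the user's decision on the operation unit; none of their names or signatures come from the text.

```python
def run_operations(scan_data, estimate_chart_area, degrades_in_monochrome,
                   build_correction_example, user_accepts):
    """Return the saved correction example, or None when nothing is saved."""
    # I-III: send the scan data, estimate on the server, receive the result.
    has_chart = estimate_chart_area(scan_data)
    if not has_chart or not degrades_in_monochrome(scan_data):
        return None
    # IV: show a correction example (sample image) to the user.
    example = build_correction_example(scan_data)
    # V: keep the example only if the user chooses to apply it.
    return example if user_accepts(example) else None
```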

＜スキャナ部で読み取られる原稿の構成例（教師データの収集）＞
図７Ａは、スキャナ部で読み取られる原稿の構成例および推定結果（グラフ関係）を示す図である。図７Ｂ、スキャナ部で読み取られる原稿の構成例および推定結果（表関係）を示す図である。図８Ａは、図７Ａに対応させた表である。図８Ｂは、図７Ｂに対応させた表である。 <Example of composition of manuscript read by scanner unit (collection of training data)>
FIG. 7A is a diagram showing a configuration example of a document read by the scanner unit and an estimation result (graph relation). FIG. 7B is a diagram showing a configuration example of a document read by the scanner unit and an estimation result (table relationship); FIG. 8A is a table corresponding to FIG. 7A. FIG. 8B is a table corresponding to FIG. 7B.

図７Ａ（ａ）に示す原稿は、矩形もしくは扇型のグラフ（図７Ａ（ａ）では棒グラフ）を含むグラフ領域が画像として形成された原稿である。図７Ａ（ａ－１）、図８Ａに示すように、図７Ａ（ａ）に示す原稿では、入力データＸとして、データ要素Ｘ０、軸Ｘ１、タイトルＸ３、凡例Ｘ４、目盛り線Ｘ５、本文Ｘ６が有り、図番号Ｘ２が無い。 The document shown in FIG. 7A(a) is a document in which a graph area including a rectangular or fan-shaped graph (a bar graph in FIG. 7A(a)) is formed as an image. As shown in FIGS. 7A(a-1) and 8A, in the document shown in FIG. 7A(a), data element X0, axis X1, title X3, legend X4, scale line X5, and text X6 are present as input data X, and figure number X2 is absent.

図７Ａ（ｂ）に示す原稿は、矩形もしくは扇型のグラフ（図７Ａ（ｂ）では円グラフ）を含むグラフ領域が画像として形成された原稿である。図７Ａ（ｂ－１）、図８Ａに示すように、図７Ａ（ｂ）に示す原稿では、入力データＸとして、データ要素Ｘ０、図番号Ｘ２、タイトルＸ３、凡例Ｘ４が有り、軸Ｘ１、目盛り線Ｘ５、本文Ｘ６が無い。 The document shown in FIG. 7A(b) is a document in which a graph area including a rectangular or fan-shaped graph (a pie chart in FIG. 7A(b)) is formed as an image. As shown in FIGS. 7A(b-1) and 8A, in the document shown in FIG. 7A(b), data element X0, figure number X2, title X3, and legend X4 are present as input data X, and axis X1, scale line X5, and text X6 are absent.

図７Ａ（ｃ）に示す原稿は、矩形もしくは扇型ではないグラフ（図７Ａ（ｃ）では折れ線グラフ）を含むグラフ領域が画像として形成された原稿である。図７Ａ（ｃ－１）、図８Ａに示すように、図７Ａ（ｃ）に示す原稿では、入力データＸとして、データ要素Ｘ０、軸Ｘ１、タイトルＸ３～目盛り線Ｘ５が有り、図番号Ｘ２、本文Ｘ６が無い。 The document shown in FIG. 7A(c) is a document in which a graph area including a graph that is neither rectangular nor fan-shaped (a line graph in FIG. 7A(c)) is formed as an image. As shown in FIGS. 7A(c-1) and 8A, in the document shown in FIG. 7A(c), data element X0, axis X1, and title X3 through scale line X5 are present as input data X, and figure number X2 and text X6 are absent.

図７Ａ（ｄ）に示す原稿は、矩形もしくは扇型でないグラフ（図７Ａ（ｄ）では３Ｄ折れ線グラフ）を含むグラフ領域が画像として形成された原稿である。図７Ａ（ｄ－１）、図８Ａに示すように、図７Ａ（ｄ）に示す原稿では、入力データＸとして、データ要素Ｘ０～目盛り線Ｘ５が有り、本文Ｘ６が無い。 The document shown in FIG. 7A(d) is a document in which a graph area including a non-rectangular or fan-shaped graph (a 3D line graph in FIG. 7A(d)) is formed as an image. As shown in FIGS. 7A(d-1) and 8A, in the document shown in FIG. 7A(d), the input data X includes the data elements X0 to the scale line X5 and does not include the text X6.

図７Ａ（ｅ）に示す原稿は、グラフではない矩形もしくは扇型の図形を含む領域が画像として形成された原稿である。図７Ａ（ｅ－１）、図８Ａに示すように、図７Ａ（ｅ）に示す原稿では、入力データＸとして、データ要素Ｘ０～本文Ｘ６の全てが無い。 The document shown in FIG. 7A(e) is a document in which an area including a rectangular or fan-shaped figure that is not a graph is formed as an image. As shown in FIGS. 7A(e-1) and 8A, in the document shown in FIG. 7A(e), as the input data X, all of the data elements X0 to X6 are absent.

図７Ａ（ｆ）に示す原稿は、矩形もしくは扇型でないグラフ（図７Ａ（ｆ）では円環グラフ）を含むグラフ領域が画像として形成された原稿である。図７Ａ（ｆ－１）、図８Ａに示すように、図７Ａ（ｆ）に示す原稿では、入力データＸとして、凡例Ｘ４が有り、データ要素Ｘ０～タイトルＸ３、目盛り線Ｘ５、本文Ｘ６が無い。 The document shown in FIG. 7A(f) is a document in which a graph area including a non-rectangular or fan-shaped graph (an annular graph in FIG. 7A(f)) is formed as an image. As shown in FIGS. 7A(f-1) and 8A, the manuscript shown in FIG. 7A(f) has legend X4 as input data X, and does not have data elements X0 to title X3, scale lines X5, and text X6. .

図７Ｂ（ｇ）に示す原稿は、表（図７Ｂ（ｇ）では全てのセルに罫線を含む表）を含む表領域が画像として形成された原稿である。図７Ｂ（ｇ－１）、図８Ａに示すように、図７Ｂ（ｇ）に示す原稿では、入力データＸとして、データ要素Ｘ１０、表番号Ｘ１１、セルＸ１２、罫線Ｘ１３、行Ｘ１４、列Ｘ１５が有り、本文Ｘ１６が無い。 The document shown in FIG. 7B(g) is a document in which a table area including a table (in FIG. 7B(g), a table with ruled lines on every cell) is formed as an image. As shown in FIGS. 7B(g-1) and 8A, in the document shown in FIG. 7B(g), data element X10, table number X11, cell X12, ruled line X13, row X14, and column X15 are present as input data X, and text X16 is absent.

図７Ｂ（ｈ）に示す原稿は、表（図７Ｂ（ｈ）ではセルの罫線の一部が無い表）を含む表領域が画像として形成された原稿である。図７Ｂ（ｈ－１）、図８Ａに示すように、図７Ｂ（ｈ）に示す原稿では、入力データＸとして、データ要素Ｘ１０、セルＸ１２、罫線Ｘ１３、行Ｘ１４、列Ｘ１５が有り、表番号Ｘ１１、本文Ｘ１６が無い。 The document shown in FIG. 7B(h) is a document in which a table area including a table (in FIG. 7B(h), a table in which some cell ruled lines are missing) is formed as an image. As shown in FIGS. 7B(h-1) and 8A, in the document shown in FIG. 7B(h), data element X10, cell X12, ruled line X13, row X14, and column X15 are present as input data X, and table number X11 and text X16 are absent.

例えば学習モデルが未学習状態である場合の推定処理について説明する。ここで、「未学習状態」とは、学習が全く行われていない初期状態の他、学習が行われてはいるものの、不十分である状態も含む。この場合、学習モデルは、データ要素Ｘ０として、棒グラフ、円グラフ、折れ線グラフのみ学習した状態となっている。 For example, an estimation process when the learning model is in an unlearned state will be described. Here, the "unlearned state" includes not only an initial state in which learning is not performed at all, but also a state in which learning is performed but is insufficient. In this case, the learning model is in a state where only bar graphs, pie charts, and line graphs have been learned as the data element X0.

図７Ａ（ａ）に示す原稿の場合、データ要素Ｘ０については、学習済みであり、他の入力データＸについても、「グラフ領域有り」と推定できる程度に十分に揃っている。従って、図７Ａ（ａ）に示す原稿に対しては、「グラフ領域有り」と推定することができる。図８Ａに示すように、この推定結果は、教師データＴとなる、すなわち、期待値（Ｔ）として「Ｔ１＝１、Ｔ２＝０」となる。 In the case of the manuscript shown in FIG. 7A(a), the data element X0 has already been learned, and the other input data X are sufficiently complete to be able to estimate that "there is a graph area". Therefore, it can be estimated that "there is a graph area" for the document shown in FIG. 7A(a). As shown in FIG. 8A, this estimation result becomes teacher data T, that is, "T1=1, T2=0" as the expected value (T).

図７Ａ（ｂ）に示す原稿の場合、データ要素Ｘ０については、学習済みであるが、他の入力データＸについては、グラフ領域を推定できる程度には、不十分である。従って、図７Ａ（ｂ）に示す原稿に対しては、「グラフ領域無し」と推定されてしまう。そこで、このような場合には、ユーザが「グラフ領域有り」と訂正して、学習済みモデルを更新する。これにより、図７Ａ（ｂ）に示す原稿に対しては、「グラフ領域有り」と正確に推定することができる。図８Ａに示すように、この推定結果は、「Ｔ１＝１、Ｔ２＝０」の教師データＴとなる。 In the case of the manuscript shown in FIG. 7A(b), the data element X0 has been learned, but the other input data X is insufficient to estimate the graph area. Therefore, the manuscript shown in FIG. 7A(b) is estimated to have "no graph area". Therefore, in such a case, the user corrects "there is a graph area" and updates the learned model. As a result, it can be accurately estimated that "there is a graph area" for the document shown in FIG. 7A(b). As shown in FIG. 8A, this estimation result becomes teacher data T of "T1=1, T2=0".

図７Ａ（ｃ）に示す原稿の場合、図７Ａ（ａ）に示す原稿の場合と同様に、「グラフ領域有り」と推定される。図８Ａに示すように、この推定結果は、「Ｔ１＝１、Ｔ２＝０」の教師データＴとなる。 In the case of the document shown in FIG. 7A(c), it is estimated that "there is a graph area" as in the case of the document shown in FIG. 7A(a). As shown in FIG. 8A, this estimation result becomes teacher data T of "T1=1, T2=0".

図７Ａ（ｄ）に示す原稿の場合、データ要素Ｘ０については、全くの学習されていないが、他の入力データＸについては、「グラフ領域有り」と推定できる程度に十分に揃っている。従って、図７Ａ（ｄ）に示す原稿に対しては、「グラフ領域有り」と推定することができる。図８Ａに示すように、この推定結果は、「Ｔ１＝１、Ｔ２＝０」の教師データＴとなる。このように未学習のグラフについても、グラフ領域が有る原稿については、正しく推定可能である。 In the case of the manuscript shown in FIG. 7A(d), the data element X0 has not been learned at all, but the other input data X are sufficiently prepared so that it can be estimated that "there is a graph area". Therefore, it can be estimated that "there is a graph area" for the document shown in FIG. 7A(d). As shown in FIG. 8A, this estimation result becomes teacher data T of "T1=1, T2=0". As described above, even unlearned graphs can be correctly estimated for a manuscript having a graph area.

図７Ａ（ｅ）に示す原稿の場合、当該原稿内の図形がデータ要素Ｘ０に近似する形状を有するものの、入力データＸとして、軸Ｘ１～本文Ｘ６の全てが無い。従って、図７Ａ（ｅ）に示す原稿に対しては、「グラフ領域無し」と推定される。図８Ａに示すように、この推定結果は、教師データＴとなる、すなわち、期待値（Ｔ）として「Ｔ１＝０、Ｔ２＝１」となる。また、前記特許文献１に記載の画像形成装置では、図７Ａ（ｅ）に示す原稿に対しては、「グラフ領域有り」と誤判定してしまう。 In the case of the document shown in FIG. 7A(e), the figure in the document has a shape similar to the data element X0, but none of the input data X from axis X1 through text X6 is present. Therefore, the document shown in FIG. 7A(e) is estimated to have "no graph area". As shown in FIG. 8A, this estimation result becomes teacher data T, that is, the expected value (T) is "T1=0, T2=1". By contrast, the image forming apparatus described in Patent Document 1 would erroneously determine that the document shown in FIG. 7A(e) has "a graph area".

図７Ａ（ｆ）に示す原稿の場合、データ要素Ｘ０については、全くの学習されておらず、他の入力データＸについては、凡例Ｘ４が有るのみである。従って、図７Ａ（ｆ）に示す原稿に対しては、「グラフ領域無し」と推定されてしまう。このような場合には、図７Ａ（ｂ）に示す原稿の場合と同様に、ユーザが「グラフ領域有り」と訂正して、学習済みモデルを更新する。これにより、図７Ａ（ｆ）に示す原稿に対しては、「グラフ領域有り」と正確に推定することができる。図８Ａに示すように、この推定結果は、「Ｔ１＝１、Ｔ２＝０」の教師データＴとなる。 In the case of the document shown in FIG. 7A(f), the data element X0 has not been learned at all, and the other input data X has only the legend X4. Therefore, the manuscript shown in FIG. 7A(f) is estimated to have "no graph area". In such a case, as in the case of the manuscript shown in FIG. 7A(b), the user corrects "there is a graph area" and updates the learned model. As a result, it can be accurately estimated that "there is a graph area" for the document shown in FIG. 7A(f). As shown in FIG. 8A, this estimation result becomes teacher data T of "T1=1, T2=0".

図７Ｂ（ｇ）に示す原稿の場合、データ要素Ｘ１０については、学習済みであり、他の入力データＸについても、「表領域有り」と推定する程度に十分に揃っている。図８Ａに示すように、この推定結果は、教師データＴとなる、すなわち、期待値（Ｔ）として「Ｔ１＝１、Ｔ２＝０」となる。 In the case of the manuscript shown in FIG. 7B(g), the data element X10 has already been learned, and the other input data X are sufficiently prepared to be estimated as "with table area". As shown in FIG. 8A, this estimation result becomes teacher data T, that is, "T1=1, T2=0" as the expected value (T).

図７Ｂ（ｈ）に示す原稿の場合、データ要素Ｘ１０については、学習済みであるが、他の入力データＸのうち、例えば表番号Ｘ１１や罫線Ｘ１３が不十分であるが、「表領域有り」と推定できる程度には十分に揃っている。従って、図７Ｂ（ｈ）に示す原稿に対しては、「表領域有り」と推定することができる。このように、既に学習した表と多少要素が異なる表についても、表領域を正しく推定可能である。 In the case of the document shown in FIG. 7B(h), the data element X10 has already been learned. Among the other input data X, the table number X11 and the ruled lines X13, for example, are incomplete, but enough input data are present to estimate that "there is a table area". Therefore, the document shown in FIG. 7B(h) can be estimated to have a table area. In this way, the table area can be correctly estimated even for a table whose elements differ somewhat from those of tables already learned.

以上のように、未学習のデータ要素Ｘ０についても、その他の入力データＸが十分に存在していれば、グラフ領域や表領域を正確に推定することができる。また、既に学習しているデータ要素Ｘ０でも、その他の入力データＸが不十分の場合には、グラフ領域や表領域を誤推定してしまうおそれがある。しかしながら、画像形成装置１０１では、誤推定の結果を訂正することにより、学習モデルを更新して、グラフ領域や表領域の有無推定の精度を向上させることができる。このように、画像形成装置１０１によれば、原稿内にグラフや表が存在している場合、当該グラフや表の種類に関わらず、グラフや表の存在を高精度に推定することができる。 As described above, even for an unlearned data element X0, if enough other input data X exist, the graph area or table area can be accurately estimated. Moreover, even with the data element X0 that has already been learned, if the other input data X is insufficient, there is a risk that the graph area or the table area will be erroneously estimated. However, in the image forming apparatus 101, by correcting the erroneous estimation result, the learning model can be updated to improve the accuracy of estimating the presence/absence of graph regions and table regions. As described above, according to the image forming apparatus 101, when a document contains a graph or a table, the existence of the graph or table can be estimated with high accuracy regardless of the type of the graph or table.
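The graph-related examples of FIG. 7A can be written down as training pairs. The presence flags and expected values below are transcribed from the descriptions of documents (a) to (f) above; the dictionary layout and the helper names `to_training_pair` and `label_after_review` are illustrative assumptions, including the way a user correction overrides the model's estimate.

```python
FEATURES = ["X0", "X1", "X2", "X3", "X4", "X5", "X6"]

# document id -> (input items present, graph area actually present?)
FIG_7A_DOCS = {
    "a": ({"X0", "X1", "X3", "X4", "X5", "X6"}, True),   # bar graph
    "b": ({"X0", "X2", "X3", "X4"}, True),               # pie chart
    "c": ({"X0", "X1", "X3", "X4", "X5"}, True),         # line graph
    "d": ({"X0", "X1", "X2", "X3", "X4", "X5"}, True),   # 3D line graph
    "e": (set(), False),                                 # shapes, not a graph
    "f": ({"X4"}, True),                                 # ring (donut) graph
}


def to_training_pair(doc_id):
    """Return (input vector X, expected value (T1, T2)) for one document."""
    present, has_graph = FIG_7A_DOCS[doc_id]
    x = [1 if name in present else 0 for name in FEATURES]
    t = (1, 0) if has_graph else (0, 1)
    return x, t


def label_after_review(estimated_has_graph, user_correction=None):
    """Pick the (T1, T2) label to store: a user correction wins if given."""
    final = estimated_has_graph if user_correction is None else user_correction
    return (1, 0) if final else (0, 1)
```

For documents (b) and (f), where the model first estimates "no graph area", `label_after_review(False, user_correction=True)` yields the corrected label `(1, 0)` that is then used to update the learned model.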

＜スキャン系のジョブの流れ＞
図９は、スキャン系のジョブの流れを示すフローチャートである。なお、「スキャン系のジョブ」とは、コピーやスキャンＢＯＸ等のスキャナを用いたジョブ全般のことである。 <Flow of scan-related jobs>
FIG. 9 is a flow chart showing the flow of a scan-related job. Note that the “scan-related jobs” refer to all jobs using a scanner, such as copying and scanning BOX.

ステップＳ１１０１では、ＪＯＢ制御部４０３は、ユーザからスキャン系のジョブの開始を操作部１４０が受け付けたか否かを判断する。ステップＳ１１０１での判断の結果、ＪＯＢ制御部４０３が、ジョブの開始を操作部１４０が受け付けたと判断した場合には、処理はステップＳ１１０２に進む。一方、ステップＳ１１０１での判断の結果、ＪＯＢ制御部４０３は、ジョブの開始を操作部１４０が受け付けていないと判断した場合には、ジョブの開始を操作部１４０が受け付けるまで待機する。 In step S1101, the job control unit 403 determines whether or not the operation unit 140 has received a start of a scan-related job from the user. As a result of the determination in step S1101, if the JOB control unit 403 determines that the operation unit 140 has accepted the start of the job, the process advances to step S1102. On the other hand, if the JOB control unit 403 determines in step S1101 that the operation unit 140 has not accepted the start of the job, it waits until the operation unit 140 accepts the start of the job.

ステップＳ１１０２では、ＪＯＢ制御部４０３は、スキャナ部１０（画像読取部４０４）を作動させて、当該スキャナ部１０に原稿（以下この原稿を「１次原稿」と言う）を読み取らせる。１次原稿は、ＡＤＦ（ＡｕｔｏｍａｔｉｃＤｏｃｕｍｅｎｔＦｅｅｄｅｒ）または原稿台ガラス（いずれも図示せず）に載置された状態で、スキャナ部１０に読み取られる。 In step S1102, the job control unit 403 activates the scanner unit 10 (image reading unit 404) to read a document (hereinafter referred to as "primary document"). The primary document is read by the scanner section 10 while being placed on an ADF (Automatic Document Feeder) or a platen glass (neither of which is shown).

ステップＳ１１０３では、ＪＯＢ制御部４０３は、スキャナ部１０が全ての原稿の読み取りが終了したか否かを判断する。ステップＳ１１０３での判断の結果、ＪＯＢ制御部４０３が、原稿の読み取りが終了したと判断した場合には、処理はステップＳ１１０４に進む。一方、ステップＳ１１０３での判断の結果、ＪＯＢ制御部４０３が、原稿の読み取りが終了していないと判断した場合には、全ての原稿の読み取りが終了するまで、その処理を実行する。 In step S1103, the job control unit 403 determines whether or not the scanner unit 10 has finished reading all the originals. As a result of the determination in step S1103, if the JOB control unit 403 determines that reading of the document is completed, the process advances to step S1104. On the other hand, if the JOB control unit 403 determines in step S1103 that reading of the document has not been completed, the process is executed until reading of all the documents is completed.

ステップＳ１１０４では、ＪＯＢ制御部４０３は、ステップＳ１１０３で読み取った１次原稿のデータをデータ記憶部４０２に保存する。 In step S1104, the job control unit 403 saves the primary document data read in step S1103 in the data storage unit 402.

ステップＳ１１０５では、ＪＯＢ制御部４０３は、機械学習サーバ１０２の機械学習部４１４を作動させて、当該機械学習部４１４に学習処理を行わせる。ステップＳ１１０５は、サブルーチンであり、これについては、図１０に示す学習フェーズのフローチャートを参照して後述する。 In step S1105, the JOB control unit 403 activates the machine learning unit 414 of the machine learning server 102 to perform learning processing. Step S1105 is a subroutine, which will be described later with reference to the learning phase flow chart shown in FIG.

ステップＳ１１０６では、ＪＯＢ制御部４０３は、推定処理部４０５を作動させて、当該推定処理部４０５に、グラフ領域や表領域を推定する推定処理を行わせる。ステップＳ１１０６は、サブルーチンであり、これについては、図１０に示す推定フェーズのフローチャートを参照して後述する。 In step S1106, the JOB control unit 403 activates the estimation processing unit 405 to perform estimation processing for estimating the graph area and the table area. Step S1106 is a subroutine, which will be described later with reference to the estimation phase flowchart shown in FIG.

ステップＳ１１０７では、ＪＯＢ制御部４０３は、読み取り後の処理を実行して、処理が終了する。読み取り後の処理としては、例えば、推定処理の結果を反映させた原稿（以下この原稿を「２次原稿」と言う）の印刷や、２次原稿のデータの保存等が挙げられる。なお、２次原稿の印刷は、プリンタ部２０で行われる。また、２次原稿のデータの保存は、データ記憶部４０２で行われる。 In step S1107, the job control unit 403 executes post-reading processing, and the processing ends. Post-reading processing includes, for example, printing a document that reflects the result of the estimation processing (hereinafter referred to as the "secondary document") and saving the data of the secondary document. The secondary document is printed by the printer unit 20, and its data is saved in the data storage unit 402.
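Steps S1102 to S1107 can be summarized as one short function. This is only a sketch: waiting for the job to start (S1101) is omitted, and the four callables are hypothetical stand-ins for the scanner unit, the machine learning unit, the estimation processing unit, and the post-reading processing.

```python
def run_scan_job(read_pages, learn, estimate, post_process):
    """Run one scan-related job and return the post-processing result."""
    pages = list(read_pages())         # S1102-S1103: read until all pages are done
    stored = {"primary": pages}        # S1104: save the primary document data
    learn(stored["primary"])           # S1105: learning subroutine
    result = estimate(stored["primary"])           # S1106: estimation subroutine
    return post_process(stored["primary"], result)  # S1107: print or save
```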

＜機械学習部および推定処理部の処理の流れ＞
図１０は、機械学習部および推定処理部の処理の流れを示すフローチャートである。図１０中、ステップＳ９０１～ステップＳ９０６が学習フェーズで行われる機械学習部の処理であり、ステップＳ９０７～ステップＳ９１２が推定フェーズで行われる推定処理部の処理である。機械学習部４１４は、学習データが更新されたか否かを一定期間毎に確認する。そして、機械学習部４１４は、学習データが更新されていると判断したタイミングで、学習フェーズを開始する。 <Processing Flow of Machine Learning Unit and Estimation Processing Unit>
FIG. 10 is a flow chart showing the processing flow of the machine learning unit and the estimation processing unit. In FIG. 10, steps S901 to S906 are the processing of the machine learning unit performed in the learning phase, and steps S907 to S912 are the processing of the estimation processing unit performed in the estimation phase. The machine learning unit 414 confirms whether or not the learning data has been updated at regular intervals. Then, the machine learning unit 414 starts the learning phase when it determines that the learning data has been updated.

ステップＳ９０１では、機械学習部４１４は、学習データ生成部４１３を介して、学習データを受信する。学習データとは、前述した原稿データＡである。 In step S901, the machine learning unit 414 receives learning data via the learning data generation unit 413. The learning data is the document data A described above.

ステップＳ９０２では、機械学習部４１４は、ステップＳ９０１で受信した原稿データＡの入力データＸが学習モデルＷに入力されて、当該学習モデルＷで機械学習が行われる。 In step S902, the machine learning unit 414 inputs the input data X of the document data A received in step S901 to the learning model W, and the learning model W performs machine learning.

ステップＳ９０３では、機械学習部４１４は、機械学習が終了したか否かを判断する。ステップＳ９０３での判断の結果、機械学習部４１４が、機械学習が終了したと判断した場合には、処理はステップＳ９０４に進む。一方、ステップＳ９０３での判断の結果、機械学習部４１４が、機械学習が終了していないと判断した場合には、ステップＳ９０２の実行を継続する。 In step S903, the machine learning unit 414 determines whether or not machine learning has ended. As a result of the determination in step S903, when the machine learning unit 414 determines that the machine learning has ended, the process proceeds to step S904. On the other hand, when the machine learning unit 414 determines that the machine learning has not ended as a result of the determination in step S903, the execution of step S902 is continued.

ステップＳ９０４では、機械学習部４１４は、学習済みモデルの更新を推定処理部４０５に通知する。 In step S904, the machine learning unit 414 notifies the estimation processing unit 405 of updating the learned model.

ステップＳ９０５では、機械学習部４１４は、推定処理部４０５からの学習済みモデルの送信要求の有無を判断する。ステップＳ９０５での判断の結果、機械学習部４１４が、学習済みモデルの送信要求が有ったと判断した場合には、処理はステップＳ９０６に進む。一方、ステップＳ９０５での判断の結果、機械学習部４１４が、学習済みモデルの送信要求が無いと判断した場合には、処理はステップＳ９０５に戻り、推定処理部４０５からの送信要求を待つ。 In step S905, the machine learning unit 414 determines whether the estimation processing unit 405 has requested transmission of the learned model. If the machine learning unit 414 determines in step S905 that there is a transmission request for the learned model, the process proceeds to step S906. If it determines that there is no transmission request, the process returns to step S905 and waits for a transmission request from the estimation processing unit 405.

ステップＳ９０６では、機械学習部４１４は、学習済みモデルを推定処理部４０５に送信する。 In step S906, the machine learning unit 414 transmits the learned model to the estimation processing unit 405.
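The learning phase (steps S901 to S906) might be organized as below. The class name, the version counter used to detect updated learning data, and the injected `train` callable are all assumptions made for this sketch; the text only states that the unit checks for updates at regular intervals, trains, notifies the estimation side, and sends the model on request.

```python
class MachineLearningUnit:
    """Illustrative stand-in for the learning phase, not the actual unit 414."""

    def __init__(self, train):
        self._train = train          # callable: learning data -> model
        self._seen_version = 0       # last learning-data version trained on
        self.model = None
        self.update_notified = False

    def poll(self, data_version, learning_data):
        """Periodic check; trains only when the learning data has changed."""
        if data_version == self._seen_version:
            return False
        self.model = self._train(learning_data)  # S901-S903: receive and learn
        self._seen_version = data_version
        self.update_notified = True              # S904: notify the estimator
        return True

    def handle_model_request(self):
        """S905-S906: answer a transmission request with the learned model."""
        if self.model is None:
            raise RuntimeError("no learned model is available yet")
        return self.model
```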

ステップＳ９０７では、推定処理部４０５は、ＪＯＢ制御部４０３がユーザからのスキャン系のジョブを受け付けたか否かを判断する。このスキャン系のジョブは、１次原稿の読み取りである。ここでの１次原稿は、一例として、図７Ａ（ａ）に示す原稿ものとする。ステップＳ９０７での判断の結果、推定処理部４０５が、ＪＯＢ制御部４０３がスキャン系のジョブを受け付けたと判断した場合には、処理はステップＳ９０８に進む。一方、ステップＳ９０７での判断の結果、推定処理部４０５は、ＪＯＢ制御部４０３がスキャン系のジョブを受け付けていないと判断した場合には、ＪＯＢ制御部４０３がスキャン系のジョブを受け付けるまで待機する。 In step S907, the estimation processing unit 405 determines whether the job control unit 403 has received a scan-related job from the user. This scan-related job is to read the primary document. As an example, the primary document here is the document shown in FIG. 7A(a). As a result of the determination in step S907, if the estimation processing unit 405 determines that the job control unit 403 has received a scan-related job, the process advances to step S908. On the other hand, as a result of the determination in step S907, if the estimation processing unit 405 determines that the JOB control unit 403 has not accepted a scan-related job, it waits until the JOB control unit 403 accepts a scan-related job. .

ステップＳ９０８では、推定処理部４０５は、ＪＯＢ制御部４０３を介して、機械学習部４１４へ学習済みモデルの送信要求を行い、学習済みモデルを受信する。 In step S908, the estimation processing unit 405 requests the machine learning unit 414 to transmit the learned model via the JOB control unit 403, and receives the learned model.

ステップＳ９０９では、ＪＯＢ制御部４０３は、ステップＳ９０７でスキャン系のジョブが有った判断された１次原稿がカラースキャンされたか否かを判断する。ステップＳ９０９での判断の結果、ＪＯＢ制御部４０３が、１次原稿がカラースキャンされたと判断した場合には、処理はステップＳ９１０に進む。一方、ステップＳ９０９での判断の結果、ＪＯＢ制御部４０３が、１次原稿がカラースキャンされていないと判断した場合には、処理は終了する。 In step S909, the job control unit 403 determines whether or not the primary document for which it was determined in step S907 that there was a scan-related job was color-scanned. As a result of the determination in step S909, if the job control unit 403 determines that the primary document has been color-scanned, the process proceeds to step S910. On the other hand, if the job control unit 403 determines in step S909 that the primary document has not been color-scanned, the process ends.

In step S910, the estimation processing unit 405 uses the learned model to estimate the graph areas and table areas, that is, the chart areas, in the color-scanned primary document. If it is estimated in step S910 that the primary document contains no graph area or table area, the process ends.
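The branching in steps S907 to S910 can be summarized as a small control-flow sketch. The Python below is illustrative only: `ScanJob`, `Box`, and `estimate_chart_areas` are hypothetical names that do not appear in the source, and the learned model received in step S908 is represented by an arbitrary callable that returns bounding boxes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical region format: (x, y, width, height) in pixels.
Box = Tuple[int, int, int, int]


@dataclass
class ScanJob:
    """Hypothetical container for one scan job (not from the source)."""
    image: object      # scanned page data, opaque in this sketch
    is_color: bool     # True when the page was scanned in color


def estimate_chart_areas(
    job: Optional[ScanJob],
    model: Callable[[object], List[Box]],
) -> Optional[List[Box]]:
    """Mirror the S907 -> S909 -> S910 branching.

    Returns None while no job has been accepted (S907 keeps waiting),
    an empty list when processing ends early, and the estimated chart
    areas otherwise.
    """
    if job is None:            # S907: no scan job accepted yet
        return None
    if not job.is_color:       # S909: not a color scan, so processing ends
        return []
    return model(job.image)    # S910: estimate graph and table areas


# Usage with a stand-in model that always reports one region.
fake_model = lambda image: [(10, 20, 200, 100)]
print(estimate_chart_areas(ScanJob(image="page", is_color=True), fake_model))
```

An empty list from the model corresponds to the case where no graph area or table area is estimated, in which case the later steps would simply be skipped.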

In step S911, the estimation processing unit 405 extracts each bar (data element X0) in the graph area and each data element X10 in the table area by known image processing. The estimation processing unit 405 then determines whether the bars in the graph area share the same fill pattern, that is, whether each is filled with a single solid color. It makes the same determination for the data elements X10 in the table area. If the estimation processing unit 405 determines that the fill patterns are the same, the process proceeds to step S912. If it determines that the fill patterns are not the same, the process ends. When the fill patterns are not the same, the bars in a secondary document obtained by printing the primary document in monochrome remain about as distinguishable (visible) from one another as they are in the color-printed primary document.
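One way to read the S911 check is that every extracted data element must be a solid, single-color fill before the process moves on to step S912. The sketch below works under that assumption. The source leaves the actual method to "known image processing", so the function names, the pixel format, and the tolerance value are all hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def is_solid_fill(pixels: Sequence[Pixel], tolerance: int = 8) -> bool:
    """Return True when every color channel varies by at most `tolerance`."""
    if not pixels:
        return False
    for channel in range(3):
        values = [p[channel] for p in pixels]
        if max(values) - min(values) > tolerance:
            return False
    return True


def all_elements_solid(elements: Iterable[Sequence[Pixel]]) -> bool:
    """S911-style check: True only if every data element is a solid fill."""
    elements = list(elements)
    return bool(elements) and all(is_solid_fill(e) for e in elements)


# Sample pixel patches taken from inside three hypothetical bars.
red_bar = [(250, 10, 10)] * 16
blue_bar = [(10, 10, 250)] * 16
hatched_bar = [(0, 0, 0), (255, 255, 255)] * 8

print(all_elements_solid([red_bar, blue_bar]))      # True
print(all_elements_solid([red_bar, hatched_bar]))   # False
```

A per-channel range test is only one simple choice; a real scan would likely need noise handling that this sketch does not attempt.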

The estimation processing unit 405 can change the visibility of each bar and table. A change in the visibility of a graph or table includes at least one of a change in color, a change in shading, and a change in hatching, and a combination of these is preferable. The estimation processing unit 405 can also vary the visibility for each bar or each table cell. For example, as shown in FIG. 7A(a-2), the left-hand bars for each month are filled with dots, the right-hand bars are filled with dots of higher density than the left-hand bars, and the center bars are filled with dots of intermediate density. As a result, when the fill patterns are the same, the bars in a secondary document obtained by printing the primary document in monochrome remain about as distinguishable from one another as they are in the color-printed primary document. As shown in FIG. 7A(b-2), in the pie chart, the sectors that make up the chart are filled with dots of mutually different densities. As shown in FIG. 7A(c-2), in the line graph, each line is drawn with a different line type. The line type is not particularly limited; for example, a solid line, a dashed line, a one-dot chain line, or a two-dot chain line can be used. As shown in FIG. 7A(d-2), in the 3D line graph, each 3D line is filled with dots of a different density. As shown in FIG. 7A(f-2), in the donut chart, the arc-shaped segments that make up the chart are filled with dots of mutually different densities.
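The assignment of distinct dot densities and line types described above might be sketched as follows. This is a hedged illustration, not the method of the source: the function names, the percentage scale, and the default density range of 20 to 80 are assumptions chosen only to make the example concrete.

```python
from itertools import cycle
from typing import Dict, List, Sequence

# Line types named in the text for line graphs.
LINE_STYLES: List[str] = ["solid", "dashed", "one-dot chain", "two-dot chain"]


def assign_dot_densities(
    series: Sequence[str], low: int = 20, high: int = 80
) -> Dict[str, int]:
    """Spread dot densities (percent coverage) evenly across the series."""
    n = len(series)
    if n == 1:
        return {series[0]: low}
    # Integer arithmetic keeps the values exact (no floating-point drift).
    return {
        name: low + (high - low) * i // (n - 1)
        for i, name in enumerate(series)
    }


def assign_line_styles(series: Sequence[str]) -> Dict[str, str]:
    """Give each line its own type, reusing types beyond the fourth line."""
    return dict(zip(series, cycle(LINE_STYLES)))


print(assign_dot_densities(["left", "center", "right"]))
print(assign_line_styles(["series A", "series B"]))
```

With three series, the left, center, and right bars receive low, intermediate, and high densities, which matches the ordering described for FIG. 7A(a-2).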

Then, in step S912, the estimation processing unit 405 can cause the UI display unit 401 (operation unit 140), via the JOB control unit 403, to display a sample image of the chart area with the distinguishability (visibility) of the graph or table changed, as a correction candidate (see FIG. 11). The UI display unit 401 can also display, side by side with this correction candidate (sample image), the graph area or table area as it was before the visibility was changed (the scan data of the primary document). FIG. 11 shows a display example on the UI display unit. As shown in FIG. 11, the UI display unit 401 displays the graph area of the primary document (scan data) on the left and the graph area after the change in distinguishability (the correction candidate) on the right. This allows the user to compare the two graph areas and judge whether the bars in the correction candidate are sufficiently distinguishable.

As shown in FIG. 11, the UI display unit 401 may also display a "correct to correction candidate" button, a "do not correct" button, and a "correct manually" button. When the user operates the "correct to correction candidate" button, the UI display unit 401 executes a decision to replace, in the data of the primary document read by the scanner unit 10, the graph area of the primary document with the correction candidate. When this decision is executed, the printer unit 20 prints, in monochrome, the secondary document on which an image including the correction candidate is formed. As a result, the graph in the monochrome secondary document remains about as distinguishable as the graph in the color-printed primary document. In addition, when the decision to replace with the correction candidate is executed, the data storage unit 402 can store the data of the secondary document on which the image including the correction candidate is formed. This allows the user to print the secondary document at any time. When the user operates the "do not correct" button, the printing and storing described above are not performed. When the user operates the "correct manually" button, the user can make the desired visibility changes to each graph.
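The three-button behavior can be expressed as a small dispatch function. The sketch below is illustrative: `Choice`, `Outcome`, and `handle_choice` are hypothetical names, chart areas are represented by plain strings, and the handling of printing after a manual correction is an assumption because the source does not describe it.

```python
from dataclasses import dataclass
from enum import Enum


class Choice(Enum):
    APPLY_CANDIDATE = "correct to correction candidate"
    KEEP_ORIGINAL = "do not correct"
    MANUAL = "correct manually"


@dataclass(frozen=True)
class Outcome:
    chart_area: str          # chart area that ends up in the document data
    print_monochrome: bool   # whether the secondary document is printed
    store: bool              # whether the secondary document data may be stored
    manual_edit: bool        # whether the user edits visibility by hand


def handle_choice(choice: Choice, original: str, candidate: str) -> Outcome:
    """Map the pressed button to the resulting actions."""
    if choice is Choice.APPLY_CANDIDATE:
        # Replace the chart area, then print in monochrome and allow storing.
        return Outcome(candidate, print_monochrome=True, store=True, manual_edit=False)
    if choice is Choice.MANUAL:
        # Assumption: printing and storing wait until the manual edit is done.
        return Outcome(original, print_monochrome=False, store=False, manual_edit=True)
    # "do not correct": no replacement, no printing, no storing.
    return Outcome(original, print_monochrome=False, store=False, manual_edit=False)


print(handle_choice(Choice.APPLY_CANDIDATE, "scan", "candidate"))
```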

When the "correct to correction candidate" button is operated, information on the estimation result of step S910 may be fed back to the learning model as teacher data. Operating this button means that the estimation result of step S910 was correct, so the feedback allows the learning model to improve its estimation accuracy. The timing of the feedback is not particularly limited; for example, it can be when the "correct to correction candidate" button is operated. The information on the estimation result that is fed back is also not particularly limited; for example, it can be the input data X. The processing on the UI display unit 401 described above for graphs can be applied to tables in the same way.
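A minimal sketch of this feedback path, assuming the teacher data is kept as pairs of input data X and the confirmed estimate, might look like the following. `FeedbackBuffer` and its method are hypothetical; how the collected examples are actually sent to the machine learning unit and used for retraining is not specified in the source and is left out here.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class FeedbackBuffer:
    """Hypothetical queue of (input data X, confirmed estimate) pairs."""
    examples: List[Tuple[Any, bool]] = field(default_factory=list)

    def record_if_accepted(
        self, input_x: Any, estimated_has_chart: bool, accepted: bool
    ) -> bool:
        """Keep the S910 estimate as teacher data only when the user
        pressed the "correct to correction candidate" button."""
        if not accepted:
            return False
        self.examples.append((input_x, estimated_has_chart))
        return True


buffer = FeedbackBuffer()
buffer.record_if_accepted(input_x=[0.7, 0.2], estimated_has_chart=True, accepted=True)
buffer.record_if_accepted(input_x=[0.1, 0.1], estimated_has_chart=True, accepted=False)
print(len(buffer.examples))  # 1
```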

In the present embodiment, the primary document is a color-printed document and the secondary document is a monochrome-printed document, but the present invention is not limited to this. For example, both the primary document and the secondary document may be color-printed documents. In that case, for a person with a color vision deficiency who has difficulty distinguishing, for example, reddish and greenish colors, changing the colors of the graphs and tables in the secondary document can resolve the difficulty in telling the colors apart. Alternatively, both the primary document and the secondary document may be monochrome-printed documents. In that case, for graphs and tables that were hard to distinguish in the primary document because it was printed in monochrome, changing the hatching or the like of each graph and table in the secondary document can improve their distinguishability.
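For the color-to-color variation, one possible approach is to reassign each data series a color from a palette designed to stay distinguishable under common color vision deficiencies. The source does not name any palette; the Okabe-Ito palette used below is swapped in purely as an assumption, and `recolor_series` is a hypothetical helper.

```python
from typing import Dict, List, Sequence

# Okabe-Ito palette (hex RGB). Its use here is an assumption; the source
# does not specify which replacement colors are chosen.
OKABE_ITO: List[str] = [
    "#E69F00",  # orange
    "#56B4E9",  # sky blue
    "#009E73",  # bluish green
    "#F0E442",  # yellow
    "#0072B2",  # blue
    "#D55E00",  # vermillion
    "#CC79A7",  # reddish purple
    "#000000",  # black
]


def recolor_series(series: Sequence[str]) -> Dict[str, str]:
    """Assign each series a distinct palette color, in order."""
    if len(series) > len(OKABE_ITO):
        raise ValueError("more series than palette colors")
    return dict(zip(series, OKABE_ITO))


print(recolor_series(["red bars", "green bars"]))
```

When there are more series than palette colors, combining color with hatching, as the text prefers, would be one way to extend the scheme.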

The disclosure of this embodiment includes the following configurations, method, and program.
(Configuration 1) An image forming apparatus for forming an image, comprising:
reading means for reading a document; and
estimation processing means for estimating a chart area in the document read by the reading means, using a learned model that has undergone machine learning for estimating a chart area that is at least one of a graph area containing a graph and a table area containing a table in a document.
(Configuration 2) The image forming apparatus according to Configuration 1, wherein, when the graph area is estimated as the chart area, the estimation processing means can extract the graph in the graph area and change the visibility of the graph, and, when the table area is estimated as the chart area, the estimation processing means can extract the table in the table area and change the visibility of the table.
(Configuration 3) The image forming apparatus according to Configuration 2, wherein the graph area contains a plurality of the graphs and the table area contains a plurality of the tables, and
the estimation processing means makes the visibility differ from graph to graph and from table to table.
(Configuration 4) The image forming apparatus according to Configuration 2 or 3, wherein the change in the visibility of the graph and the table includes at least one of a change in color, a change in shading, and a change in hatching.
(Configuration 5) The image forming apparatus according to any one of Configurations 2 to 4, further comprising display means capable of displaying a sample image of the chart area with the visibility changed.
(Configuration 6) The image forming apparatus according to Configuration 5, wherein the display means can display, side by side with the sample image, the chart area in the state before the visibility is changed.
(Configuration 7) The image forming apparatus according to Configuration 6, further comprising operation means for accepting an operation from a user of the image forming apparatus,
wherein the operation means can execute a decision to replace, in the data of the document read by the reading means, the chart area of the document with the sample image.
(Configuration 8) The image forming apparatus according to Configuration 7, further comprising printing means capable of printing a document on which an image including the sample image is formed when the decision is executed by the operation means.
(Configuration 9) The image forming apparatus according to Configuration 8, wherein the document read by the reading means is a document in which at least one of the graph and the table is printed in color, and
the document on which the image including the sample image is formed is a document printed in monochrome.
(Configuration 10) The image forming apparatus according to any one of Configurations 7 to 9, further comprising storage means capable of storing data of a document on which an image including the sample image is formed when the decision is executed by the operation means.
(Configuration 11) The image forming apparatus according to any one of Configurations 1 to 10, wherein the learned model is a neural network that takes, as input data, information on the graph and the table in the document for which the chart area is to be estimated, and produces, as output data, the presence or absence of the chart area.
(Configuration 12) The image forming apparatus according to any one of Configurations 1 to 11, further comprising operation means for accepting an operation from a user of the image forming apparatus,
wherein the reading means operates in response to an operation by the user via the operation means.
(Configuration 13) A method for controlling an image forming apparatus that forms an image, the method comprising:
a reading step of reading a document; and
an estimation processing step of estimating a chart area in the document read by the reading means, using a learned model that has undergone machine learning for estimating a chart area that is at least one of a graph area containing a graph and a table area containing a table in a document.
(Configuration 14) A program for causing a computer to execute each means of the image forming apparatus according to any one of Configurations 1 to 12.
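Configuration 11 describes the learned model as a neural network that maps information about the document to the presence or absence of a chart area. A toy forward pass under that description is sketched below. The network shape, the two-element feature vector, and the hand-set weights are all hypothetical; they only illustrate the input-to-output relationship and do not come from the source.

```python
import math
from typing import List, Sequence


def chart_area_present(
    x: Sequence[float],
    w1: List[List[float]],
    b1: List[float],
    w2: List[float],
    b2: float,
    threshold: float = 0.5,
) -> bool:
    """One hidden layer (ReLU) followed by a sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    probability = 1.0 / (1.0 + math.exp(-z))
    return probability >= threshold


# Hand-set illustrative weights (not trained values).
W1 = [[1.0, 0.0], [0.0, 1.0]]
B1 = [0.0, 0.0]
W2 = [2.0, 2.0]
B2 = -1.0

print(chart_area_present([1.0, 1.0], W1, B1, W2, B2))  # True
print(chart_area_present([0.0, 0.0], W1, B1, W2, B2))  # False
```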

Although preferred embodiments of the present invention have been described above, the present invention is not limited to these embodiments, and various modifications and changes are possible within the scope of its gist. The present invention can also be realized by processing in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of that system or apparatus read and execute the program. The present invention can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

10 Scanner unit (reading means)
20 Printer unit (printing means)
101 Image forming apparatus (image processing apparatus)
102 Machine learning server
140 Operation unit (operation means)
401 UI display unit
403 JOB control unit
404 Image reading unit
405 Estimation processing unit
414 Machine learning unit

Claims

An image forming apparatus for forming an image, comprising:
reading means for reading a document; and
estimation processing means for estimating a chart area in the document read by the reading means, using a learned model that has undergone machine learning for estimating a chart area that is at least one of a graph area containing a graph and a table area containing a table in a document.

The image forming apparatus according to claim 1, wherein, when the graph area is estimated as the chart area, the estimation processing means can extract the graph in the graph area and change the visibility of the graph, and, when the table area is estimated as the chart area, the estimation processing means can extract the table in the table area and change the visibility of the table.

The image forming apparatus according to claim 2, wherein the graph area contains a plurality of the graphs and the table area contains a plurality of the tables, and
the estimation processing means makes the visibility differ from graph to graph and from table to table.

The image forming apparatus according to claim 2, wherein the change in the visibility of the graph and the table includes at least one of a change in color, a change in shading, and a change in hatching.

The image forming apparatus according to claim 2, further comprising display means capable of displaying a sample image of the chart area with the visibility changed.

The image forming apparatus according to claim 5, wherein the display means can display, side by side with the sample image, the chart area in the state before the visibility is changed.

The image forming apparatus according to claim 6, further comprising operation means for accepting an operation from a user of the image forming apparatus,
wherein the operation means can execute a decision to replace, in the data of the document read by the reading means, the chart area of the document with the sample image.

The image forming apparatus according to claim 7, further comprising printing means capable of printing a document on which an image including the sample image is formed when the decision is executed by the operation means.

The image forming apparatus according to claim 8, wherein the document read by the reading means is a document in which at least one of the graph and the table is printed in color, and
the document on which the image including the sample image is formed is a document printed in monochrome.

The image forming apparatus according to claim 7, further comprising storage means capable of storing data of a document on which an image including the sample image is formed when the decision is executed by the operation means.

The image forming apparatus according to claim 1, wherein the learned model is a neural network that takes, as input data, information on the graph and the table in the document for which the chart area is to be estimated, and produces, as output data, the presence or absence of the chart area.

The image forming apparatus according to claim 1, further comprising operation means for accepting an operation from a user of the image forming apparatus,
wherein the reading means operates in response to an operation by the user via the operation means.

A method for controlling an image forming apparatus that forms an image, the method comprising:
a reading step of reading a document; and
an estimation processing step of estimating a chart area in the document read in the reading step, using a learned model that has undergone machine learning for estimating a chart area that is at least one of a graph area containing a graph and a table area containing a table in a document.

A program for causing a computer to execute each means of the image forming apparatus according to claim 1.