JP6944127B2

JP6944127B2 - Image processing equipment, computer programs, and image processing methods

Info

Publication number: JP6944127B2
Application number: JP2017247063A
Authority: JP
Inventors: 竜司山田
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2017-12-22
Filing date: 2017-12-22
Publication date: 2021-10-06
Anticipated expiration: 2037-12-22
Also published as: JP2019114927A

Description

This specification relates to image processing of image data, and in particular to image processing for identifying character pixels in an image.

In image data, for example image data generated by reading printed matter with an image sensor, the halftone dots contained in the printed matter appear in the image represented by the image data. Pixels that make up such halftone dots are prone to being misidentified as character pixels when the character pixels in the image are identified.

The image processing apparatus disclosed in Patent Document 1 performs an edge determination, which determines for each pixel whether the pixel is an edge, and a halftone dot determination, which determines for each pixel whether the pixel is a halftone dot. The apparatus identifies pixels that are edges and are not halftone dots as pixels representing characters.

Japanese Unexamined Patent Publication No. Hei 6-164928
Japanese Unexamined Patent Publication No. 2016-38732

Thus, there has been a demand for a technique that can accurately identify the character pixels constituting characters in a target image.

This specification discloses a technique that can accurately identify character pixels in a target image.

The technique disclosed in this specification has been made to solve at least part of the problem described above, and can be implemented as the following application examples.

[Application Example 1] An image processing apparatus comprising: an image acquisition unit that acquires target image data representing a target image; a candidate pixel extraction unit that uses the target image data to extract a plurality of character candidate pixels, which are candidates for character pixels constituting characters; a determination unit that uses the target image data to determine, for each block, whether each of a plurality of blocks arranged on the target image is a character block representing characters, the block-by-block determination being performed using a machine learning model trained with a plurality of character image data representing characters and a plurality of non-character image data not representing characters; and a character pixel identification unit that uses the determination results of the determination unit to identify, from among the plurality of character candidate pixels, a plurality of character pixels representing characters.

According to the above configuration, a plurality of character pixels are identified from among the plurality of character candidate pixels using the results of determining, block by block with a machine learning model, whether each block is a character block. As a result, the character pixels in the target image can be identified accurately.
[Application Example 2]
The image processing apparatus according to Application Example 1, wherein
the candidate pixel extraction unit extracts the plurality of character candidate pixels without performing, on the target image data, edge adjustment processing that adjusts the strength of edges in the target image.
[Application Example 3]
The image processing apparatus according to Application Example 1 or 2, wherein
the target image data includes color values of a plurality of pixels,
each color value includes a plurality of component values, and
the candidate pixel extraction unit:
uses the target image data to generate first image data including a plurality of first values corresponding to the color values of the plurality of pixels, each of the first values being based on either the minimum or the maximum of the component values of the corresponding color value; and
extracts the plurality of character candidate pixels using the first image data.
[Application Example 4]
The image processing apparatus according to any one of Application Examples 1 to 3, wherein
the candidate pixel extraction unit:
uses the target image data to generate second image data including a plurality of second values, each indicating the luminance of the corresponding pixel among the plurality of pixels in the target image; and
binarizes the second image data so that pixels having a luminance higher than a reference are extracted as the character candidate pixels.
[Application Example 5]
The image processing apparatus according to any one of Application Examples 1 to 4, wherein
the candidate pixel extraction unit:
performs first extraction processing to extract a plurality of first character candidate pixels among the plurality of character candidate pixels; and
performs second extraction processing, different from the first extraction processing, to extract a plurality of second character candidate pixels among the plurality of character candidate pixels, and
the character pixel identification unit:
uses the determination results of the determination unit to identify a plurality of first pixels from among the plurality of first character candidate pixels;
uses the determination results of the determination unit to identify a plurality of second pixels from among the plurality of second character candidate pixels; and
identifies the plurality of character pixels, which include the plurality of first pixels and the plurality of second pixels.
[Application Example 6]
The image processing apparatus according to Application Example 5, wherein
the machine learning model used by the determination unit includes a first machine learning model and a second machine learning model different from the first machine learning model,
the determination unit:
uses the first machine learning model to determine, for each block, whether each of the plurality of blocks is a character block; and
uses the second machine learning model to determine, for each block, whether each of the plurality of blocks is a character block, and
the character pixel identification unit:
identifies the plurality of first pixels using the determination results obtained with the first machine learning model; and
identifies the plurality of second pixels using the determination results obtained with the second machine learning model.
[Application Example 7]
The image processing apparatus according to Application Example 6, wherein
the plurality of character image data includes a plurality of first character image data and a plurality of second character image data different from the plurality of first character image data,
the plurality of non-character image data includes a plurality of first non-character image data and a plurality of second non-character image data different from the plurality of first non-character image data,
the first machine learning model is the machine learning model trained using the plurality of first character image data and the plurality of first non-character image data, and
the second machine learning model is the machine learning model trained using the plurality of second character image data and the plurality of second non-character image data.
[Application Example 8]
The image processing apparatus according to Application Example 7, wherein
the target image data includes color values of a plurality of pixels,
each color value includes a plurality of component values,
the first extraction processing includes:
processing that uses the target image data to generate first image data including a plurality of first values corresponding to the color values of the plurality of pixels, each of the first values being based on either the minimum or the maximum of the component values of the corresponding color value; and
processing that identifies the plurality of first pixels by binarizing the first image data,
the second extraction processing includes:
processing that uses the target image data to generate second image data including a plurality of second values, each indicating the luminance of the corresponding pixel among the plurality of pixels in the target image; and
processing that identifies the plurality of second pixels by binarizing the second image data so that pixels having a luminance higher than a reference are identified as the character candidate pixels,
the first machine learning model is the machine learning model trained using the plurality of first character image data, which represent first characters having a lower luminance than their background, and the plurality of first non-character image data, which do not represent the first characters, and
the second machine learning model is the machine learning model trained using the plurality of second character image data, which represent second characters having a higher luminance than their background, and the plurality of second non-character image data, which do not represent the second characters.
[Application Example 9]
The image processing apparatus according to any one of Application Examples 1 to 8, further comprising
an image processing unit that generates processed target image data by performing first image processing on the values of the identified plurality of character pixels in the target image data and performing second image processing, different from the first image processing, on the values of pixels other than the plurality of character pixels.
[Application Example 10]
The image processing apparatus according to Application Example 9, further comprising
a print data generation unit that generates print data using the processed target image data.
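The relationship in Application Example 1 between the candidate pixels, the block-by-block determination, and the final character pixels can be sketched as follows. This is only an illustrative sketch under stated assumptions: the blocks are taken to be non-overlapping squares, the candidate pixels are held as a nested list of 0/1 values, and the trained machine learning model is replaced by a hypothetical callable `is_character_block`. None of these names or choices come from the specification.

```python
from typing import Callable, List

Mask = List[List[int]]


def identify_character_pixels(
    candidates: Mask,
    block_size: int,
    is_character_block: Callable[[int, int], bool],
) -> Mask:
    """Keep a candidate pixel only if its block is judged to be a character block.

    candidates: 1 marks a character candidate pixel, 0 marks any other pixel.
    is_character_block: hypothetical stand-in for the trained model; it receives
    the top-left corner (row, column) of a block and returns True or False.
    """
    height, width = len(candidates), len(candidates[0])
    result = [[0] * width for _ in range(height)]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            if not is_character_block(top, left):
                continue  # candidates in non-character blocks are discarded
            for y in range(top, min(top + block_size, height)):
                for x in range(left, min(left + block_size, width)):
                    result[y][x] = candidates[y][x]
    return result


# Every pixel is a candidate, but only the top-left block is judged to be text.
all_candidates = [[1] * 4 for _ in range(4)]
character_mask = identify_character_pixels(
    all_candidates, 2, lambda top, left: (top, left) == (0, 0)
)
```

In this toy run, candidate pixels outside the single character block are dropped, which corresponds to removing halftone dot pixels that were wrongly extracted as candidates.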

The technique disclosed in this specification can be implemented in various forms, for example as a multifunction device, a scanner, a printer, an image processing method, a computer program for implementing the functions of these devices or the above method, or a recording medium on which the computer program is recorded.

A block diagram showing the configuration of the multifunction device 200, which is an example of an image processing apparatus.
A flowchart of the image processing.
A first diagram showing an example of images used in the image processing.
A diagram showing an example of images used in the character identification processing.
A flowchart of the first binary image data generation processing.
An explanatory diagram of the minimum and maximum component values of the scan data.
A second diagram showing an example of images used in the image processing.
A flowchart of the second binary image data generation processing.
A flowchart of the block determination processing.
An explanatory diagram of the plurality of blocks BL arranged on the scanned image SI.
A diagram showing an example of the determination for each block BL.
A diagram showing an example of how pixel values are set in the block determination data.
A diagram explaining the effects of the embodiment.

A. Embodiment:
A-1: Configuration of the multifunction device 200
An embodiment will be described based on an example. FIG. 1 is a block diagram showing the configuration of the multifunction device 200, which is an example of an image processing apparatus. The multifunction device 200 includes a CPU 210, which is a processor that controls the image processing apparatus; a volatile storage device 220 such as a DRAM; a non-volatile storage device 230 such as a flash memory or a hard disk drive; a display unit 240 such as a liquid crystal display; an operation unit 250 including buttons and a touch panel superimposed on the liquid crystal display; an interface (communication IF) 270 for communicating with external devices such as a user's terminal device 100; a print execution unit 280; and a reading execution unit 290.

Under the control of the CPU 210, the reading execution unit 290 generates scan data by optically reading a document with a one-dimensional image sensor. Under the control of the CPU 210, the print execution unit 280 prints an image on a print medium such as paper by a laser method, using toners of multiple types, specifically cyan (C), magenta (M), yellow (Y), and black (K), as color materials. Specifically, the print execution unit 280 exposes a photosensitive drum to form an electrostatic latent image and makes toner adhere to the latent image to form a toner image. The print execution unit 280 then transfers the toner image formed on the photosensitive drum to the paper. In a modification, the print execution unit 280 may be an inkjet print execution unit that forms an image on paper by ejecting ink as the color material.

The volatile storage device 220 provides a buffer area that temporarily stores the various intermediate data generated when the CPU 210 performs processing. The non-volatile storage device 230 stores a computer program PG. The computer program PG is a control program that causes the CPU 210 to control the multifunction device 200. In this embodiment, the computer program PG is provided by being stored in the non-volatile storage device 230 in advance when the multifunction device 200 is manufactured. Alternatively, the computer program PG may be provided by download from a server, or stored on a DVD-ROM or the like. By executing the computer program PG, the CPU 210 can perform the image processing described below.

A-2: Image processing
FIG. 2 is a flowchart of the image processing. This image processing is executed, for example, when the user places a document on the platen of the reading execution unit 290 and inputs a copy instruction. The processing acquires scan data generated by reading the document with the reading execution unit 290 and uses the scan data to generate print data representing the document, thereby producing a copy of the document.

In S10, the CPU 210 generates scan data as the target image data by using the reading execution unit 290 to read the document that the user has placed on the platen. The document is, for example, printed matter on which an image has been printed by the multifunction device 200 or by a printer (not shown). The generated scan data is stored in the buffer area of the volatile storage device 220 (FIG. 1). The scan data includes the values of a plurality of pixels, each of which expresses the color of the pixel as a color value in the RGB color system (also called an RGB value). In other words, the scan data is RGB image data. The RGB value of one pixel includes, for example, the values of three color components, red (R), green (G), and blue (B) (hereinafter also called the R value, G value, and B value). In this embodiment, each component value has 256 gradations.

Scan data that is RGB image data can be said to include three sets of component image data (R component image data, G component image data, and B component image data) corresponding to the three color components of the RGB color system. Each set of component image data is image data whose pixel values are the values of one color component.
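As a minimal illustration of this decomposition, the sketch below splits RGB image data into three component planes. It assumes the image is held as nested Python lists of (R, G, B) tuples with values from 0 to 255; the function name `split_components` and the data layout are hypothetical and do not come from the specification.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
Plane = List[List[int]]


def split_components(image: RGBImage) -> Tuple[Plane, Plane, Plane]:
    """Return the R, G, and B component planes of an RGB image."""
    r_plane = [[pixel[0] for pixel in row] for row in image]
    g_plane = [[pixel[1] for pixel in row] for row in image]
    b_plane = [[pixel[2] for pixel in row] for row in image]
    return r_plane, g_plane, b_plane


# A 2x2 example image: each plane keeps one component value per pixel.
scan = [
    [(255, 0, 0), (10, 20, 30)],
    [(0, 0, 0), (255, 255, 255)],
]
r_plane, g_plane, b_plane = split_components(scan)
```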

FIG. 3 is a first diagram showing an example of images used in the image processing. FIG. 3(A) shows an example of the scanned image SI represented by the scan data. The scanned image SI includes a plurality of pixels, which are arranged in a matrix along a first direction D1 and a second direction D2 orthogonal to the first direction D1.

The scanned image SI in FIG. 3(A) includes a white background Bg1 representing the base color of the document paper, three objects Ob1 to Ob3 that are not characters, four characters Ob4 to Ob7, and backgrounds Bg2 and Bg3 of the four characters Ob4 to Ob7. An object that is not a character is, for example, a photograph. The backgrounds Bg2 and Bg3 are uniform images with a color other than white. The characters Ob4 and Ob5 on the background Bg2 have a darker color than the background Bg2, that is, a lower luminance than the background Bg2. The characters Ob6 and Ob7 on the background Bg3 have a lighter color than the background Bg3, that is, a higher luminance than the background Bg3.

In S20, the CPU 210 performs character identification processing on the scan data. The character identification processing identifies character pixels by classifying the plurality of pixels in the scanned image SI into character pixels, which represent characters, and non-character pixels, which do not.

The character identification processing generates, for example, binary image data (also called character identification data) in which the value of each character pixel is "1" and the value of each non-character pixel is "0". FIG. 3(B) shows an example of the character identification image TI represented by the character identification data. In the character identification image TI, the pixels that make up the four characters Ob4 to Ob7 in the scanned image SI are identified as character pixels Tp4 to Tp7. The character identification processing is described in detail later.

In S30, the CPU 210 performs halftone dot smoothing on the scan data to generate smoothed image data representing a smoothed image. Specifically, the CPU 210 applies smoothing with a smoothing filter such as a Gaussian filter to each of the values of the plurality of non-character pixels in the scan data and calculates the smoothed values of those non-character pixels. The non-character pixels to be smoothed are identified by referring to the character identification data generated by the classification in S20. The CPU 210 generates smoothed image data that includes the values of the plurality of character pixels in the scan data and the smoothed values of the plurality of non-character pixels.
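The masked smoothing of S30 might look like the following sketch. It is an illustration under assumptions rather than the implementation in the specification: the smoothing filter is a fixed 3x3 Gaussian-like kernel with weights summing to 16, border pixels are handled by clamping coordinates, and the image and the character identification data are nested lists. The name `smooth_non_character` is hypothetical.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
Mask = List[List[int]]

# Assumed 3x3 Gaussian-like kernel; the weights sum to 16.
KERNEL = ((1, 2, 1), (2, 4, 2), (1, 2, 1))


def smooth_non_character(image: RGBImage, char_mask: Mask) -> RGBImage:
    """Smooth only pixels whose mask value is 0 (non-character pixels)."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]  # character pixels keep their scan values
    for y in range(height):
        for x in range(width):
            if char_mask[y][x] == 1:
                continue
            acc = [0, 0, 0]
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbor coordinates at the image border.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    weight = KERNEL[dy + 1][dx + 1]
                    for c in range(3):
                        acc[c] += weight * image[yy][xx][c]
            # Divide by the kernel sum with rounding to the nearest integer.
            out[y][x] = tuple((v + 8) // 16 for v in acc)
    return out
```

Note that the filter always reads from the original scan values, so the result does not depend on the order in which pixels are visited.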

FIG. 3(C) shows the smoothed image GI represented by the smoothed image data. The smoothed image GI includes a white background Bg1g, and objects Ob1g to Ob7g and backgrounds Bg2g and Bg3g, which are the smoothed versions of the objects Ob1 to Ob7 and backgrounds Bg2 and Bg3 in the scanned image SI. Of these, the portions other than the characters Ob4g to Ob7g (also called non-character portions) are smoothed compared with the scanned image SI.

In S40, the CPU 210 performs character sharpening on the smoothed image data to generate processed image data. Specifically, the CPU 210 applies sharpening, such as unsharp masking or a sharpening filter, to each of the values of the plurality of character pixels in the smoothed image data and calculates the sharpened values of those character pixels. The character pixels to be sharpened are identified by referring to the character identification data generated by the classification in S20. The CPU 210 then generates processed image data that includes the values of the plurality of non-character pixels in the smoothed image data (the smoothed values of the non-character pixels) and the sharpened values of the plurality of character pixels. Because the character pixels are not subject to smoothing, their values in the smoothed image data are the same as their values in the scan data. The character sharpening in this step can therefore also be said to be performed on the values of the character pixels in the scan data.
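A sketch of masked sharpening in the spirit of S40 follows. It assumes an unsharp mask built from a 3x3 box blur, applied to a single component plane, with a hypothetical `amount` parameter that controls the strength; none of these specifics come from the specification, which names unsharp masking only as one possible sharpening method.

```python
from typing import List

Plane = List[List[int]]
Mask = List[List[int]]


def sharpen_character_pixels(plane: Plane, char_mask: Mask, amount: float = 1.0) -> Plane:
    """Apply an unsharp mask only to pixels whose mask value is 1."""
    height, width = len(plane), len(plane[0])
    out = [row[:] for row in plane]  # non-character pixels are left unchanged
    for y in range(height):
        for x in range(width):
            if char_mask[y][x] != 1:
                continue
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp neighbor coordinates at the image border.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += plane[yy][xx]
            blurred = total / 9.0
            # Unsharp mask: add back the difference between the pixel and its blur.
            value = plane[y][x] + amount * (plane[y][x] - blurred)
            out[y][x] = max(0, min(255, int(round(value))))
    return out
```

With this formulation, a dark character pixel on a lighter background becomes darker, which increases the contrast at the character edge.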

FIG. 3(D) shows the processed image FI represented by the processed image data. The processed image FI includes a white background Bg1f, and objects Ob1f to Ob7f and backgrounds Bg2f and Bg3f corresponding to the objects Ob1 to Ob7 and backgrounds Bg2 and Bg3 in the scanned image SI. Of these, the edges of the characters Ob4f to Ob7f are sharpened compared with the characters Ob4 to Ob7 in the scanned image SI and the characters Ob4g to Ob7g in the smoothed image GI. The edges of the non-character objects Ob1f to Ob3f and the backgrounds Bg2f and Bg3f are not sharpened.

As the above description shows, the objects Ob1f to Ob7f and backgrounds Bg2f and Bg3f in the processed image FI include sharpened characters and smoothed non-characters.

In S50, the CPU 210 performs print data generation processing, which generates print data using the processed image data. Specifically, color conversion is performed on the processed image data, which is RGB image data, to generate CMYK image data in which the color of each pixel is expressed as a CMYK value, a color value having color components (C, M, Y, and K) corresponding to the color materials used for printing. The color conversion is performed, for example, by referring to a known look-up table. Halftone processing is then performed on the CMYK image data to generate dot data indicating the dot formation state for each color material used for printing and for each pixel. The dot formation state can take, for example, two states (dot, no dot) or four states (large dot, medium dot, small dot, no dot). The halftone processing is performed according to, for example, a dither method or an error diffusion method. The dot data is rearranged into the order in which it is used during printing, and print commands are added to it to generate the print data.
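The two stages of S50 can be illustrated with a deliberately simplified sketch. The specification performs color conversion with a look-up table; the code below swaps in a naive arithmetic RGB-to-CMYK formula purely for illustration. The halftone stage uses ordered dithering with an assumed 2x2 Bayer matrix and two dot states (dot, no dot). The function names and the matrix are hypothetical choices, not taken from the specification.

```python
from typing import List, Tuple

Plane = List[List[int]]

# Assumed 2x2 Bayer index matrix for ordered dithering.
BAYER_2X2 = ((0, 2), (3, 1))


def rgb_to_cmyk(rgb: Tuple[int, int, int]) -> Tuple[int, int, int, int]:
    """Naive conversion (not a look-up table): K is the shared part of C, M, Y."""
    r, g, b = rgb
    c, m, y = 255 - r, 255 - g, 255 - b
    k = min(c, m, y)
    return c - k, m - k, y - k, k


def ordered_dither(plane: Plane) -> Plane:
    """Turn one 0-255 color-material plane into dot data (1 = dot, 0 = no dot)."""
    dots = []
    for y, row in enumerate(plane):
        dot_row = []
        for x, value in enumerate(row):
            # Thresholds are spread evenly over 0-255 according to the matrix.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) * 255 / 4
            dot_row.append(1 if value > threshold else 0)
        dots.append(dot_row)
    return dots


# A mid-level plane yields dots on about half of the pixels.
half_tone = ordered_dither([[128, 128], [128, 128]])
```

Because the thresholds repeat with the period of the matrix, the resulting dots are arranged periodically, which is the kind of periodic structure the moire discussion in this section refers to.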

In S60, the CPU 210 performs print processing and ends the image processing. Specifically, the CPU 210 supplies the print data to the print execution unit 280 and causes the print execution unit 280 to print the processed image.

According to the image processing described above, first image processing (specifically, edge sharpening) is performed on the values of the identified character pixels in the scan data (S40), and second image processing different from the first (specifically, halftone dot smoothing) is performed on the values of the non-character pixels (S30), to generate the processed image data. Because different image processing is applied to the values of character pixels and to the values of pixels other than character pixels, appropriate image processing of the scan data can be achieved. In a modification, the character sharpening of S40 may be performed first, followed by the halftone dot smoothing of S30.

More specifically, processed image data is generated that includes the sharpened values of the plurality of character pixels and the smoothed values of the plurality of non-character pixels (S30, S40). As a result, processed image data representing a good-looking processed image FI can be generated.

For example, as shown in the processed image FI of FIG. 3(D), sharpened values are used as the values of the character pixels in the processed image data. As a result, the characters in the processed image FI look sharp, which improves, for example, the appearance of the printed processed image FI.

Further, in the processed image data, smoothed values are used as the values of the non-character pixels that constitute the background Bg2 in the processed image FI and objects other than characters, such as photographs. This suppresses the appearance of, for example, halftone dots that cause moire in portions of the processed image FI other than characters, so defects such as moire are less likely to occur in the printed processed image FI. As a result, the appearance of the printed processed image FI can be improved. In addition, because excessive emphasis of edges within photographs is suppressed, the appearance of the printed processed image FI can be improved further.

For example, the document used to generate the scan data is a printed matter on which an image has been printed. Therefore, a uniform portion of the document, such as the background Bg2 having a color other than white, forms halftone dots when viewed at the level of the dots that form the image. The halftone includes a plurality of dots and portions where no dot is placed (portions showing the ground color of the document). Consequently, at the pixel level, halftone dots appear in the regions of the scanned image SI that show the backgrounds Bg2 and Bg3. The dots within the halftone are arranged periodically because of, for example, the dither matrix used when the document was printed. When printing is performed using the scan data, the periodic component of the halftone dots present in the original image (scanned image SI) before halftone processing interferes with the periodic component of the halftone dots that constitute the printed image, so moire tends to appear. In the processed image FI of this embodiment, the smoothing process reduces the periodic component of the dots in portions of the original image (scanned image SI) other than edges. As a result, when the processed image FI is printed using the processed image data, the occurrence of moire in the printed processed image FI can be suppressed.

In particular, in the above image processing, print data is generated using the processed image data (S50), so appropriate print data can be generated that suppresses, for example, the moire that tends to occur in the printed processed image FI.

A-3: Character identification process
The character identification process of S20 in FIG. 2 will be described. In S21, the CPU 210 executes a first binary image data generation process using the scan data to generate first binary image data. The first binary image data is binary data that indicates character candidate pixels and pixels other than character candidate pixels. A character candidate pixel is a candidate for a character pixel to be identified by the character identification process. The character candidate pixels indicated by the first binary image data are also called first character candidate pixels.

FIG. 4 is a diagram showing examples of images used in the character identification process. FIG. 4(A) shows an example of the first binary image CI1 represented by the first binary image data. In the first binary image CI1, black portions indicate pixels identified as character candidate pixels, and white portions indicate pixels other than character candidate pixels.

In the first binary image CI1, the pixels Cp1 to Cp3 constituting the objects Ob1 to Ob3 other than characters in the scanned image SI, the pixels Cp4 and Cp5 constituting the characters Ob4 and Ob5, and the pixels Cpb constituting the whole of the background Bg3 together with the characters Ob6 and Ob7 are identified as first character candidate pixels. In the first binary image CI1, the pixels constituting the characters Ob6 and Ob7 are not identified separately from the background Bg3. Thus, the identified first character candidate pixels may include pixels that constitute objects other than characters or the background. This is because a single type of binarization process can hardly extract all character candidate pixels, and can hardly exclude all pixels that do not constitute characters. Details of the first binary image data generation process will be described later. In this embodiment, the first binary image data generation process extracts first character candidate pixels that include the character pixels of characters having lower luminance than the background.

In S22, the CPU 210 executes a first block determination process on the scan data to generate binary image data (also called first block determination data) that indicates character blocks, which show characters, and non-character blocks, which do not. The first block determination data is binary data in which the pixels constituting character blocks have the value "1" and the pixels constituting non-character blocks have the value "0". The first block determination process uses the scan data to determine, block by block, whether each of a plurality of blocks arranged in the scanned image SI is a character block. One block is a rectangular region containing N pixels (N is an integer of 2 or more). Details of the first block determination process will be described later.
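The shape of the block determination data can be sketched as follows. The criterion that decides whether a block is a character block is described later and is not reproduced here; the sketch only shows how per-block results could be expanded into per-pixel "1"/"0" values. It assumes that blocks tile the image in a non-overlapping grid, which the source does not state, and the function name `expand_block_flags` is hypothetical.

```python
def expand_block_flags(block_flags, block_w, block_h, width, height):
    """Expand per-block flags (1 = character block, 0 = non-character block)
    into per-pixel block determination data of size height x width.

    Assumes non-overlapping block_w x block_h blocks laid out on a grid.
    """
    data = []
    for y in range(height):
        row = []
        for x in range(width):
            # Every pixel inherits the flag of the block that contains it.
            row.append(block_flags[y // block_h][x // block_w])
        data.append(row)
    return data


# Two 2x2 blocks side by side: the left one is a character block.
block_data = expand_block_flags([[1, 0]], block_w=2, block_h=2, width=4, height=2)
```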

FIG. 4(B) shows an example of the first block determination image BI1 represented by the first block determination data. In the first block determination image BI1, the character blocks Bk4 and Bk5 corresponding to the regions of the scanned image SI where the characters Ob4 and Ob5 are placed are identified. Thus, the character blocks identified by the first block determination data do not include blocks corresponding to regions that contain objects other than characters. The character blocks identified by the first block determination data are blocks that show characters having lower luminance than the background. For this reason, the character blocks Bk6 and Bk7 (FIG. 4(D)), which correspond to the regions where the characters Ob6 and Ob7 having higher luminance than the background are placed, are not identified in the first block determination image BI1.

In S23, the CPU 210 executes a logical product (AND) composition process using the first binary image data generated in S21 and the first block determination data generated in S22. This generates first character identification data that indicates a plurality of first character pixels. Specifically, the CPU 210 generates binary image data as the first character identification data by taking the logical product, pixel by pixel, of the first binary image data and the first block determination data. In other words, among the plurality of pixels in the scanned image SI, the CPU 210 identifies as first character pixels those pixels that were identified as first character candidate pixels in S21 and that are located within a character block identified in S22. The CPU 210 does not identify as first character pixels the pixels that were not identified as first character candidate pixels or the pixels within non-character blocks.
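The per-pixel logical product of S23 can be sketched as follows. This is a minimal illustration that assumes both inputs are equally sized lists of rows holding the values 0 and 1; the function name `and_composite` is hypothetical and does not appear in the source.

```python
def and_composite(binary_image, block_data):
    """Per-pixel logical product of two equally sized 0/1 images.

    A pixel is 1 only if it is a character candidate pixel (binary_image)
    and lies inside a character block (block_data).
    """
    return [
        [candidate & in_block for candidate, in_block in zip(img_row, blk_row)]
        for img_row, blk_row in zip(binary_image, block_data)
    ]


# Candidates appear in both halves, but only the left half is a character block.
first_binary = [[1, 0, 1, 1],
                [0, 1, 1, 1]]
first_blocks = [[1, 1, 0, 0],
                [1, 1, 0, 0]]
first_characters = and_composite(first_binary, first_blocks)
```

In this example the candidates in the right half are dropped because they lie in a non-character block, which mirrors how the pixels of non-character objects and backgrounds are removed.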

As shown in the first binary image CI1 of FIG. 4(A), the plurality of first character candidate pixels identified by the first binary image data include, in addition to the pixels Cp4 and Cp5 constituting the characters Ob4 and Ob5 in the scanned image SI, the pixels Cp1 to Cp3 and Cpb constituting the other objects Ob1 to Ob3 and the background Bg3. As shown in the first block determination image BI1 of FIG. 4(B), taking the logical product of the first binary image data and the first block determination data causes the pixels Cp4 and Cp5, which constitute the characters Ob4 and Ob5 in the scanned image SI, to be selectively identified as first character pixels in the first character identification data. That is, the plurality of first character pixels include the pixels Cp4 and Cp5 constituting the characters Ob4 and Ob5 in the scanned image SI, and do not include the pixels Cp1 to Cp3 and Cpb constituting the other objects Ob1 to Ob3 and the background Bg3.

In S24, the CPU 210 executes a second binary image data generation process using the scan data to generate second binary image data. The second binary image data is binary data that indicates character candidate pixels and pixels other than character candidate pixels. The character candidate pixels indicated by the second binary image data are also called second character candidate pixels.

FIG. 4(C) shows an example of the second binary image CI2 represented by the second binary image data. In the second binary image CI2, black portions indicate pixels identified as second character candidate pixels, and white portions indicate pixels other than character candidate pixels. The same applies to FIG. 4(C).

In the second binary image CI2, the pixels constituting the characters Ob6 and Ob7, which have higher luminance than the background Bg3, and the pixels Cp8 constituting the background Bg1 are identified as second character candidate pixels. In the second binary image CI2, the pixels constituting the characters Ob4 and Ob5 are not identified. Thus, like the first character candidate pixels, the second character candidate pixels may include pixels that constitute objects other than characters or the background. Details of the second binary image data generation process will be described later. In this embodiment, the second binary image data generation process extracts second character candidate pixels that include the character pixels of characters having higher luminance than the background.

In S25, the CPU 210 executes a second block determination process on the scan data to generate binary image data (also called second block determination data) that indicates character blocks, which show characters, and non-character blocks, which do not. Like the first block determination data, the second block determination data is binary data in which the pixels constituting character blocks have the value "1" and the pixels constituting non-character blocks have the value "0". Like the first block determination process, the second block determination process uses the scan data to determine, block by block, whether each of a plurality of blocks arranged in the scanned image SI is a character block. Details of the second block determination process will be described later.

FIG. 4(D) shows an example of the second block determination image BI2 represented by the second block determination data. In the second block determination image BI2, the character blocks Bk6 and Bk7 corresponding to the regions of the scanned image SI where the characters Ob6 and Ob7 are placed are identified. Thus, the character blocks identified by the second block determination data do not include blocks corresponding to regions that contain objects other than characters. The character blocks identified by the second block determination data are blocks that show characters having higher luminance than the background. For this reason, the character blocks Bk4 and Bk5 (FIG. 4(B)), which correspond to the regions where the characters Ob4 and Ob5 having lower luminance than the background are placed, are not identified in the second block determination image BI2.

In S26, the CPU 210 executes a logical product (AND) composition process using the second binary image data generated in S24 and the second block determination data generated in S25. This generates second character identification data that indicates a plurality of second character pixels. Specifically, the CPU 210 generates binary image data as the second character identification data by taking the logical product, pixel by pixel, of the second binary image data and the second block determination data. In other words, among the plurality of pixels in the scanned image SI, the CPU 210 identifies as second character pixels those pixels that were identified as second character candidate pixels in S24 and that are located within a character block identified in S25. The CPU 210 does not identify as second character pixels the pixels that were not identified as second character candidate pixels or the pixels within non-character blocks.

As shown in the second binary image CI2 of FIG. 4(C), the plurality of second character candidate pixels identified by the second binary image data include, in addition to the pixels Cp6 and Cp7 constituting the characters Ob6 and Ob7 in the scanned image SI, the pixels Cp8 constituting the background Bg1 and the like. As shown in the second block determination image BI2 of FIG. 4(D), taking the logical product of the second binary image data and the second block determination data causes the pixels Cp6 and Cp7, which constitute the characters Ob6 and Ob7 in the scanned image SI, to be selectively identified as second character pixels in the second character identification data. That is, the plurality of second character pixels include the pixels Cp6 and Cp7 constituting the characters Ob6 and Ob7 in the scanned image SI, and do not include the pixels Cp8 constituting the background Bg1 and the like.

In S27, the CPU 210 executes a logical sum (OR) composition process using the first character identification data generated in S23 and the second character identification data generated in S26. This generates character identification data that indicates the plurality of character pixels to be finally identified. In other words, the CPU 210 finally identifies, as the plurality of character pixels, a pixel group that includes the plurality of first character pixels identified by the first character identification data and the plurality of second character pixels identified by the second character identification data, and that includes no pixel that is neither a first character pixel nor a second character pixel. As a result, using the first character identification data and the second character identification data together effectively reduces omissions in identifying the plurality of character pixels in the scanned image SI. For example, as in the character identification image TI shown in FIG. 4(B), the plurality of character pixels identified by the final character identification data include the character pixels Tp4 to Tp7 constituting the characters Ob4 to Ob7 in the scanned image SI, and do not include pixels that constitute other objects or the background.
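The logical sum of S27 can be sketched in the same way. The sketch assumes that the first and second character identification data are equally sized lists of rows holding 0 and 1; the function name `or_composite` and the sample values are hypothetical.

```python
def or_composite(first_data, second_data):
    """Per-pixel logical sum of two equally sized 0/1 images.

    A pixel is a final character pixel if it is a first character pixel,
    a second character pixel, or both.
    """
    return [
        [a | b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first_data, second_data)
    ]


# Dark-on-light characters found by the first path (left half) and
# light-on-dark characters found by the second path (right half).
first_characters = [[1, 0, 0, 0],
                    [0, 1, 0, 0]]
second_characters = [[0, 0, 0, 1],
                     [0, 0, 1, 0]]
character_data = or_composite(first_characters, second_characters)
```

Pixels that neither path marks stay 0, so the union adds the characters found by either path without adding any other pixel.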

A-4: First binary image data generation process
The first binary image data generation process of S21 in FIG. 2 will be described. FIG. 5 is a flowchart of the first binary image data generation process. In S100, the CPU 210 generates minimum component data using the scan data. Specifically, the CPU 210 acquires a minimum component value Vmin from the value (RGB value) of each of the plurality of pixels included in the scan data. The minimum component value Vmin is the smallest of the component values (R value, G value, B value) included in the RGB value. The CPU 210 generates, as the minimum component data, image data whose pixel values are these minimum component values Vmin. The minimum component data is image data representing an image of the same size as the scanned image SI. Each pixel value in the minimum component data is the minimum component value Vmin of the value (RGB value) of the corresponding pixel in the scan data.
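S100 can be sketched as follows, assuming the scan data is held as a list of rows of `(R, G, B)` tuples with components in the range 0 to 255. The representation and the function name `min_component_data` are assumptions made for illustration.

```python
def min_component_data(scan_data):
    """Return an image of the same size as scan_data in which each pixel
    holds Vmin, the smallest of that pixel's R, G, and B values."""
    return [[min(r, g, b) for (r, g, b) in row] for row in scan_data]


# One row containing a cyan pixel, a yellow pixel, and a white pixel.
scan = [[(0, 255, 255), (255, 255, 0), (255, 255, 255)]]
vmin_image = min_component_data(scan)
```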

FIG. 6 is an explanatory diagram of the minimum component value and the maximum component value of the scan data. FIGS. 6(A) to 6(E) show, as bar graphs, the RGB values of cyan (C), magenta (M), yellow (Y), black (K), and white (W) as examples of RGB values. As shown in FIG. 6, the RGB values (R, G, B) of C, M, Y, K, and W are (0, 255, 255), (255, 0, 255), (255, 255, 0), (0, 0, 0), and (255, 255, 255), respectively.

As described above, the luminance Y of these RGB values can be calculated using, for example, the formula Y = 0.299×R + 0.587×G + 0.114×B. With this formula, the luminances of C, M, Y, K, and W (expressed as values from 0 to 255) are about 179, 105, 226, 0, and 255, which all differ from one another (FIG. 6). In contrast, the minimum component values Vmin of C, M, Y, K, and W are 0, 0, 0, 0, and 255, as shown in FIG. 6, so they are the same for every color except white (W).
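The comparison between luminance and minimum component value can be reproduced with a few lines of code. The sketch below evaluates the stated luminance formula and Vmin for the five RGB values listed for FIG. 6; the names `COLORS`, `luminance`, and `compare` are hypothetical, and rounding to the nearest integer is an assumption.

```python
# RGB values of C, M, Y, K, and W as listed for FIG. 6.
COLORS = {
    "C": (0, 255, 255),
    "M": (255, 0, 255),
    "Y": (255, 255, 0),
    "K": (0, 0, 0),
    "W": (255, 255, 255),
}


def luminance(rgb):
    """Weighted average of R, G, and B using the weights given in the text."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def compare(colors):
    """Map each color name to (rounded luminance, minimum component Vmin)."""
    return {name: (round(luminance(rgb)), min(rgb)) for name, rgb in colors.items()}


table = compare(COLORS)
for name, (y_value, vmin) in table.items():
    print(name, "luminance:", y_value, "Vmin:", vmin)
```

Running it shows distinct luminances for the five colors, while Vmin collapses C, M, Y, and K to a single value and keeps white apart.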

FIG. 7 is a second diagram showing examples of images used in the image processing. FIG. 7(A) is an enlarged view of the above-described halftone dot region in the scanned image SI. In the example of FIG. 7(A), the halftone dot region in the scanned image SI includes a plurality of M dots MD and a plurality of Y dots YD. Here, for the sake of explanation, the image showing an M dot MD is assumed to be a uniform image in the primary color magenta, and the image showing a Y dot YD is assumed to be a uniform image in the primary color yellow.

FIG. 7(B) shows an example of the minimum component image MNI represented by the minimum component data. This minimum component image MNI corresponds to the scanned image SI of FIG. 7(A). In the minimum component image MNI, the values of the pixels in the region MDb corresponding to an M dot MD of the scanned image SI and the values of the pixels in the region YDb corresponding to a Y dot YD are the same. As a comparative example, FIG. 7(C) shows a luminance image YI represented by luminance image data that indicates the luminance of each pixel. This luminance image YI corresponds to the scanned image SI of FIG. 7(A). In the luminance image YI, unlike the minimum component image MNI, the values of the pixels in the region MDd corresponding to an M dot MD of the scanned image SI differ from the values of the pixels in the region YDd corresponding to a Y dot YD.

As can be seen from the above description, in the minimum component image MNI, the differences among the values of the pixels that correspond to the portions of the document where C, M, Y, and K dots are formed are smaller than in the luminance image YI. In addition, in the minimum component image MNI, the values of the pixels in the ground color region, which corresponds to the region showing the ground color of the document (the white of the paper), are larger than the values of the pixels corresponding to the portions where dots are formed.

In S110, the CPU 210 executes a binarization process on the generated minimum component data to generate binary image data. The binary image data generated in this step is the first binary image data. For example, in the minimum component data, the CPU 210 classifies pixels whose value is equal to or less than a threshold (for example, 128) as first character candidate pixels, and does not classify pixels whose value is greater than the threshold as first character candidate pixels. In the binary image data, as described above, the value of a first character candidate pixel is "1" and the value of any other pixel is "0".
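The binarization of S110 can be sketched as follows, assuming the minimum component data is a list of rows of integers in the range 0 to 255. The function name `binarize_min_component` is hypothetical; the default threshold of 128 is the example value given in the text.

```python
def binarize_min_component(min_data, threshold=128):
    """Mark pixels whose minimum component value is at or below the
    threshold as first character candidate pixels (1); all others are 0."""
    return [[1 if v <= threshold else 0 for v in row] for row in min_data]


# Vmin of a dot-covered pixel (0), a boundary value (128),
# a slightly brighter pixel (129), and paper white (255).
first_binary = binarize_min_component([[0, 128, 129, 255]])
```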

According to the first binary image data generation process described above, the first binary image data is generated using the minimum component data. In general, characters with lower luminance than the background (that is, darker characters) consist mainly of dots, and the background often consists mainly of the white of the paper. In the minimum component data, as described with reference to FIG. 7, the values of the pixels in the ground color region, which corresponds to the region showing the ground color of the document (the white of the paper), are larger than the values of the pixels corresponding to the portions where dots are formed. Therefore, in the first binary image data, omissions in identifying the pixels that constitute characters with lower luminance than the background (that is, darker characters) can be suppressed.

For example, yellow (Y) has a lower density (higher luminance) than C, M, and K. For this reason, when yellow characters lie on a background in the ground color of the paper (white), binarizing luminance image data that indicates luminance may fail to identify the character pixels of the yellow characters as character candidate pixels. In this embodiment, the character pixels of such yellow characters can be properly identified as character candidate pixels even in that case. Thus, identifying the character pixels of characters that have lower luminance than the background by using the minimum component data makes it possible to identify characters that could not be identified from luminance image data alone. As a result, the accuracy of identifying character candidate pixels in the scanned image SI can be improved.
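The yellow-on-white case can be checked numerically. The sketch below applies the same "at or below 128" rule once to luminance and once to the minimum component value. Binarizing luminance with this rule is a comparative assumption introduced only for this illustration, and the names `is_dark_candidate` and `classify` are hypothetical.

```python
YELLOW = (255, 255, 0)
WHITE = (255, 255, 255)


def luminance(rgb):
    """Luminance formula given in the text."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def is_dark_candidate(value, threshold=128):
    """True if the value is at or below the threshold."""
    return value <= threshold


def classify(rgb):
    """Candidate decision based on luminance versus minimum component."""
    return {
        "by_luminance": is_dark_candidate(luminance(rgb)),
        "by_vmin": is_dark_candidate(min(rgb)),
    }


yellow_result = classify(YELLOW)  # text pixel
white_result = classify(WHITE)    # paper background pixel
```

Under these assumptions the yellow text pixel is missed when the decision uses luminance but is picked up when it uses the minimum component value, while the white background is rejected in both cases.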

A-5: Second binary image data generation process
The second binary image data generation process of S24 in FIG. 2 will be described. FIG. 8 is a flowchart of the second binary image data generation process. In S200, the CPU 210 generates luminance image data using the scan data. Specifically, the CPU 210 calculates the luminance Y of each pixel using the R value, G value, and B value of that pixel acquired from the scan data. The luminance Y is, for example, a weighted average of these three components and can be calculated using the formula Y = 0.299×R + 0.587×G + 0.114×B. Thus, each pixel value in the luminance image data indicates the luminance Y of the corresponding pixel in the scanned image SI.

In S210, the CPU 210 executes a binarization process on the generated luminance image data to generate binary image data. For example, in the luminance image data, the CPU 210 classifies pixels whose value (that is, luminance) is equal to or greater than a threshold (for example, 128) as character candidate pixels, and classifies pixels whose value is less than the threshold as pixels other than character candidate pixels. In the binary image data, as described above, the value of a character candidate pixel is "1" and the value of any other pixel is "0". The binary image data generated in S210 is the second binary image data. As described above, the second binary image data identifies second character candidate pixels that include the character pixels of characters having higher luminance than the background (that is, lighter characters).
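S200 and S210 together can be sketched as follows, again assuming the scan data is a list of rows of `(R, G, B)` tuples. The function names `luminance_image` and `binarize_luminance` are hypothetical; the threshold of 128 is the example value given in the text.

```python
def luminance_image(scan_data):
    """S200: compute the luminance Y of every pixel with the stated weights."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in scan_data
    ]


def binarize_luminance(lum_data, threshold=128):
    """S210: pixels at or above the threshold become second character
    candidate pixels (1); darker pixels become 0."""
    return [[1 if y >= threshold else 0 for y in row] for row in lum_data]


# White text pixel next to a magenta background pixel.
scan = [[(255, 255, 255), (255, 0, 255)]]
second_binary = binarize_luminance(luminance_image(scan))
```

Here the white pixel becomes a candidate and the magenta pixel does not. A lighter background could itself reach the threshold, which is consistent with the earlier remark that candidate pixels may include background pixels.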

According to the second binary image data generation process described above, the second binary image data is generated using the luminance image data. For readability, the color of characters and the color of the background often differ considerably in luminance. Using the luminance image data therefore reduces the chance that character pixels are missed.

Furthermore, the second binary image data and the first binary image data can identify different character pixels. For example, as described above, the second binary image data can identify pixels constituting characters with higher luminance than the background as character candidate pixels, whereas the first binary image data can identify pixels constituting characters with lower luminance than the background as character candidate pixels. In addition, the first binary image data has difficulty identifying character pixels of characters that, for example, are located on a background in one of the primary colors C, M, Y, and K used for printing and that have another of those primary colors or the white of the paper, whereas the second binary image data can identify such character pixels. Using the first binary image data and the second binary image data together therefore further reduces the chance that character candidate pixels are missed.

A-6: Block determination process
The first block determination process of S22 and the second block determination process of S25 in FIG. 2 will now be described. These block determination processes are executed using machine learning models trained with a plurality of character image data showing characters and a plurality of non-character image data not showing characters. The first block determination process and the second block determination process use different machine learning models. The model used in the first block determination process is called the first machine learning model, and the model used in the second block determination process is called the second machine learning model. In this embodiment, the two block determination processes are identical except for the machine learning model used, and are therefore described with reference to a single flowchart.

FIG. 9 is a flowchart of the block determination process. FIG. 10 is an explanatory diagram of a plurality of blocks BL arranged on the scanned image SI. As described above, the block determination process generates block determination data by determining, for each of the plurality of blocks BL arranged in the scanned image SI, whether the block is a character block indicating characters.

In S400, the CPU 210 prepares canvas data for generating the block determination data in memory (specifically, in a buffer area of the volatile storage device 220). The canvas (initial image) indicated by the canvas data is an image of the same size as the scanned image SI, that is, an image with the same number of pixels. The value of each pixel of the canvas data is a predetermined initial value (for example, 0).

In S405, the CPU 210 sets an attention block in the scanned image SI. In this embodiment, the first attention block is the upper-left block BL(1) in FIG. 10. One block is a rectangular area containing N pixels (N is an integer of 2 or more). In FIG. 10, the squares drawn with broken lines and arranged in a matrix on the scanned image SI indicate sub-blocks SB. One sub-block SB is a rectangular area containing k pixels (k is an integer satisfying 1 ≦ k < N). In this embodiment, a sub-block SB is an area of M vertical pixels × M horizontal pixels (M is an integer of 1 or more), so k = M × M. In this embodiment, one block BL is an area containing L vertical × L horizontal sub-blocks SB (L is an integer of 2 or more). That is, each block BL of this embodiment is an area of (L × M) vertical pixels × (L × M) horizontal pixels. Since M = 10 and L = 5 in this embodiment, each block BL is an area of 50 vertical pixels × 50 horizontal pixels (N = 2500).
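The relationship between M, L, k, and N can be checked with a short calculation. The variable name block_side is introduced here only for illustration; the values of M and L are those stated for this embodiment.

```python
# Values stated in the embodiment.
M = 10  # side length of one sub-block SB, in pixels
L = 5   # number of sub-blocks SB along one side of a block BL

k = M * M            # pixels per sub-block SB
block_side = L * M   # side length of one block BL, in pixels
N = block_side ** 2  # pixels per block BL

print(k, block_side, N)  # 100 50 2500
```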

In S410, the CPU 210 uses the machine learning model to calculate the probability that the image in the attention block indicates characters (referred to as the character probability Txr).

The machine learning model is a model using a CNN (Convolutional Neural Network). For example, LeNet or AlexNet is used as such a machine learning model. LeNet is disclosed, for example, in "Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner (1998): Gradient-based learning applied to document recognition. Proceedings of the IEEE 86, 11 (November 1998), 2278-2324." AlexNet is disclosed, for example, in "Alex Krizhevsky, Ilya Sutskever and Geoffrey E. Hinton (2012): ImageNet classification with deep convolutional neural networks. In F. Pereira, C. J. C. Burges, L. Bottou, & K. Q. Weinberger, eds. Advances in Neural Information Processing Systems 25. Curran Associates, Inc., 1097-1105."

The input of the machine learning model is a matrix in which the values of the N pixels in the attention block (for example, RGB values or luminance values) are arranged in an order corresponding to the position of each pixel within the attention block. That is, the value of each of the N pixels in the attention block is input in association with the position of that pixel within the attention block. The output of the machine learning model is the character probability Txr described above.

For example, the character probability Txr is expressed as a numerical value from 0 to 100%. As shown in S415 to S430 described later, the CPU 210 determines, based on the character probability Txr, whether the attention block is a character block indicating characters, a non-character block not indicating characters, or an unknown block for which it is unclear whether characters are indicated. When TH1 ≦ Txr, the CPU 210 determines that the attention block is a character block. When TH2 ≦ Txr < TH1, the CPU 210 determines that the attention block is an unknown block. When Txr < TH2, the CPU 210 determines that the attention block is a non-character block. The threshold TH1 is, for example, 75%, and the threshold TH2 is, for example, 25%.
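The three-way decision based on the character probability Txr can be sketched as follows. The function name and the string labels are assumptions for illustration; the thresholds are the example values TH1 = 75% and TH2 = 25% from the description. The machine learning model itself is not shown, so Txr is simply passed in as a number.

```python
def classify_block(txr, th1=75.0, th2=25.0):
    """Classify a block from its character probability Txr (in percent)."""
    if txr >= th1:
        return "character"      # TH1 <= Txr
    if txr < th2:
        return "non-character"  # Txr < TH2
    return "unknown"            # TH2 <= Txr < TH1


for txr in (90.0, 75.0, 50.0, 25.0, 10.0):
    print(txr, classify_block(txr))
```

Both boundary values fall on the inclusive side stated in the text: a probability of exactly 75% yields a character block, and exactly 25% yields an unknown block.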

Thus, even for images with the same combination of N pixel values, a different character probability Txr is output if the positions of those values within the attention block differ, and the determination for the attention block is made based on that character probability Txr. It can therefore be seen that whether the attention block is a character block, a non-character block, or an unknown block is determined according to both the positions and the values of the N pixels in the attention block.

The machine learning model is trained using, for example, a predetermined number (for example, 3000) of character image data showing characters and a predetermined number (for example, 3000) of non-character image data not showing characters. These training character image data and non-character image data are images of the same size as a block BL containing N pixels. A non-character image is an image showing an object other than characters (for example, a photograph), a solid-color background, or the like.

Here, the first machine learning model used in the first block determination process and the second machine learning model used in the second block determination process differ in the sets of character image data and non-character image data used for training. The first machine learning model is trained using a plurality of first character image data showing characters with lower luminance than the background and a plurality of first non-character image data not showing characters with lower luminance than the background. The second machine learning model is trained using a plurality of second character image data showing characters with higher luminance than the background and a plurality of second non-character image data not showing characters with higher luminance than the background. Consequently, in the first block determination process, a block showing characters with lower luminance than the background is determined to be a character block, but a block showing characters with higher luminance than the background is not. In the second block determination process, a block showing characters with lower luminance than the background is not determined to be a character block, whereas a block showing characters with higher luminance than the background is. In FIG. 11 below, black characters on a white background are used to explain the operation of the process; this illustrates the first block determination process. The second block determination process operates in the same way for the same characters as in FIG. 11 drawn as white characters on a black background.

FIG. 11 is a diagram showing an example of the determination for each block BL. For example, when block BL(1) of FIG. 11(A) or block BL(2) of FIG. 11(B) is the attention block, characters occupy a relatively wide range within the attention block, so the attention block is determined to be a character block. When block BL(3) of FIG. 11(C) is the attention block, the attention block contains characters but the range they occupy is relatively narrow, so the attention block is determined to be an unknown block. When block BL(4) of FIG. 11(D) is the attention block, the attention block contains no characters, so the attention block is determined to be a non-character block. The processes of S415 to S430 are described in detail below.

In S415, the CPU 210 determines whether the character probability Txr calculated in S410 is equal to or greater than the threshold TH1. When the character probability Txr is equal to or greater than the threshold TH1 (S415: YES), the attention block is determined to be a character block. In this case, in S420, the CPU 210 sets the values of all pixels in the attention block to the value indicating a character. When the character probability Txr is less than the threshold TH1 (S415: NO), S420 is skipped.

FIG. 12 is a diagram showing an example of how pixel values are set in the block determination data. FIGS. 12(A) to 12(D) conceptually show the block determination image BI indicated by the block determination data. When block BL(1) of FIG. 11(A) or block BL(2) of FIG. 11(B) is the attention block, the attention block is determined to be a character block, so in the block determination image BI, as shown in FIGS. 12(A) and 12(B), the values of all pixels in blocks BL(1) and BL(2) are set to the value "1" indicating a character.

In S425, the CPU 210 determines whether the character probability Txr is less than the threshold TH2. When the character probability Txr is less than the threshold TH2 (S425: YES), the attention block is determined to be a non-character block. In this case, in S430, the CPU 210 sets the values of all pixels in the attention block to the value indicating a non-character. When the character probability Txr is equal to or greater than the threshold TH2 (S425: NO), S430 is skipped.

When block BL(4) of FIG. 11(D) is the attention block, the attention block is determined to be a non-character block, so in the block determination image BI, as shown in FIG. 12(D), the values of all pixels in block BL(4) are set to the value "2" indicating a non-character.

When the character probability Txr is equal to or greater than the threshold TH2 and less than the threshold TH1 (S415: NO and S425: NO), the attention block is determined to be an unknown block. In this case, the values of the pixels in the attention block are not changed. That is, at this point, a pixel having the value "1" indicating a character keeps that value, a pixel having the value "2" indicating a non-character keeps that value, and a pixel having the value "0" indicating unknown keeps that value.

When block BL(3) of FIG. 11(C) is the attention block, the attention block is determined to be an unknown block, so in the block determination image BI, as shown in FIG. 12(C), the values of all pixels in block BL(3) are maintained without change.
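The way S415 to S430 write into the canvas can be sketched as follows. The function name, the constant names, and the list-of-rows canvas are assumptions for illustration, and a toy block side of 2 pixels is used instead of the 50 pixels of the embodiment. The pixel values 0, 1, and 2 and the thresholds follow the description.

```python
UNKNOWN, CHARACTER, NON_CHARACTER = 0, 1, 2


def update_canvas(canvas, x0, y0, side, txr, th1=75.0, th2=25.0):
    """Apply one block decision to the canvas, in place.

    Character blocks write 1 to every pixel of the block, non-character
    blocks write 2, and unknown blocks leave all pixels as they are.
    """
    if txr >= th1:
        value = CHARACTER
    elif txr < th2:
        value = NON_CHARACTER
    else:
        return  # unknown block: keep whatever earlier blocks wrote
    for y in range(y0, y0 + side):
        for x in range(x0, x0 + side):
            canvas[y][x] = value


# Toy canvas of 2 rows x 6 columns, initialised to 0 (unknown).
canvas = [[UNKNOWN] * 6 for _ in range(2)]
update_canvas(canvas, 0, 0, 2, txr=90.0)  # character block
update_canvas(canvas, 2, 0, 2, txr=50.0)  # unknown block, no change
update_canvas(canvas, 1, 0, 2, txr=10.0)  # non-character block, overlaps the first
print(canvas[0])  # [1, 2, 2, 0, 0, 0]
```

Because blocks overlap, a pixel can be written several times, and in this sketch the most recent character or non-character decision covering the pixel determines its value.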

In S435, the CPU 210 moves the attention block to the right by M pixels. That is, the attention block is moved to the right by one sub-block SB. For example, when block BL(1) in FIG. 10 is the attention block, block BL(2) is set as the new attention block. When block BL(q−1) in FIG. 10 is the attention block, block BL(q) is set as the new attention block.

In S440, the CPU 210 determines whether, as a result of moving the attention block to the right by M pixels, the right end of the attention block has moved beyond the right end of the scanned image SI. That is, it determines whether the new attention block after the move protrudes from the right side of the scanned image SI. For example, when the new attention block is block BL(q) or block BL(e) in FIG. 10, it is determined that the right end of the attention block has moved beyond the right end of the scanned image SI.

When the right end of the attention block has not moved beyond the right end of the scanned image SI (S440: NO), the CPU 210 returns to S410. In this way, the determination for each block (S410 to S430) is performed sequentially while the attention block is shifted to the right by M pixels at a time. In the example of FIG. 10, it is determined in the order of blocks BL(1), BL(2), and BL(3) whether each block BL is a character block, a non-character block, or an unknown block.

When the right end of the attention block has moved beyond the right end of the scanned image SI (S440: YES), the CPU 210 moves the attention block to the left end of the scanned image SI in S445 and moves the attention block downward by M pixels in S450.

In S455, the CPU 210 determines whether, as a result of moving the attention block downward by M pixels, the lower end of the attention block has moved below the lower end of the scanned image SI. That is, it determines whether the new attention block after the move protrudes from the bottom of the scanned image SI. For example, when the new attention block after the move is block BL(e+1) in FIG. 10, it is determined that the lower end of the attention block has moved below the lower end of the scanned image SI.

When the lower end of the attention block has not moved below the lower end of the scanned image SI (S455: NO), the CPU 210 returns to S410. In this way, the determination for one row of blocks BL, from the left end to the right end, is performed row by row while the attention block is shifted downward by M pixels at a time. For example, after the rightmost block BL(q−1) in FIG. 10, the next attention block to be determined is the leftmost block BL(q+1) in the row M pixels below.

When the lower end of the attention block has moved below the lower end of the scanned image SI (S455: YES), the determination of all blocks BL has been completed, so the CPU 210 proceeds to S460.
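The scanning order of S435 to S455 can be sketched as a generator of block origins. This is a simplified illustration under stated assumptions: the function name is hypothetical, the image is assumed to be at least one block wide and one block tall, and the flowchart's "move, then check for protrusion" steps are folded into loop conditions.

```python
def block_origins(width, height, m=10, l=5):
    """Yield the upper-left corner (x, y) of each attention block.

    Blocks have a side of l * m pixels and are shifted by m pixels at a
    time, left to right and then top to bottom. A block that would
    protrude beyond the right or lower end of the image is not visited.
    """
    side = l * m
    y = 0
    while y + side <= height:      # corresponds to the check in S455
        x = 0
        while x + side <= width:   # corresponds to the check in S440
            yield (x, y)
            x += m                 # S435: move right by M pixels
        y += m                     # S445/S450: back to the left end, down by M pixels


# Hypothetical 70 x 60 pixel image with M = 10 and L = 5 (50 x 50 blocks).
origins = list(block_origins(70, 60))
print(origins)
```

With a step of M pixels and a block side of L × M pixels, adjacent blocks overlap by (L − 1) × M pixels, which is why one pixel can be covered by several block decisions.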

In S460, the CPU 210 determines whether the value "0" indicating unknown remains in the block determination data. When values indicating unknown remain, the CPU 210 sets them to the value "1" indicating a character in S465. As a result, the value of each pixel of the block determination data is either the value "1" indicating a character or the value "2" indicating a non-character.

In S470, the CPU 210 changes the value "2" indicating a non-character to "0", converting the block determination data into binary image data in which each pixel takes either "1" or "0". As a result, block determination data is generated in which each pixel has either the value "1" indicating a character, that is, indicating that the pixel belongs to a character block described above, or the value "0" indicating a non-character, that is, indicating that the pixel belongs to a non-character block described above.
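The two final steps S465 and S470 can be sketched as follows. The function name and the list-of-rows layout are assumptions for illustration; the value mapping follows the description.

```python
def finalize_block_data(canvas):
    """Turn a three-valued canvas (0 unknown, 1 character, 2 non-character)
    into binary block determination data (1 character, 0 non-character)."""
    # S465: remaining unknown pixels (0) are treated as character pixels (1).
    resolved = [[1 if v == 0 else v for v in row] for row in canvas]
    # S470: non-character pixels (2) become 0, giving a 1/0 image.
    return [[0 if v == 2 else v for v in row] for row in resolved]


block_data = finalize_block_data([[0, 1, 2],
                                  [2, 0, 1]])
print(block_data)  # [[1, 1, 0], [0, 1, 1]]
```

The order of the two steps matters: if 2 were mapped to 0 first, the non-character pixels would become indistinguishable from unknown pixels.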

According to the embodiment described above, a plurality of character candidate pixels are extracted using the scan data as the target image data (S21 and S24 in FIG. 2), and the scan data is used to determine, for each block, whether each of the plurality of blocks BL arranged on the scanned image SI is a character block indicating characters (S22 and S25 in FIG. 2). The determination for each block BL is executed using a machine learning model trained with a plurality of character image data and a plurality of non-character image data (S410 in FIG. 9). Then, using the determination result for each block, a plurality of character pixels are identified from among the plurality of character candidate pixels (S23, S26, and S27 in FIG. 2). As a result, the character pixels in the scanned image SI can be identified accurately.

FIG. 13 is a diagram illustrating the effect of the embodiment. FIGS. 13(A) to 13(D) conceptually show examples of the scanned image SI indicated by the scan data, the first binary image CI1 indicated by the first binary image data, the first block determination image BI1 indicated by the first block determination data, and the first character identification image TI1 indicated by the first character identification data, respectively. In FIG. 13, each square drawn with broken lines in the images SI, CI1, BI1, and TI1 indicates a pixel Px.

The scanned image SI may include, together with characters Tx, objects other than characters (for example, photographs) and colored backgrounds. In the example of FIG. 13(A), the scanned image SI may include halftone dots DT together with the characters Tx. This is because, as described above, the scan data is generated by reading printed matter. In such a case, it is relatively difficult to identify only the character pixels. For example, to avoid erroneously identifying the halftone dots DT, one conceivable method applies a smoothing process to the scan data and then applies an edge extraction process that extracts edge pixels from the smoothed scan data. In this method, the extracted edge pixels are identified as character pixels. With this method, if the halftone dots DT cannot be smoothed sufficiently, the edges of the halftone dots DT may be erroneously identified as character pixels. Conversely, if the image is smoothed excessively in order to smooth the halftone dots DT sufficiently, the edges of the characters become excessively blurred and the accuracy of identifying character pixels decreases. For this reason, as shown for example in the first binary image CI1 of FIG. 13(B), not only the pixels Cpt constituting the characters Tx but also the pixels Cpd constituting the halftone dots DT may be identified as first character candidate pixels in the first binary image data.

In contrast, in the block determination processes of S22 and S25 of this embodiment, a determination including whether each block BL is a character block is made for each block according to the positions and values of the N pixels in the block. Compared with a determination made pixel by pixel, the spatial resolution is coarser, but determination errors are relatively few. Furthermore, because this embodiment executes the determination for each block BL using a machine learning model, whether each block is a character block can be determined with sufficiently high accuracy by training the machine learning model sufficiently. Therefore, in the first block determination image BI1 of FIG. 13(C), for example, the area containing the characters Tx is identified as a character block TB, and the area containing the halftone dots DT is identified as a non-character block OB.

As a result, character pixels can be identified appropriately by generating the first character identification data as the logical product of the first binary image data and the first block determination data. For example, as shown in the character identification image TI1 of FIG. 13(D), the pixels constituting the characters Tx are identified as character pixels, and the pixels representing the halftone dots DT are not identified as character pixels. The same applies to the second binary image data, the second block determination data, and the second character identification data obtained as the logical product of these data.
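The logical product that yields the character identification data can be sketched per pixel as follows. The function name and the toy data are assumptions for illustration.

```python
def logical_and(binary_image, block_data):
    """Pixel-wise AND of two same-sized 0/1 images given as rows of ints."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(binary_image, block_data)]


# Toy row: the last candidate pixel (a halftone dot, say) lies in a
# non-character block and is therefore removed by the AND.
first_binary = [[1, 1, 0, 1]]      # character candidate pixels
first_block_data = [[1, 1, 0, 0]]  # 1 = pixel inside a character block
first_character_data = logical_and(first_binary, first_block_data)
print(first_character_data)  # [[1, 1, 0, 0]]
```

A pixel is kept as a character pixel only when it is both a character candidate pixel and located in a block determined to be a character block.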

Furthermore, according to this embodiment, in the first binary image data generation process and the second binary image data generation process, the CPU 210 extracts the plurality of character candidate pixels without executing, on the target image data, an edge adjustment process that adjusts the strength of edges in the image (FIGS. 5 and 8). As a result, the plurality of character candidate pixels can be extracted accurately. Edge adjustment processes include smoothing processes, which smooth the image and weaken the edges in the image, and edge enhancement processes, which strengthen the edges in the image. Smoothing processes include, for example, processing using a simple averaging filter and processing using a Gaussian filter. Edge enhancement processes include, for example, unsharp masking and processing using a Laplacian filter. Suppose, for example, that the scan data were smoothed to erase the halftone dots DT as described above, or that edge enhancement were additionally performed after smoothing to sharpen the blurred edges of the characters. This could reduce the problem of pixels constituting the halftone dots DT being erroneously identified as character pixels, but because the edges of the characters would also change, the thickness of the characters could change. As a result, character pixels might be identified so as to form characters thicker than those in the scanned image SI, or thinner than those in the scanned image SI, and the accuracy of identifying character pixels could decrease. For example, small characters might be identified in a crushed state, or characters might be identified with their lines broken. In this embodiment, pixels constituting objects other than characters, such as the halftone dots DT, and pixels constituting the background are excluded by taking the logical product of the first binary image data or the second binary image data and the block determination data. It is therefore not a problem if pixels constituting the halftone dots DT are erroneously identified as character candidate pixels in the first binary image data or the second binary image data. Consequently, the first binary image data generation process and the second binary image data generation process do not need to execute a smoothing process or an edge enhancement process on the target image data. Because the edges of the characters are not changed, the plurality of character candidate pixels can be extracted accurately.

Further, in the first binary image data generation process of this embodiment, the CPU 210 uses the scan data to generate minimum component data (S100 in FIG. 5). Each pixel value of the minimum component data is a value based on the minimum of the three component values of the corresponding RGB value in the scan data. The plurality of first character candidate pixels are then extracted using the minimum component data (S110 in FIG. 5). As a result, character candidate pixels that are difficult to extract when the target image data is binarized as is can be extracted. For example, as described above, even when yellow characters lie on a background of the paper's base color (white), character candidate pixels that include the pixels constituting those characters can be extracted.
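A minimal sketch of why minimum component data helps with yellow characters on white paper follows. It assumes that the per-pixel value is the minimum component itself (the text only says "a value based on" the minimum) and uses ideal white and yellow RGB values; the name `minimum_component_data` is hypothetical.

```python
def minimum_component_data(scan_rgb):
    """Return a 2D list holding min(R, G, B) for each pixel of a 2D list of RGB tuples."""
    return [[min(r, g, b) for (r, g, b) in row] for row in scan_rgb]

WHITE = (255, 255, 255)   # assumed base color of the paper
YELLOW = (255, 255, 0)    # assumed ideal yellow character color

# One row of scan data: white background, a yellow character pixel, white background.
scan = [[WHITE, YELLOW, WHITE]]
vmin = minimum_component_data(scan)
print(vmin)
```

Yellow and white differ only in the B component, so the minimum component separates them by the full 0 to 255 range, which makes the yellow pixel easy to pick up in a later binarization step.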

Further, in the second binary image data generation process of this embodiment, the CPU 210 uses the scan data to generate luminance image data (S200 in FIG. 8). Using the luminance image data, the CPU 210 extracts pixels whose luminance is higher than a reference as character candidate pixels (S210 and S220 in FIG. 8). As a result, relatively bright character candidate pixels, which are difficult to extract when the target image data is binarized as is, can be extracted. For example, as described above, character candidate pixels that include the pixels constituting characters brighter than the background can be extracted.
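The extraction of bright candidates can be sketched as below. This section does not give the luminance formula or the reference value, so the ITU-R BT.601 luma weights and the fixed reference of 128 are assumptions, as are the names `luminance` and `bright_candidates`. The sketch also collapses S210 and S220 into a single threshold comparison.

```python
def luminance(rgb):
    """Approximate luminance using BT.601 weights (an assumption, not from the source)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def bright_candidates(scan_rgb, reference):
    """Mark pixels whose luminance exceeds the reference as candidates (1), others as 0."""
    return [
        [1 if luminance(px) > reference else 0 for px in row]
        for row in scan_rgb
    ]

DARK_BACKGROUND = (30, 30, 120)   # illustrative dark blue background
LIGHT_TEXT = (250, 250, 250)      # illustrative near-white character pixel

scan = [[DARK_BACKGROUND, LIGHT_TEXT, DARK_BACKGROUND]]
second_binary = bright_candidates(scan, reference=128)
print(second_binary)
```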

Further, in this embodiment, the CPU 210 executes the first binary image data generation process to extract the plurality of first character candidate pixels (S21 in FIG. 2), and executes the second binary image data generation process to extract the plurality of second character candidate pixels (S24 in FIG. 2). Using the block-by-block determination results, the CPU 210 then identifies a plurality of first character pixels from among the plurality of first character candidate pixels (S23 in FIG. 2) and a plurality of second character pixels from among the plurality of second character candidate pixels, and finally identifies, as the character pixels, a plurality of pixels that include the plurality of first character pixels and the plurality of second character pixels (S27 in FIG. 2). As a result, character pixels are less likely to be missed.
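The final step, in which pixels from both results are treated as character pixels, can be sketched as a pixel-wise logical OR. The helper name `logical_or` and the toy data are assumptions for illustration.

```python
def logical_or(first, second):
    """Pixel-wise OR of two equally sized 2D lists of 0/1 values."""
    return [
        [a | b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]

# Toy identification results: dark-on-light characters found by the first path,
# light-on-dark characters found by the second path.
first_character_identification = [[1, 1, 0, 0]]
second_character_identification = [[0, 0, 0, 1]]

final_character_identification = logical_or(
    first_character_identification, second_character_identification
)
print(final_character_identification)
```

A pixel found by either path ends up in the final result, which is why running both paths reduces missed character pixels.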

Further, in the first block determination process of this embodiment (S22 in FIG. 2), the CPU 210 uses the first machine learning model to determine, block by block, whether each of the plurality of blocks BL is a character block (FIG. 9). In the second block determination process (S25 in FIG. 2), the CPU 210 uses a second machine learning model, different from the first machine learning model, to determine, block by block, whether each of the plurality of blocks BL is a character block (FIG. 9). The CPU 210 identifies the plurality of first character pixels using the determination results obtained with the first machine learning model (S23 in FIG. 2), and identifies the second character pixels using the determination results obtained with the second machine learning model (S26 in FIG. 2). Because determination results from mutually different machine learning models are used when identifying the first character pixels and when identifying the second character pixels, the first character pixels and the second character pixels can each be identified appropriately.

Further, in this embodiment, the first machine learning model used in the first block determination process (S22 in FIG. 2) is a machine learning model trained using a plurality of first character image data and a plurality of first non-character image data. The second machine learning model used in the second block determination process (S25 in FIG. 2) is a machine learning model trained using a plurality of second character image data different from the plurality of first character image data and a plurality of second non-character image data different from the plurality of first non-character image data. In other words, the first and second machine learning models are trained using mutually different images. As a result, the first character pixels (in this embodiment, pixels constituting characters with lower luminance than the background) and the second character pixels (in this embodiment, pixels constituting characters with higher luminance than the background) can each be identified more appropriately.

以上の説明から解るように、上記実施例の最小成分データは、第１画像データの例であり、輝度画像データは、第２画像データの例である。また、第１の二値画像データ生成処理は、第１の抽出処理の例であり、第２の二値画像データ生成処理は、第２の抽出処理の例である。 As can be seen from the above description, the minimum component data of the above embodiment is an example of the first image data, and the luminance image data is an example of the second image data. The first binary image data generation process is an example of the first extraction process, and the second binary image data generation process is an example of the second extraction process.

Ｂ．変形例： B. Modification example:

(1) In the above embodiment, the first block determination process (S22 in FIG. 2) and the second block determination process (S25 in FIG. 2) use mutually different machine learning models. Alternatively, the same machine learning model may be used in both block determination processes. For example, the two block determination processes may use a single machine learning model trained using the plurality of first character image data of the embodiment, which represent characters with lower luminance than the background, the second character image data, which represent characters with higher luminance than the background, and a plurality of non-character image data that represent neither kind of character.

(2) In the above embodiment, the first machine learning model used in the first block determination process and the second machine learning model used in the second block determination process are trained using mutually different character image data and non-character image data. Alternatively, the first and second machine learning models may be trained using the same character image data and non-character image data. In that case, the two models may differ in neural network structure, for example in the number of convolution layers or the number of pooling layers.

（３）上記実施例では、第１の二値画像データと、第２の二値画像データと、の両方を用いて、最終的な文字画素を特定している。これに代えて、第１の二値画像データと、第２の二値画像データと、の一方だけを用いて、文字画素を特定しても良い。例えば、第１の二値画像データのみを用いる場合には、図２のＳ２４〜Ｓ２７は、省略されても良い。 (3) In the above embodiment, the final character pixel is specified by using both the first binary image data and the second binary image data. Instead of this, the character pixel may be specified by using only one of the first binary image data and the second binary image data. For example, when only the first binary image data is used, S24 to S27 in FIG. 2 may be omitted.

（４）第２の二値画像データ生成処理（図８）では、輝度画像データが用いられる（Ｓ２００）。これに代えて、例えば、スキャンデータの対応する画素のＲＧＢ値に含まれる３個の成分値（Ｒ値、Ｇ値、Ｂ値）の平均値を、各画素の値とする平均成分値画像データが用いられても良い。 (4) Luminance image data is used in the second binary image data generation process (FIG. 8) (S200). Instead of this, for example, the average component value image data in which the average value of the three component values (R value, G value, B value) included in the RGB values of the corresponding pixels of the scan data is the value of each pixel. May be used.

（５）上記実施例の第１の二値画像データ生成処理（図５）では、最小成分データが用いられる（Ｓ１００）。これに代えて、最大成分データや反転最小成分データが用いられても良い。 (5) In the first binary image data generation process (FIG. 5) of the above embodiment, the minimum component data is used (S100). Instead of this, the maximum component data or the inverted minimum component data may be used.

最大成分データは、スキャンデータに含まれる複数個の画素に対応する複数個の値を含み、該複数個の値のそれぞれは、スキャンデータの対応する画素の最大成分値Ｖｍａｘである。最大成分値Ｖｍａｘは、スキャンデータの対応する画素のＲＧＢ値に含まれる複数個の成分値（Ｒ値、Ｇ値、Ｂ値）のうちの最大値である。 The maximum component data includes a plurality of values corresponding to a plurality of pixels included in the scan data, and each of the plurality of values is the maximum component value Vmax of the corresponding pixel of the scan data. The maximum component value Vmax is the maximum value among a plurality of component values (R value, G value, B value) included in the RGB values of the corresponding pixels of the scan data.

反転最小成分データは、以下のように、取得される。先ず、スキャンデータに含まれる複数個の画素の値（ＲＧＢ値）のそれぞれについて、複数個の成分値（Ｒ値、Ｇ値、Ｂ値）が反転された反転済みの色値が生成される。反転前のＲＧＢ値を（Ｒｉｎ、Ｇｉｎ、Ｂｉｎ）とすると、反転済みのＲＧＢ値（Ｒｏｕｔ、Ｇｏｕｔ、Ｂｏｕｔ）は、以下の式（１）〜（３）で表される。 The inversion minimum component data is acquired as follows. First, for each of the values (RGB values) of the plurality of pixels included in the scan data, the inverted color values in which the plurality of component values (R value, G value, B value) are inverted are generated. Assuming that the RGB values before inversion are (Rin, Gin, Bin), the inverted RGB values (Rout, Gout, Bout) are represented by the following equations (1) to (3).

Rout = Rmax - Rin ... (1)
Gout = Gmax - Gin ... (2)
Bout = Bmax - Bin ... (3)

ここで、Ｒｍａｘ、Ｇｍａｘ、Ｂｍａｘは、それぞれ、Ｒ値、Ｇ値、Ｂ値が取り得る値の最大値であり、本実施例では、Ｒｍａｘ＝Ｇｍａｘ＝Ｂｍａｘ＝２５５である。これらの反転済みのＲＧＢ値を複数個の画素の値とする画像データが、反転画像データとして生成される。そして、反転画像データを用いて、反転最小成分データが生成される。具体的には、反転画像データに含まれる複数個の反転済みのＲＧＢ値のそれぞれから、反転最小成分値ＶＲｍｉｎが取得される。反転最小成分値ＶＲｍｉｎは、該反転済みのＲＧＢ値に含まれる複数個の成分値（Ｒ値、Ｇ値、Ｂ値）のうちの最小値である。反転最小成分データは、これらの反転最小成分値ＶＲｍｉｎを、複数個の画素の値とする画像データである。 Here, Rmax, Gmax, and Bmax are the maximum values that the R value, G value, and B value can take, respectively, and in this embodiment, Rmax = Gmax = Bmax = 255. Image data in which these inverted RGB values are the values of a plurality of pixels is generated as inverted image data. Then, the inverted minimum component data is generated using the inverted image data. Specifically, the inverted minimum component value VRmin is acquired from each of the plurality of inverted RGB values included in the inverted image data. The inverted minimum component value VRmin is the minimum value among a plurality of component values (R value, G value, B value) included in the inverted RGB value. The inversion minimum component data is image data in which these inversion minimum component values VRmin are values of a plurality of pixels.

The inverted minimum component value VRmin is the inverted value of the maximum component value, and the relation VRmin = (255 - Vmax) holds. Both the maximum component data and the inverted minimum component data can therefore be said to be image data whose pixel values are based on the maximum of the plurality of component values included in the value of each pixel of the scan data (either the inverted value of the maximum or the maximum itself).

As shown in FIG. 6, the maximum component values Vmax of C, M, Y, K, and W are 255, 255, 255, 0, and 255, respectively, which are identical except for black (K). Therefore, in the maximum component data and the inverted minimum component data, differences in value are suppressed among pixels representing four of the five kinds of elements that make up a halftone region, namely the C, M, and Y dots and the base color (white) of the paper; the remaining element is the K dot. As a result, when the maximum component data or the inverted minimum component data is used, the pixels constituting halftone dots are kept from being identified as character candidate pixels, as in the case where the minimum component data is used.
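The maximum component value and the inverted minimum component value can be checked numerically with a short sketch. The RGB triples assigned to C, M, Y, K, and W below are assumed ideal colors rather than values read from FIG. 6, and the function names are hypothetical.

```python
MAX_VALUE = 255  # Rmax = Gmax = Bmax = 255 in the embodiment

def maximum_component(rgb):
    """Vmax: the largest of the R, G, and B values."""
    return max(rgb)

def inverted_minimum_component(rgb):
    """VRmin: invert each component per equations (1) to (3), then take the minimum."""
    inverted = tuple(MAX_VALUE - c for c in rgb)
    return min(inverted)

# Assumed ideal colors for the five elements of a halftone region.
ELEMENTS = {
    "C": (0, 255, 255),
    "M": (255, 0, 255),
    "Y": (255, 255, 0),
    "K": (0, 0, 0),
    "W": (255, 255, 255),
}

vmax = {name: maximum_component(rgb) for name, rgb in ELEMENTS.items()}
vrmin = {name: inverted_minimum_component(rgb) for name, rgb in ELEMENTS.items()}

# VRmin = 255 - Vmax holds for every color, not only the five above.
for rgb in list(ELEMENTS.values()) + [(12, 200, 77)]:
    assert inverted_minimum_component(rgb) == MAX_VALUE - maximum_component(rgb)

print(vmax)
print(vrmin)
```

With these assumed colors, C, M, Y, and W all map to the same value in either representation, so only the K dots remain distinguishable from the paper.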

（６）上記実施例の第１の二値画像データ生成処理（図５）では、最小成分データが用いられる（Ｓ１００）。これに代えて、例えば、輝度画像データが用いられても良い。すなわち、第１の二値画像データ生成処理では、背景よりも輝度が低い文字を構成する文字画素を特定するために、反転処理が行われない輝度画像データが用いられ、第２の二値画像データ生成処理では、背景よりも輝度が高い文字を構成する文字画素を特定するために、反転処理が行われた輝度画像データが用いられても良い。 (6) In the first binary image data generation process (FIG. 5) of the above embodiment, the minimum component data is used (S100). Instead of this, for example, luminance image data may be used. That is, in the first binary image data generation process, in order to identify the character pixels constituting the character having a brightness lower than that of the background, the brightness image data that is not inverted is used, and the second binary image is used. In the data generation processing, inversion-processed brightness image data may be used in order to identify character pixels constituting characters having a brightness higher than that of the background.

（７）上記各実施例では、文字画素に対して、文字鮮鋭化処理が実行され（図２のＳ４０）、非文字画素に対して、網点平滑化処理が実行される（図２のＳ３０）。これに代えて、文字画素に対しては、文字の見栄えを向上するためのアンチエイリアス処理が実行されても良い。また、非文字画素に対しては、例えば、印刷時の色材の使用量を減らすために、色を飛ばす処理（白に変換する処理）が実行されても良い。一般的には、文字画素と、非文字画素と、に互いに異なる画像処理が実行されることが好ましい。あるいは、文字画素と非文字画素のいずれか一方に対して、特定の画像処理が実行され、他方に対して、該特定の画像処理が実行されなくても良い。 (7) In each of the above embodiments, character sharpening processing is executed for character pixels (S40 in FIG. 2), and halftone dot smoothing processing is executed for non-character pixels (S30 in FIG. 2). ). Instead of this, antialiasing processing for improving the appearance of the character may be executed on the character pixel. Further, for non-character pixels, for example, in order to reduce the amount of color material used during printing, a process of skipping colors (a process of converting to white) may be executed. In general, it is preferable that character pixels and non-character pixels are subjected to different image processing. Alternatively, the specific image processing may be executed on either the character pixel or the non-character pixel, and the specific image processing may not be executed on the other.

(8) In the block determination process of FIG. 9 in the above embodiment, if values indicating unknown remain in the block determination data after all blocks BL have been judged (S460: YES), the CPU 210 sets those values to the value indicating character in S465. This keeps some character pixels from being erroneously identified as non-character pixels and avoids problems such as part of a character being blurred. If priority is instead given to keeping some non-character pixels from being erroneously identified as character pixels, and thus to avoiding problems such as conspicuous halftone dots, the CPU 210 may set the values indicating unknown to the value indicating non-character in S465.
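The two policies for leftover unknown values can be sketched as a single function with a switch. The numeric encoding (1 for character, 0 for non-character, 2 for unknown) and the name `resolve_unknown` are assumptions; the section does not state how the values are encoded.

```python
CHARACTER, NON_CHARACTER, UNKNOWN = 1, 0, 2  # assumed encoding

def resolve_unknown(block_determination, prefer_character=True):
    """Replace every remaining UNKNOWN value, as in S465.

    prefer_character=True follows the embodiment (unknown becomes character);
    prefer_character=False follows the modification (unknown becomes non-character).
    """
    fill = CHARACTER if prefer_character else NON_CHARACTER
    return [
        [fill if value == UNKNOWN else value for value in row]
        for row in block_determination
    ]

data = [[CHARACTER, UNKNOWN, NON_CHARACTER, UNKNOWN]]
as_in_embodiment = resolve_unknown(data)
as_in_modification = resolve_unknown(data, prefer_character=False)
print(as_in_embodiment, as_in_modification)
```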

(9) In the above embodiment, the final character identification data is generated using the first binary image data and the second binary image data (S21 to S27 in FIG. 2). Alternatively, the character identification data may be generated using the first binary image data, the second binary image data, and third binary image data. For example, between S26 and S27 in FIG. 2, the maximum component data described above is generated and then binarized to produce the third binary image data. Third character identification data is then generated by taking the logical AND of the third binary image data and the first block determination data generated in S22. In S27 of FIG. 2, the final character identification data may then be generated by taking the logical OR of the first character identification data generated using the first binary image data (S23 in FIG. 2), the second character identification data generated using the second binary image data (S26 in FIG. 2), and the third character identification data generated using the third binary image data. This further reduces missed character pixels.

(10) In the block determination process of the above embodiment, the block of interest, which measures (L × M) pixels vertically by (L × M) pixels horizontally, is shifted M pixels at a time while it is judged to be a character block or not, so the plurality of blocks arranged on the scanned image SI overlap one another (FIG. 10). Alternatively, the plurality of blocks may be arranged on the scanned image SI so that they do not overlap one another.
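The difference between overlapping and non-overlapping block layouts comes down to the step used when moving the block of interest. The sketch below enumerates block origins for both cases. The values L = 2 and M = 4, the image size, the rule that only blocks lying fully inside the image are kept, and the name `block_origins` are all illustrative assumptions.

```python
def block_origins(width, height, side, step):
    """Top-left corners of side x side blocks that fit inside the image, moved by step."""
    return [
        (x, y)
        for y in range(0, height - side + 1, step)
        for x in range(0, width - side + 1, step)
    ]

L, M = 2, 4          # illustrative values, not taken from the embodiment
side = L * M         # block of (L x M) pixels per side

# Embodiment: shift by M pixels, so neighboring blocks overlap.
overlapping = block_origins(width=24, height=8, side=side, step=M)
# Modification (10): shift by the full block side, so blocks do not overlap.
tiled = block_origins(width=24, height=8, side=side, step=side)

print(overlapping)
print(tiled)
```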

（１１）上記実施例のブロック判定処理では、ＣＰＵ２１０は、注目ブロックが、文字ブロック、非文字ブロック、不明ブロックのいずれであるかを判断している。これに代えて、ＣＰＵ２１０は、注目ブロックが、文字ブロックと非文字ブロックとのいずれであるかを判断しても良い。この場合には、例えば、Ｓ４１５で用いる閾値ＴＨ１と、Ｓ４２５で用いる閾値ＴＨ２と、を同じ値にすれば良い。例えば、ＴＨ１＝ＴＨ２＝５０％とすれば良い。 (11) In the block determination process of the above embodiment, the CPU 210 determines whether the block of interest is a character block, a non-character block, or an unknown block. Instead, the CPU 210 may determine whether the block of interest is a character block or a non-character block. In this case, for example, the threshold value TH1 used in S415 and the threshold value TH2 used in S425 may be set to the same value. For example, TH1 = TH2 = 50% may be set.
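How equal thresholds remove the unknown outcome can be sketched as follows. The comparison directions (at or above TH1 for character, at or below TH2 for non-character) and the example thresholds of 75% and 25% are assumptions, since S415 and S425 are not reproduced in this section; `classify_block` is a hypothetical name.

```python
def classify_block(txr, th1, th2):
    """Classify a block from its character probability txr (0.0 to 1.0).

    Assumed rule: txr >= th1 gives a character block, txr <= th2 gives a
    non-character block, and anything in between is unknown.
    """
    if txr >= th1:
        return "character"
    if txr <= th2:
        return "non-character"
    return "unknown"

# Three-way decision with distinct thresholds (illustrative values).
print(classify_block(0.70, th1=0.75, th2=0.25))

# Two-way decision with TH1 = TH2 = 50%, as in modification (11).
print(classify_block(0.70, th1=0.50, th2=0.50))
```

Under this assumed rule, setting TH1 equal to TH2 leaves no probability value between the two thresholds, so every block is judged either a character block or a non-character block.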

(12) In the above embodiment, as shown in S420 and S430 of FIG. 9, when the block of interest is determined to be a character block or a non-character block, the values of all pixels in the block of interest are set in the block determination data according to the determination result. Alternatively, of the N pixels in the block of interest, only the pixels that have the value indicating unknown may be set in the block determination data according to the determination result. That is, suppose that a first block and a second block that overlap each other are both determined to be blocks other than unknown blocks (that is, character blocks or non-character blocks). In this case, for the region where the first and second blocks overlap, the determination result of whichever block is judged earlier in the processing order may take priority.
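The effect of writing only unknown pixels can be seen in a small sketch that compares both write policies on two overlapping blocks. The value encoding, the 2 x 3 toy image, the block side of 2, and the names `write_block` and `run` are assumptions for illustration.

```python
CHARACTER, NON_CHARACTER, UNKNOWN = 1, 0, 2  # assumed encoding

def write_block(data, x0, y0, side, value, only_unknown):
    """Write a determination result into the side x side block at (x0, y0), in place."""
    for y in range(y0, y0 + side):
        for x in range(x0, x0 + side):
            if not only_unknown or data[y][x] == UNKNOWN:
                data[y][x] = value

def run(only_unknown):
    """Judge two overlapping 2 x 2 blocks on a 2 x 3 image, first block first."""
    data = [[UNKNOWN] * 3 for _ in range(2)]
    write_block(data, 0, 0, 2, CHARACTER, only_unknown)      # first block
    write_block(data, 1, 0, 2, NON_CHARACTER, only_unknown)  # second block, overlaps column 1
    return data

later_result_wins = run(only_unknown=False)    # all pixels of each block are written
earlier_result_wins = run(only_unknown=True)   # modification (12)
print(later_result_wins)
print(earlier_result_wins)
```

In the overlapping column, writing every pixel keeps the later block's result, while writing only unknown pixels keeps the earlier block's result.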

（１３）上記実施例では、対象画像データは、スキャンデータであるが、これに限られない。対象画像データは、２次元イメージセンサを備えるデジタルカメラによって印刷物を読み取ることによって生成されても良い。また、対象画像データは、描画作成や文書作成などのアプリケーションプログラムを用いて作成された画像データであっても良い。 (13) In the above embodiment, the target image data is scan data, but is not limited to this. The target image data may be generated by reading the printed matter with a digital camera equipped with a two-dimensional image sensor. Further, the target image data may be image data created by using an application program such as drawing creation or document creation.

（１４）図２の画像処理を実現する画像処理装置は、複合機２００に限らず、種々の装置であってよい。例えば、スキャナやデジタルカメラが、自身で生成された画像データを用いて、プリンタに供給するための印刷データを生成するために、図２の画像処理を実行しても良い。また、例えば、スキャナやプリンタと通信可能な接続される端末装置（例えば、端末装置１００）やサーバ（図示省略）が、スキャナから取得したスキャンデータを用いて、図２の画像処理を実行して、印刷データを生成し、該印刷データをプリンタに供給しても良い。また、ネットワークを介して互いに通信可能な複数個のコンピュータ（例えば、クラウドサーバ）が、画像処理に要する機能を一部ずつ分担して、全体として、画像処理を実行してもよい。この場合、複数個のコンピュータの全体が、画像処理装置の例である。 (14) The image processing device that realizes the image processing of FIG. 2 is not limited to the multifunction device 200, and may be various devices. For example, the scanner or the digital camera may execute the image processing of FIG. 2 in order to generate print data to be supplied to the printer by using the image data generated by the scanner or the digital camera. Further, for example, a connected terminal device (for example, terminal device 100) or a server (not shown) capable of communicating with the scanner or printer executes the image processing of FIG. 2 using the scan data acquired from the scanner. , Print data may be generated and the print data may be supplied to the printer. Further, a plurality of computers (for example, a cloud server) capable of communicating with each other via a network may partially share the functions required for image processing and execute image processing as a whole. In this case, the entire plurality of computers is an example of an image processing device.

（１５）上記各実施例において、ハードウェアによって実現されていた構成の一部をソフトウェアに置き換えるようにしてもよく、逆に、ソフトウェアによって実現されていた構成の一部あるいは全部をハードウェアに置き換えるようにしてもよい。例えば、図９のＳ４１０の機械学習モデルを用いて文字確率Ｔｘｒを算出する処理は、ＡＳＩＣなどの専用のハードウェアによって、実行されても良い。 (15) In each of the above embodiments, a part of the configuration realized by the hardware may be replaced with software, and conversely, a part or all of the configuration realized by the software may be replaced with the hardware. You may do so. For example, the process of calculating the character probability Txr using the machine learning model of S410 in FIG. 9 may be executed by dedicated hardware such as ASIC.

以上、実施例、変形例に基づき本発明について説明してきたが、上記した発明の実施の形態は、本発明の理解を容易にするためのものであり、本発明を限定するものではない。本発明は、その趣旨並びに特許請求の範囲を逸脱することなく、変更、改良され得ると共に、本発明にはその等価物が含まれる。 Although the present invention has been described above based on Examples and Modifications, the above-described embodiments of the invention are for facilitating the understanding of the present invention and do not limit the present invention. The present invention can be modified and improved without departing from the spirit and claims, and the present invention includes equivalents thereof.

100 ... terminal device, 200 ... multifunction device, 210 ... CPU, 220 ... volatile storage device, 230 ... non-volatile storage device, 240 ... display unit, 250 ... operation unit, 270 ... communication IF, 280 ... print execution unit, 290 ... reading execution unit, Vmin ... minimum component value, Vmax ... maximum component value, VRmin ... inverted minimum component value, D1 ... first direction, D2 ... second direction, SB ... sub-block, TB ... character block, OB ... non-character block, PG ... computer program, TI ... character identification image, GI ... smoothed image, FI ... processed image, YI ... luminance image, BI ... block determination image, SI ... scanned image, BL ... block, DT ... halftone dot, Px ... pixel, Tx ... character, Ob1 to Ob7 ... objects, MNI ... minimum component image, Bg1 to Bg3 ... backgrounds

Claims

An image processing apparatus comprising:
an image acquisition unit that acquires target image data representing a target image;
a candidate pixel extraction unit that uses the target image data to extract a plurality of character candidate pixels that are candidates for a plurality of character pixels, each of the plurality of character pixels being a pixel constituting a character in a region of the target image where the character is arranged;
a determination unit that uses the target image data to determine, block by block, whether each of a plurality of blocks arranged on the target image is a character block, the character block being a block corresponding to a region of the target image where a character is arranged, the block-by-block determination being executed using a machine learning model trained using a plurality of character image data representing characters and a plurality of non-character image data not representing characters, the machine learning model including a first machine learning model and a second machine learning model different from the first machine learning model; and
a character pixel identification unit that uses the determination results of the determination unit to identify the plurality of character pixels from among the plurality of character candidate pixels,
wherein the candidate pixel extraction unit:
executes a first extraction process to extract a plurality of first character candidate pixels among the plurality of character candidate pixels; and
executes a second extraction process different from the first extraction process to extract a plurality of second character candidate pixels among the plurality of character candidate pixels,
wherein the determination unit:
uses the first machine learning model to determine, block by block, whether each of the plurality of blocks is the character block; and
uses the second machine learning model to determine, block by block, whether each of the plurality of blocks is the character block, and
wherein the character pixel identification unit:
uses the determination results obtained with the first machine learning model to identify a plurality of first pixels from among the plurality of first character candidate pixels;
uses the determination results obtained with the second machine learning model to identify a plurality of second pixels from among the plurality of second character candidate pixels; and
identifies the plurality of character pixels, which include the plurality of first pixels and the plurality of second pixels.

The image processing apparatus according to claim 1,
wherein the candidate pixel extraction unit extracts the plurality of character candidate pixels without executing, on the target image data, edge adjustment processing that adjusts the strength of edges in the target image.

The image processing apparatus according to claim 1 or 2, wherein
the plurality of character image data include a plurality of first character image data and a plurality of second character image data different from the plurality of first character image data,
the plurality of non-character image data include a plurality of first non-character image data and a plurality of second non-character image data different from the plurality of first non-character image data,
the first machine learning model is the machine learning model trained using the plurality of first character image data and the plurality of first non-character image data, and
the second machine learning model is the machine learning model trained using the plurality of second character image data and the plurality of second non-character image data.

The image processing apparatus according to claim 3.
The target image data includes color values of a plurality of pixels.
Each color value includes a plurality of component values.
The first extraction process is
A process of using the target image data to generate first image data including a plurality of first values corresponding to the color values of the plurality of pixels, each of the plurality of first values being a value based on either the minimum value or the maximum value of the plurality of component values of the corresponding color value, and
A process of identifying the plurality of first pixels by binarizing the first image data, and
Including
The second extraction process is
A process of using the target image data to generate second image data including a plurality of second values indicating the brightness of the corresponding pixel among the plurality of pixels in the target image.
A process of specifying the plurality of second pixels by binarizing the second image data so that a pixel having a brightness higher than the reference is specified as the character candidate pixel.
Including
The first machine learning model is the machine learning model trained using the plurality of first character image data, which show a first character that is a character having a brightness lower than that of the background, and the plurality of first non-character image data, which do not show the first character.
An image processing apparatus in which the second machine learning model is the machine learning model trained using the plurality of second character image data, which show a second character that is a character having a brightness higher than that of the background, and the plurality of second non-character image data, which do not show the second character.
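
The two extraction processes of this claim can be sketched as follows. This is a minimal illustration under stated assumptions: the image is an H x W x 3 array of 8-bit RGB values, the threshold of 128 is arbitrary, brightness is computed with the Rec. 601 luma weights, and the first extraction treats values below the threshold as candidates (the claim does not state the direction; it is chosen here because the first model targets characters darker than the background). The function names are hypothetical.

```python
import numpy as np

def first_extraction(rgb: np.ndarray, threshold: int = 128, use_min: bool = False) -> np.ndarray:
    """Binarize an image built from the max (or min) of each pixel's components.

    Assumption: pixels whose value falls below `threshold` are the candidates.
    """
    component = rgb.min(axis=2) if use_min else rgb.max(axis=2)
    return component < threshold

def second_extraction(rgb: np.ndarray, threshold: float = 128.0) -> np.ndarray:
    """Binarize a brightness image so that brighter-than-reference pixels are candidates.

    Assumption: brightness is approximated with the Rec. 601 luma weights.
    """
    weights = np.array([0.299, 0.587, 0.114])
    brightness = rgb.astype(np.float64) @ weights
    return brightness > threshold
```

With the maximum component, a saturated red pixel (255, 0, 0) is treated like a light pixel and is not a candidate; with the minimum component it is treated like a dark pixel and becomes a candidate.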

The image processing apparatus according to claim 1 or 2.
The plurality of character image data includes a plurality of first character image data indicating a first character, which is a character having a brightness lower than that of the background, and a plurality of second character image data indicating a second character, which is a character having a brightness higher than that of the background.
The first machine learning model is the machine learning model trained using the plurality of first character image data.
An image processing apparatus in which the second machine learning model is the machine learning model trained using the plurality of second character image data.

The image processing apparatus according to any one of claims 1 to 5.
An image processing apparatus including an image processing unit that generates image-processed target image data by executing first image processing on the values of the identified plurality of character pixels in the target image data and executing second image processing, different from the first image processing, on the values of pixels different from the plurality of character pixels.
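
Applying different processing to character pixels and to the remaining pixels can be sketched as follows. This is only an illustration: the claim does not specify which processing is used, so this sketch assumes sharpening (an unsharp mask) as the first image processing and a 3 x 3 box blur as the second, on a grayscale 8-bit image. The names box_blur and process_by_mask are hypothetical.

```python
import numpy as np

def box_blur(gray: np.ndarray) -> np.ndarray:
    """3 x 3 box blur with edge replication; returns float64."""
    padded = np.pad(gray.astype(np.float64), 1, mode="edge")
    h, w = gray.shape
    acc = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def process_by_mask(gray: np.ndarray, char_mask: np.ndarray) -> np.ndarray:
    """Sharpen character pixels and smooth all other pixels.

    Assumption: sharpening and smoothing stand in for the first and second
    image processing, which the claim leaves unspecified.
    """
    blurred = box_blur(gray)
    # Unsharp mask: original plus (original minus blurred), clipped to 8 bits.
    sharpened = np.clip(2.0 * gray - blurred, 0, 255)
    out = np.where(char_mask, sharpened, blurred)
    return out.round().astype(np.uint8)
```

Selecting per pixel with np.where keeps the two kinds of processing independent, so either one can be replaced without touching the other.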

The image processing apparatus according to claim 6.
An image processing apparatus including a print data generation unit that generates print data using the target image data that has been image-processed.

A computer program,
An image acquisition function that acquires target image data indicating the target image, and
A candidate pixel extraction function of extracting, using the target image data, a plurality of character candidate pixels that are candidates for a plurality of character pixels, each of the plurality of character pixels being a pixel constituting a character in an area of the target image where the character is arranged, and
A determination function of determining, using the target image data, for each block whether or not each of a plurality of blocks arranged on the target image is a character block, the character block being a block corresponding to an area of the target image where a character is arranged, the determination for each block being executed using a machine learning model trained using a plurality of character image data indicating characters and a plurality of non-character image data not indicating characters, the machine learning model including a first machine learning model and a second machine learning model different from the first machine learning model, and
A character pixel identification function of identifying the plurality of character pixels from the plurality of character candidate pixels using the determination result by the determination function,
Causing a computer to realize the above functions,
The candidate pixel extraction function
The first extraction process is executed to extract a plurality of first character candidate pixels among the plurality of character candidate pixels.
A second extraction process different from the first extraction process is executed to extract a plurality of second character candidate pixels among the plurality of character candidate pixels.
The determination function
Using the first machine learning model, it is determined for each block whether or not each of the plurality of blocks is the character block.
Using the second machine learning model, it is determined for each block whether or not each of the plurality of blocks is the character block.
The character pixel identification function is
Using the determination result using the first machine learning model, a plurality of first pixels are identified from the plurality of first character candidate pixels.
Using the determination result using the second machine learning model, a plurality of second pixels are identified from the plurality of second character candidate pixels.
A computer program that identifies the plurality of first pixels and the plurality of second pixels as the plurality of character pixels.

An image processing method,
An image acquisition step of acquiring target image data indicating a target image, and
A candidate pixel extraction step of extracting, using the target image data, a plurality of character candidate pixels that are candidates for a plurality of character pixels, each of the plurality of character pixels being a pixel constituting a character in an area of the target image where the character is arranged, and
A determination step of determining, using the target image data, for each block whether or not each of a plurality of blocks arranged on the target image is a character block, the character block being a block corresponding to an area of the target image where a character is arranged, the determination for each block being executed using a machine learning model trained using a plurality of character image data indicating characters and a plurality of non-character image data not indicating characters, the machine learning model including a first machine learning model and a second machine learning model different from the first machine learning model, and
A character pixel specifying step of specifying the plurality of character pixels from the plurality of character candidate pixels by using the judgment result in the determination step.
The method comprising the above steps,
The candidate pixel extraction step is
The first extraction process is executed to extract a plurality of first character candidate pixels among the plurality of character candidate pixels.
A second extraction process different from the first extraction process is executed to extract a plurality of second character candidate pixels among the plurality of character candidate pixels.
The determination step is
Using the first machine learning model, it is determined for each block whether or not each of the plurality of blocks is the character block.
Using the second machine learning model, it is determined for each block whether or not each of the plurality of blocks is the character block.
The character pixel identification step is
Using the determination result using the first machine learning model, a plurality of first pixels are identified from the plurality of first character candidate pixels.
Using the determination result using the second machine learning model, a plurality of second pixels are identified from the plurality of second character candidate pixels.
An image processing method that identifies the plurality of first pixels and the plurality of second pixels as the plurality of character pixels.
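
The per-block determination step can be sketched as follows. This is a minimal illustration, assuming non-overlapping square blocks on a grayscale image. The trained machine learning model is replaced by an arbitrary callable, and toy_model is a hypothetical contrast heuristic used only so the sketch runs; it is not the trained model described in the claims.

```python
from typing import Callable

import numpy as np

def classify_blocks(gray: np.ndarray, block_size: int,
                    is_character_block: Callable[[np.ndarray], bool]) -> np.ndarray:
    """Return a boolean grid with one flag per block of the image.

    Assumption: blocks are non-overlapping squares; edge blocks may be smaller.
    """
    h, w = gray.shape
    rows = -(-h // block_size)  # ceiling division
    cols = -(-w // block_size)
    flags = np.zeros((rows, cols), dtype=bool)
    for r in range(rows):
        for c in range(cols):
            block = gray[r * block_size:(r + 1) * block_size,
                         c * block_size:(c + 1) * block_size]
            flags[r, c] = bool(is_character_block(block))
    return flags

def toy_model(block: np.ndarray) -> bool:
    """Hypothetical stand-in for a trained model: flags high-contrast blocks."""
    return float(block.max()) - float(block.min()) > 64
```

In the claimed method this determination is run twice, once with each of the two models, and each result is used only with the candidate pixels from the corresponding extraction process.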