JP5754147B2 - Image reading apparatus and image forming apparatus

Info

Publication number: JP5754147B2
Application number: JP2011018789A
Authority: JP
Inventors: 慎也佐原
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2011-01-31
Filing date: 2011-01-31
Publication date: 2015-07-29
Anticipated expiration: 2031-01-31
Also published as: JP2012160885A

Description

The present invention relates to an image reading apparatus and an image forming apparatus.

One function of an image reading apparatus that reads document images is blank page determination. Patent Document 1 below discloses a technique in which the edges of a document and the locations of punch holes are designated as an exclusion area, and blank page determination is performed on the area outside the exclusion area. Setting the exclusion area prevents the shadow of the document or of a punch hole from being mistaken for an image, which improves the accuracy of blank page determination. In Patent Document 1, whether a document is blank is determined by counting the occurrences of pixel densities in the area subject to blank determination and comparing the count with a threshold value.

Patent Document 1: JP 2003-198777 A

In some cases it is desirable to treat as blank not only a completely blank document (a document on which neither a standard image nor a body image is printed) but also a quasi-blank document, on which only so-called standard images such as a file name or page number are printed and no body image is printed. With the conventional image reading apparatus described above, however, a quasi-blank document may be erroneously determined to be non-blank unless the locations of the standard images are set as exclusion areas. Moreover, leaving the exclusion area to user configuration burdens the user.

The present invention has been completed in view of the above circumstances, and its object is to determine a quasi-blank document to be blank while reducing the burden on the user.

The image reading apparatus disclosed in this specification includes a reading unit that reads an image of a document, a detection unit that detects a common pattern formed at the same coordinates on multiple pages by comparing the document images of the multiple pages read by the reading unit, and a determination unit that determines whether a document is blank or non-blank after excluding the common pattern. In this configuration, blank page determination is performed with the common pattern excluded, so that, for example, a document on which only the common pattern is printed is determined to be blank. Because no exclusion range is set, unlike the prior art, blank page determination can be performed on the entire surface of the document, which improves its accuracy. Because no exclusion range needs to be set, the burden on the user is also reduced.

The image reading apparatus may further include a character interpretation unit that interprets characters from the image of each page read by the reading unit and outputs them in association with their coordinates, a character string recognition unit that recognizes the character strings printed on each page from the interpreted characters and their coordinates, and a character string table generation unit that generates, for each document whose image has been read, a character string table associating the character strings with their coordinates. The detection unit may then detect, as the common pattern, common character strings whose coordinates and characters match, by comparing the character string tables of the documents.
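The comparison of character string tables described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `StringEntry` type, the `detect_common_strings` function, and all coordinate values other than those of "title" are hypothetical, and an exact match of text and coordinates is assumed.

```python
from typing import NamedTuple


class StringEntry(NamedTuple):
    """One row of a character string table (hypothetical layout)."""
    text: str
    x3: int  # left edge of the rectangle surrounding the string
    x4: int  # right edge
    y3: int  # top edge
    y4: int  # bottom edge


def detect_common_strings(tables):
    """Return the entries whose text and coordinates appear in every table."""
    if not tables:
        return set()
    common = set(tables[0])
    for table in tables[1:]:
        common &= set(table)
    return common


# Two pages that share a title and a file name at the same coordinates.
page1 = [
    StringEntry("title", 10, 34, 10, 20),
    StringEntry("abc", 10, 24, 40, 50),      # body text, page 1 only
    StringEntry("A.doc", 60, 84, 90, 100),
]
page2 = [
    StringEntry("title", 10, 34, 10, 20),
    StringEntry("A.doc", 60, 84, 90, 100),
]
common = detect_common_strings([page1, page2])
```

Here `common` holds only the "title" and "A.doc" entries, since "abc" is absent from the second page.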

In the image reading apparatus, the characters may include numerals, and the common character strings may include consecutive numbers whose coordinates match.
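As an illustration of consecutive numbers at matching coordinates (for example, page numbers), the following sketch checks whether the strings found at one fixed position on successive pages form a run increasing by one. The function name and the step of exactly one are assumptions made for this example.

```python
def is_consecutive_number_run(texts):
    """Return True if texts are decimal strings that increase by 1 per page.

    `texts` holds the string found at the same coordinates on each page,
    in page order.
    """
    if len(texts) < 2 or not all(t.isdecimal() for t in texts):
        return False
    values = [int(t) for t in texts]
    return all(b - a == 1 for a, b in zip(values, values[1:]))
```

Under this sketch, `["1", "2", "3"]` would be treated as part of the common pattern, while `["1", "3"]` or `["a", "b"]` would not.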

In the image reading apparatus, the detection unit may detect the common character strings while excluding a partial region of the document that includes its center.

In the image reading apparatus, the detection unit may detect the common character strings using only some of the character string tables generated by the character string table generation unit.

In the image reading apparatus, the detection unit may detect the common character strings using all of the character string tables generated by the character string table generation unit.

In the image reading apparatus, each time the character string table generation unit generates a character string table, the detection unit may update the common character strings by deleting any common character string that is not contained in the newly generated table.
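The incremental update described above can be sketched as a set intersection. The tuple layout `(text, x3, x4, y3, y4)` and the function name are assumptions for illustration, as are all coordinate values except those of "title".

```python
def update_common_strings(common, new_table):
    """Drop every common string that is missing from the newly generated table.

    Entries are hypothetical tuples of the form (text, x3, x4, y3, y4).
    """
    return common & set(new_table)


# Common strings found so far, then a new page that lacks the "1" entry.
common = {
    ("title", 10, 34, 10, 20),
    ("1", 50, 54, 90, 100),
}
new_table = [
    ("title", 10, 34, 10, 20),
    ("2", 50, 54, 90, 100),
]
updated = update_common_strings(common, new_table)
```

After the update, only the "title" entry remains, because "1" does not appear in the new page's table.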

In the image reading apparatus, the determination unit may determine that a document is non-blank when its character string table contains a character string other than the common character strings.

In the image reading apparatus, the determination unit may determine that a document is non-blank when its character string table contains a character string whose coordinates overlap those of a common character string but whose characters differ.
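The two determination rules above can be sketched as follows. The entry layout `(text, x3, x4, y3, y4)`, the function names, and the rectangle overlap test are assumptions for this illustration; the source does not specify how overlap is computed.

```python
def rects_overlap(a, b):
    """True if the rectangles of two (text, x3, x4, y3, y4) entries intersect."""
    _, ax3, ax4, ay3, ay4 = a
    _, bx3, bx4, by3, by4 = b
    return ax3 <= bx4 and bx3 <= ax4 and ay3 <= by4 and by3 <= ay4


def is_blank(table, common):
    """Rule 1: a page is blank only if every string is a common string."""
    return all(entry in common for entry in table)


def has_conflicting_string(table, common):
    """Rule 2: some string overlaps a common string but has different text."""
    return any(
        rects_overlap(entry, c) and entry[0] != c[0]
        for entry in table
        for c in common
    )


common = {("title", 10, 34, 10, 20)}
quasi_blank_page = [("title", 10, 34, 10, 20)]
body_page = [("title", 10, 34, 10, 20), ("abc", 10, 24, 40, 50)]
overwritten_page = [("notes", 10, 34, 10, 20)]
```

With these sample pages, `quasi_blank_page` is judged blank, `body_page` is non-blank under the first rule, and `overwritten_page` is non-blank under the second rule.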

According to the present invention, blank page determination of a document can be performed more accurately while reducing the burden on the user.

Perspective view of the multifunction machine in Embodiment 1
Cross-sectional view of the image reading unit
Block diagram showing the electrical configuration of the multifunction machine
Block diagram showing the electrical configuration of the image reading unit
Flowchart of the blank page removal sequence
Diagram showing the documents
Flowchart of subroutine 1
Diagram explaining the coordinates of a character
Diagram explaining the coordinates of a character string
Diagram showing character table 1
Diagram showing character table 2
Diagram showing character string table 1
Diagram showing character string table 2
Diagram showing character string table 3
Diagram showing character string table 4
Diagram showing character string table 5
Flowchart of subroutine 2
Diagram showing the common character string table
Flowchart of subroutine 3
Flowchart of subroutine 4
Flowchart of the blank page removal sequence in Embodiment 2
Flowchart of subroutine 5
Diagram showing the update of the common character string table

<Embodiment 1>
Embodiment 1 will be described with reference to FIGS. 1 to 14.

1. External configuration of the multifunction machine
FIG. 1 is a perspective view showing the appearance of a multifunction machine 1, which is an example of the image forming apparatus and image reading apparatus of the present invention, and FIG. 2 is a cross-sectional view of an image reading unit 5. Throughout this specification, the main scanning direction, which is the document reading direction, is the X direction, and the sub-scanning direction, which is the document feeding direction, is the Y direction.

As shown in FIG. 1, the multifunction machine 1 includes a box-shaped main body 2 and the image reading unit 5 disposed above the main body 2. The main body 2 houses a printing unit 3 (an example of the "printing unit" of the present invention). The printing unit 3 is, for example, a device that forms (prints) an image by an electrophotographic method on a recording medium such as paper, using toner, ink, or the like, based on image data read by the image reading unit 5.

The image reading unit 5 reads documents and includes a CIS 30, an ADF 40, and a document placement unit 50. The document placement unit 50 includes a frame 51, a first platen glass 52 and a second platen glass 53 each made of a transparent glass plate, and an intermediate frame 54 disposed between the glasses 52 and 53. A document cover 48 can pivot between a closed position covering the document placement unit 50 and an open position exposing it, and is connected to the rear side of the main body 2 of the multifunction machine 1 (the side on which the operation unit 11, the display unit 12, and the like are provided being the front side). The ADF 40 is provided on the document cover 48.

As shown in FIG. 2, the ADF 40 includes an ADF cover 41, a document tray 42, a conveyance path 43, various rollers such as a paper feed roller 44A, a pair of conveyance rollers 44B, and a pair of paper discharge rollers 44C, an ADF motor 86 that drives these rollers, a paper discharge tray 46, and a pressing member 47. The ADF 40 conveys the documents placed on the document tray 42 one sheet at a time by the paper feed roller 44A, passes each over the second platen glass 53, and discharges it onto the paper discharge tray 46. The pressing member 47 presses the document against the second platen glass 53 so that a document passing over the second platen glass 53 does not lift off it. The ADF 40 is also provided with a document sensor 49, such as a photosensor, for detecting a document set on the document tray 42.

The CIS 30 is provided below the document placement unit 50. The CIS 30 includes a linear image sensor 33 in which a plurality of light receiving elements are arranged in a straight line in the direction perpendicular to the plane of FIG. 2, a light source 31 composed of light emitting diodes of the three RGB colors or the like, a rod lens array 32 that focuses the light from the light source 31 reflected by the document onto the light receiving elements of the linear image sensor 33, a carriage 34 on which these are mounted, and a conveyance mechanism (not shown) that moves the carriage 34. The linear image sensor 33 detects the luminance and chromaticity of the reflected light focused on the light receiving elements and generates data based on the image of the document.

When reading a document set on the first platen glass 52, the image reading unit 5 reads the document one line at a time while the FB motor 84 moves the CIS 30 in the sub-scanning direction parallel to the first platen glass 52 (direction A in FIG. 2). When reading a document conveyed by the ADF 40, the image reading unit 5 moves the CIS 30 to a position directly below the second platen glass 53 by the FB motor 84, and the image sensor 33 reads, one line at a time, the document passing through a reading position P on the second platen glass.

In addition, the front side of the multifunction machine 1 is provided with an operation unit 11, which consists of various buttons and accepts operation commands from the user, and a display unit 12, which consists of a liquid crystal display that shows the state of the multifunction machine 1.

2. Electrical configuration
FIG. 3 is a block diagram showing the electrical configuration of the multifunction machine 1, and FIG. 4 is a block diagram showing the electrical configuration of the image reading unit. The multifunction machine 1 includes a control unit 70, the printing unit 3, the image reading unit 5, and a communication unit 7. The control unit 70 includes a CPU 71a, a ROM 71b, and a RAM 71c. The CPU 71a controls each part of the multifunction machine 1 by executing various programs stored in the ROM 71b. The ROM 71b stores the programs executed by the CPU 71a (for example, a program for executing the blank page removal sequence described later) and data used in executing the programs (for example, the OCR dictionary described later). The RAM 71c is used as the main storage device with which the CPU 71a executes various processes. The RAM 71c also stores the image data of documents read by the image reading unit 5.

FIG. 4 is a block diagram showing the electrical configuration of the image reading unit 5. The image reading unit 5 includes an ASIC 80, the FB motor 84, an FB motor drive circuit 85, the ADF motor 86, an ADF motor drive circuit 87, the CIS 30, a light source control circuit 88, an AFE 89, the document sensor 49, the operation unit 11, the display unit 12, and the like.

The FB motor drive circuit 85, the ADF motor drive circuit 87, the light source control circuit 88, the AFE 89, the operation unit 11, and the display unit 12 are connected to the ASIC 80. The ASIC 80 controls these under the control of the CPU 71a, and applies gamma correction, shading correction, and various other image processing to the output values (pixel values) output from the AFE 89 to generate image data having three RGB pixel values for each pixel. The generated image data is stored in the RAM 71c.

The AFE 89 (Analog Front End) is a circuit that converts the analog output values (voltages) output from the image sensor 33 into digital output values (pixel values).

3. Blank page removal sequence
The multifunction machine 1 of this embodiment has a function of removing "blank" documents from among the documents read by the image reading unit 5. The term "blank" here covers both quasi-blank documents, on which only so-called standard images such as a file name or page number are printed and no body image is printed, and completely blank documents (documents on which neither a standard image nor a body image is printed).

The blank page removal sequence that performs this removal is described in detail below with reference to FIG. 5. The sequence is executed when the multifunction machine 1 starts up, and immediately after starting it enters a standby state waiting for an input operation by the operator (S1). When documents are set on the document tray 42 and the start key of the operation unit 11 is pressed, the process proceeds to S2. The following description assumes that the five documents shown in FIG. 6 have been set on the document tray 42.

In S2, the CPU 71a of the control unit 70 starts driving the ADF 40. The first document set on the document tray 42 is thereby fed into the conveyance path 43 with its front face up. In S3, it is determined whether the leading edge of the document fed in S2 (here, the first document) has reached the reading position P. Until the leading edge of the document reaches the reading position P, the determination in S3 is NO and S3 is repeated.

When the leading edge of the fed document reaches the reading position P, the determination in S3 is YES and the process proceeds to S4. In S4, the document is read by the CIS 30. Specifically, the image sensor 33 reads, one line at a time, the document passing through the reading position P on the second platen glass 53. When all lines have been read, the document is discharged onto the paper discharge tray 46.

In S5, which follows S4, the CPU 71a determines whether the document read in S4 is the first document. Since the document read in S4 is the first one, the determination in S5 is YES and the process proceeds to S6. In S6, the CPU 71a saves the image data of the document read in S4 to the RAM 71c. The image data of the first document is thus stored in the RAM 71c.

In S7, the CPU 71a creates a character table and a character string table from the saved image data. The character table stores the characters (including numerals) and graphics printed on the document in association with their coordinates on the document. The character string table stores character strings, that is, characters running in sequence in the X direction (the main scanning direction of the document), in association with their coordinates on the document. The character table, the character string table, and the common character string table described later are all created in the working area of the RAM 71c. The process of creating the character table and the character string table is implemented as a subroutine, and in S7 subroutine 1 shown in FIG. 7 is called.

As shown in FIG. 7, subroutine 1 consists of nine steps, S41 to S49. In S41, the read document is searched for black pixels. In this example, as shown in FIG. 8a, the coordinate origin of the document is the left end of its upper edge, the main scanning direction (the left-right direction in FIG. 8a) is the X direction, and the sub-scanning direction (the up-down direction in FIG. 8a) is the Y direction. The search for black pixels proceeds in the X direction, starting from the coordinate origin at the left end of the upper edge of the document.

When a black pixel is detected, the determination in S42 is YES and the process proceeds to S43. In S43, the continuous region in which black pixels are connected is found and its image data is cut out. As a specific example, on the first document shown in FIG. 6 the characters "title" are printed at the upper left of the document. In the first execution of S43, therefore, the continuous region U surrounding the first character "t" of "title" is detected, and the image data of the continuous region U is cut out.

In S44, the CPU 71a interprets the image data cut out in S43 as a character. Specifically, it analyzes the shape in the cut-out image data, compares it with the shapes of the characters registered in the OCR dictionary, and finds the character in the OCR dictionary whose shape is closest. Here, the CPU 71a interprets the image data cut out in S43 as the character "t". The character interpretation in S44 is the technique known as optical character recognition (OCR). The "characters" referred to here include numerals in addition to hiragana, katakana, and Roman letters. The processing function of the "character interpretation unit" of the present invention is realized by the processing of S44 executed by the CPU 71a.

In the following S45, the CPU 71a determines whether the image data cut out in S43 could be interpreted as a character in S44. Here, the image data cut out in S43 was interpreted as a character, so the determination in S45 is YES and the process proceeds to S46.

In S46, information on the character interpreted in S44 is written (saved) to the character table. As shown in FIGS. 9a and 9b, the character table lists characters together with their coordinates. In this embodiment, the coordinates of a character are expressed using the coordinates (X1, X2, Y1, Y2) of the continuous region U used to cut out the image data, that is, the rectangular region surrounding the character (see FIG. 8a).

Accordingly, "t" is written in the character column of the character table shown in FIG. 9a, and X1 = 10, X2 = 14, Y1 = 10, and Y2 = 20 are written in the coordinate columns (see also FIG. 8a).

If the image data cut out in S43 could not be interpreted as a character in S44, the determination in S45 is NO and the process proceeds to S47. In S47, the image data that could not be interpreted is regarded as a graphic, and the graphic is written to the character table together with its coordinates. As with characters, the coordinates written for a graphic are those of the continuous region U surrounding it.
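The character table written in S46 and S47 can be pictured with the following sketch. The `CharEntry` type and the `add_to_table` helper are hypothetical, and representing an uninterpretable region by `char=None` is an assumption; only the coordinates of "t" come from the description, and the graphic's coordinates are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CharEntry:
    """One row of the character table: a character or graphic plus region U."""
    char: Optional[str]  # None means OCR failed and the region is a graphic
    x1: int
    x2: int
    y1: int
    y2: int

    @property
    def is_graphic(self) -> bool:
        return self.char is None


def add_to_table(table, recognized, region):
    """Append the OCR result (S46) or an unreadable graphic (S47) to the table."""
    x1, x2, y1, y2 = region
    table.append(CharEntry(recognized, x1, x2, y1, y2))


table = []
add_to_table(table, "t", (10, 14, 10, 20))    # first character of "title"
add_to_table(table, None, (40, 60, 30, 50))   # hypothetical unreadable graphic
```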

When the processing of S46 or S47 is complete, S48 is performed. In S48, the CPU 71a determines whether all pixels of the document have been searched for black pixels. At this stage only the upper left of the document has been searched, so the determination in S48 is NO. The process then returns to S41, and the CPU 71a resumes the search for black pixels, excluding the regions already searched.
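The loop of S41 to S48 (find a black pixel, extract its continuous region, then resume the search while skipping what has already been covered) can be sketched as below. This is a simplified illustration under stated assumptions: the image is a list of rows of 0/1 values, regions are found by 8-connected flood fill, and the function names are hypothetical. A real implementation would also need to handle characters made of separate strokes, such as the dot of "i", which this sketch does not.

```python
from collections import deque


def find_first_black(image, visited):
    """Scan row by row in the X direction for an unvisited black pixel (S41)."""
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if value and (x, y) not in visited:
                return x, y
    return None


def extract_region(image, start, visited):
    """Flood-fill from start and return the bounding box (x1, x2, y1, y2) (S43)."""
    height, width = len(image), len(image[0])
    queue = deque([start])
    visited.add(start)
    x1 = x2 = start[0]
    y1 = y2 = start[1]
    while queue:
        x, y = queue.popleft()
        x1, x2 = min(x1, x), max(x2, x)
        y1, y2 = min(y1, y), max(y2, y)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                nx, ny = x + dx, y + dy
                if (0 <= nx < width and 0 <= ny < height
                        and image[ny][nx] and (nx, ny) not in visited):
                    visited.add((nx, ny))
                    queue.append((nx, ny))
    return x1, x2, y1, y2


def extract_all_regions(image):
    """Repeat search and extraction until no black pixel is left (S48)."""
    visited = set()
    regions = []
    while (start := find_first_black(image, visited)) is not None:
        regions.append(extract_region(image, start, visited))
    return regions


image = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
regions = extract_all_regions(image)
```

For this small image, two regions are found: one spanning x 1 to 2 and y 1 to 2, and one at x 5 spanning y 1 to 2.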

When a new black pixel is detected, the continuous region U is detected in S43 and its image data is cut out. In S44, the cut-out image data is interpreted as a character. In S45, it is determined whether the cut-out image data could be interpreted, and if so, the character and its coordinates are written to the character table in S46. In this example, the second execution of S46 writes to the character table shown in FIG. 9a the second character "i" of the string "title" printed at the upper left of the document, together with the coordinates of the continuous region U surrounding "i".

By repeating this processing, image data is cut out and interpreted, and the results are written to the character table in order. The first document carries a total of 17 characters: "t", "i", "t", "l", "e", "a", "b", "c", "い", "い", "え", "1", "A", ".", "d", "o", and "c". Executing subroutine 1 therefore writes these 17 characters and their coordinates to the character table.

When the search for black pixels has been completed for all pixels of the document, the determination in S48 is YES and the process proceeds to S49. In S49, the CPU 71a creates the character string table by reorganizing the table so that characters running in sequence are treated as character strings.

More specifically, whether characters form a character string is judged by whether they satisfy the following two conditions.
(1) Their Y coordinates roughly match.
(2) Their X coordinates are contiguous.

For example, among the 17 characters in the character table of the first document, the five characters "t", "i", "t", "l", and "e" all have a Y1 coordinate of 10 and a Y2 coordinate of 20, so their Y coordinates match. Condition (1) is therefore satisfied.

The five characters "t", "i", "t", "l", and "e" also have contiguous X coordinates. Specifically, the first character "t" has an X2 coordinate of 14 and the second character "i" has an X1 coordinate of 15, so the X coordinates of the first "t" and the second "i" are contiguous. Likewise, the second character "i" has an X2 coordinate of 19 and the third character "t" has an X1 coordinate of 20, so the X coordinates of the second "i" and the third "t" are contiguous. The third character "t" and the fourth character "l", and the fourth character "l" and the fifth character "e", also have contiguous X coordinates.

Because the five characters "t", "i", "t", "l", and "e" satisfy both condition (1) and condition (2), they are recognized as a character string.

On the first document, the five characters numbered 1 to 5 ("t", "i", "t", "l", "e"), the three characters numbered 6 to 8 ("a", "b", "c"), the three characters numbered 9 to 11 ("い", "い", "え"), and the five characters numbered 13 to 17 ("A", ".", "d", "o", "c") each satisfy conditions (1) and (2) and are recognized as character strings.
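The grouping by conditions (1) and (2) can be sketched as follows. The tolerance `y_tol` for "roughly matching" Y coordinates and the maximum gap `max_gap` for "contiguous" X coordinates are assumed parameters, since the description gives no numeric thresholds. The function name and tuple layout are likewise hypothetical, as are the coordinates of "1" and the coordinates of the last two letters of "title", which are extrapolated from the spacing given for the first three.

```python
def build_string_table(chars, y_tol=2, max_gap=1):
    """Group characters into strings with surrounding rectangles.

    chars: list of (char, x1, x2, y1, y2) in reading order.
    Returns a list of (text, x3, x4, y3, y4).
    """
    strings = []
    current = None
    for ch, x1, x2, y1, y2 in chars:
        if current is not None:
            text, x3, x4, y3, y4 = current
            # Condition (1): Y coordinates roughly match.
            same_line = abs(y1 - y3) <= y_tol and abs(y2 - y4) <= y_tol
            # Condition (2): X coordinates are contiguous.
            adjacent = 0 <= x1 - x4 <= max_gap
            if same_line and adjacent:
                current = (text + ch, x3, x2, min(y3, y1), max(y4, y2))
                continue
            strings.append(current)
        current = (ch, x1, x2, y1, y2)
    if current is not None:
        strings.append(current)
    return strings


chars = [
    ("t", 10, 14, 10, 20),
    ("i", 15, 19, 10, 20),
    ("t", 20, 24, 10, 20),
    ("l", 25, 29, 10, 20),
    ("e", 30, 34, 10, 20),
    ("1", 50, 54, 90, 100),  # isolated character, hypothetical coordinates
]
string_table = build_string_table(chars)
```

The five letters merge into "title" with the rectangle X 10 to 34, Y 10 to 20, while the isolated "1" remains a string of its own.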

As shown in FIGS. 10a to 10e, the character string table lists character strings together with their coordinates. In this embodiment, the coordinates of a character string are expressed using the coordinates (X3, X4, Y3, Y4) of the rectangular region V surrounding the character string (see FIG. 8b).

As a result, the five character strings "title", "abc", "いいえ", "1", and "A.doc" and the coordinates of each are written to character string table 1 shown in FIG. 10a. In this embodiment a single character is also treated as a character string, so the character "1" is written to the character string table as a character string. The number appended to a table name, as in character table 1 or character string table 1, indicates which document the table belongs to. That is, character table 1 is the character table of the first document, and character string table 1 is the character string table of the first document. The processing functions of the "character string recognition unit" and the "character string table generation unit" of the present invention are realized by the processing of S49 executed by the CPU 71a.

そして、Ｓ４９にて、文字列テーブルが作成されるとサブルーチン１は終了し、処理は図５のメインフローに戻り、Ｓ２３の処理が行われる。Ｓ２３では、原稿センサ４９の出力に基づいて、原稿トレイ４２上に次の原稿があるか判定される。この段階では、１枚目の原稿しか原稿の読み取りを終了しておらず、原稿トレイ４２上には、残りの原稿が残されているため、Ｓ２３ではＹＥＳ判定される。 When the character string table is created in S49, the subroutine 1 ends, the process returns to the main flow of FIG. 5, and the process of S23 is performed. In S 23, it is determined whether there is a next document on the document tray 42 based on the output of the document sensor 49. At this stage, only the first document has been read, and the remaining document remains on the document tray 42, so a YES determination is made in S23.

そのため、白紙除去シーケンスはＳ２に戻り、ＡＤＦ４０により、原稿トレイ４２上にセットされた二枚目の原稿が、表面を上に向けた状態で搬送路４３に送りだされる（Ｓ３）。そして、原稿の先端が読取位置Ｐに達すると、ＣＩＳ３０による原稿の読み取りが行われる（Ｓ４）。 For this reason, the blank paper removal sequence returns to S2, and the second document set on the document tray 42 is sent to the transport path 43 by the ADF 40 with the front side facing up (S3). When the leading edge of the document reaches the reading position P, the document is read by the CIS 30 (S4).

Ｓ４に続くＳ５では、ＣＰＵ７１ａにより、Ｓ４にて読み取った原稿が１枚目の原稿か判定される。Ｓ４で読み取った原稿は２枚目であるため、Ｓ５ではＮＯ判定される。Ｓ５でＮＯ判定されると、次にＳ８に移行する。Ｓ８では、Ｓ４にて読み取った原稿が２枚目の原稿か判定される。 In S5 following S4, the CPU 71a determines whether the document read in S4 is the first document. Since the document read in S4 is the second document, NO is determined in S5. If NO is determined in S5, the process proceeds to S8. In S8, it is determined whether the document read in S4 is the second document.

ここでは、Ｓ４で２枚目の原稿を読み取っているので、Ｓ８ではＹＥＳ判定される。その後、処理はＳ９に移行して、Ｓ４にて読み取った原稿の画像データをＲＡＭ７１ｃに対して保存する処理がＣＰＵ７１ａにより実行される。これにて、２枚目の原稿の画像データがＲＡＭ７１ｃに対して保存（記憶）される。 Here, since the second original is read in S4, YES is determined in S8. Thereafter, the process proceeds to S9, and the CPU 71a executes a process for storing the image data of the document read in S4 in the RAM 71c. Thus, the image data of the second document is saved (stored) in the RAM 71c.

Ｓ９にて、読み取った画像データをＲＡＭ７１ｃに対して記憶すると、次にＳ１０の処理が実行される。Ｓ１０の処理は、Ｓ７と同じ処理であり、図７に示すサブルーチン１が読み出され、２枚目の原稿について、読み取った画像データから文字テーブルと文字列テーブルを作成する処理が行われる。これにより、図９ｂに示す文字テーブル２と、図１０ｂに示す文字列テーブル２が作成されることとなる。 When the read image data is stored in the RAM 71c in S9, the process of S10 is executed next. The process of S10 is the same process as S7. Subroutine 1 shown in FIG. 7 is read, and a process of creating a character table and a character string table from the read image data is performed for the second original. As a result, the character table 2 shown in FIG. 9b and the character string table 2 shown in FIG. 10b are created.

そして、サブルーチン１の終了後、図５のメインフローに戻り、Ｓ１１の処理が行われる。Ｓ１１では、１枚目原稿の文字列テーブル１と２枚目原稿の文字列テーブル２から、共通文字列テーブルを作成する処理がＣＰＵ７１ａにより実行される。この共通文字列テーブルを作成する処理はサブルーチン化されており、Ｓ１１では図１１に示すサブルーチン２が読み出される。 After subroutine 1 ends, the process returns to the main flow of FIG. 5 and S11 is performed. In S11, the CPU 71a creates a common character string table from character string table 1 of the first document and character string table 2 of the second document. This process is implemented as a subroutine, and in S11 subroutine 2 shown in FIG. 11 is called.

サブルーチン２は、図１１に示すようにＳ６１〜Ｓ７４の１４ステップから構成されている。このサブルーチン２は、１枚目原稿の文字列テーブル１の「行番号ｉの文字列」に対し、それに一致する文字列が２枚目原稿の文字列テーブル２に含まれているか検索する処理（Ｓ６１〜Ｓ６９）を、文字列テーブル１の「行番号ｉを更新」しながら繰り返し行うことで、１枚目原稿の文字列テーブル１と２枚目原稿の文字列テーブル２に共通する共通文字列を検出し、共通文字列テーブルを作成するものである。 Subroutine 2 includes 14 steps S61 to S74 as shown in FIG. This subroutine 2 searches for whether or not a character string matching the “character string of line number i” in the character string table 1 of the first original is included in the character string table 2 of the second original ( S61 to S69) are repeated while updating the line number i in the character string table 1, so that the common character string common to the character string table 1 of the first original and the character string table 2 of the second original is common. And a common character string table is created.

尚、この実施形態では、下記（３Ａ）の条件と（４）の条件を満たすか、（３Ｂ）の条件と（４）の条件を満たした場合に、文字列は共通であると判断する。 In this embodiment, the character strings are determined to be common when the following conditions (3A) and (4) are satisfied or when the conditions (3B) and (4) are satisfied.

（３Ａ）文字列を構成する文字が一致している(Ｓ６２）。
（３Ｂ）文字列が数字で、かつ文字列テーブル１の「ｉ」行の数字＋１の値が文字列テーブル２の行にある（Ｓ６３）。
（４）文字列の座標がほぼ一致している（Ｓ６４）。

(3A) The characters constituting the character string match (S62).
(3B) The character string is a number, and the value of the number +1 in the “i” row of the character string table 1 is in the row of the character string table 2 (S63).
(4) The coordinates of the character strings are almost the same (S64).
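The three conditions above can be sketched as a single predicate. This is a minimal illustration under stated assumptions, not the patented implementation: the `StringEntry` layout, the function names, and the tolerance `tol` (expressed in the same unspecified units as the coordinates) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringEntry:
    """Hypothetical table row: text plus the four rectangle coordinates."""
    text: str
    x_min: int
    x_max: int
    y_min: int
    y_max: int


def texts_match(a: StringEntry, b: StringEntry) -> bool:
    """Condition (3A): the characters are identical."""
    return a.text == b.text


def is_next_number(a: StringEntry, b: StringEntry) -> bool:
    """Condition (3B): both are numbers and b equals a + 1."""
    return (a.text.isdecimal() and b.text.isdecimal()
            and int(b.text) == int(a.text) + 1)


def coords_nearly_equal(a: StringEntry, b: StringEntry, tol: int) -> bool:
    """Condition (4): every coordinate differs by at most tol."""
    pa = (a.x_min, a.x_max, a.y_min, a.y_max)
    pb = (b.x_min, b.x_max, b.y_min, b.y_max)
    return all(abs(p - q) <= tol for p, q in zip(pa, pb))


def is_common(a: StringEntry, b: StringEntry, tol: int = 2) -> bool:
    """((3A) or (3B)) and (4)."""
    return ((texts_match(a, b) or is_next_number(a, b))
            and coords_nearly_equal(a, b, tol))
```

`str.isdecimal` is used rather than `str.isdigit` so that every string accepted as a number can also be parsed by `int`.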

さて、Ｓ６１では、１枚目原稿の文字列テーブル１について行番号「ｉ」が「１」に設定され、２枚目の文字列テーブルについて行番号「ｊ」が「１」に設定される。 In S61, the line number “i” is set to “1” for the character string table 1 of the first original, and the line number “j” is set to “1” for the second character string table.

次にＳ６２では、文字列テーブル１の「ｉ」行目の文字列と、文字列テーブル２の「ｊ」行目の文字列について、文字が一致しているかどうか判定される。ここでは、「ｉ」と「ｊ」はいずれも１であるため、文字列テーブル１と文字列テーブル２の１行目の文字列について、文字の一致が判定される。 Next, in S62, it is determined whether the character string in row “i” of character string table 1 matches the character string in row “j” of character string table 2. Since “i” and “j” are both 1 here, the first-row character strings of character string table 1 and character string table 2 are compared.

文字列テーブル１と文字列テーブル２の１行目の文字列は、いずれも「ｔｉｔｌｅ」であることから、Ｓ６２ではＹＥＳ判定される。その後、処理はＳ６４に移行する。尚、Ｓ６２でＮＯ判定された場合には、Ｓ６３に移行する。Ｓ６３では、文字列が数字で、かつ文字列テーブル１の「ｉ」行の数字＋１の値が文字列テーブル２の行にあるか判定（ＣＰＵ７１ａにより判定される）される。 Since the first-row character strings of character string table 1 and character string table 2 are both “title”, YES is determined in S62, and the process proceeds to S64. If NO is determined in S62, the process proceeds to S63. In S63, the CPU 71a determines whether the character string is a number and whether the value obtained by adding 1 to the number in row “i” of character string table 1 is in the row of character string table 2.

Ｓ６４では、２つの文字列が同じような座標にあるか判定される（ＣＰＵ７１ａにより判定される）。従って、ここでは、文字列テーブル１の１行目に書き込まれた文字列「ｔｉｔｌｅ」と、文字列テーブル２の１行目に書き込まれた文字列「ｔｉｔｌｅ」がほぼ同じような座標か判定される。 In S64, the CPU 71a determines whether the two character strings are at approximately the same coordinates. Here, therefore, it is determined whether the character string “title” written in the first row of character string table 1 and the character string “title” written in the first row of character string table 2 have approximately the same coordinates.

文字列テーブル１の１行目に書き込まれた文字列「ｔｉｔｌｅ」と、文字列テーブル２の１行目に書き込まれた文字列「ｔｉｔｌｅ」の座標は、Ｘ１＝１０、Ｘ２＝１４、Ｙ１＝１０、Ｙ２＝２０であり、４つの座標は全て一致している（図１０ａ、図１０ｂ参照）。そのため、Ｓ６４ではＹＥＳ判定され、次にＳ６５に移行（ただし、ＮＯ判定された場合には、Ｓ６８に移行する）する。 The coordinates of the character string “title” written in the first row of character string table 1 and of the character string “title” written in the first row of character string table 2 are both X1 = 10, X2 = 14, Y1 = 10, Y2 = 20, so all four coordinates match (see FIGS. 10a and 10b). YES is therefore determined in S64, and the process proceeds to S65 (if NO is determined, the process proceeds to S68).

尚、Ｓ６４では、２つの文字列が同じような座標にある場合であれば、ＹＥＳ判定するようになっており、２つの文字列の座標が完全に一致している場合に加えて、同じような座標にある場合（具体的には、２つの文字列の座標に数ｍｍ程度の相違がある場合）もＹＥＳ判定される。このように座標の一致判断に余裕を持たせることで、印刷ズレにより、文字列の座標に数ｍｍ程度のズレが発生したとしても、Ｓ６４でＮＯ判断されない。 In S64, YES is determined whenever the two character strings are at approximately the same coordinates. That is, YES is determined not only when the coordinates of the two character strings match exactly but also when they are close (specifically, when the coordinates differ by about a few millimeters). Allowing this margin in the coordinate comparison prevents a NO determination in S64 even if a printing misalignment shifts the coordinates of a character string by a few millimeters.

さて、Ｓ６５では、文字列は「数字」か、判定される。文字列「ｔｉｔｌｅ」は「数字」でないため、Ｓ６５ではＮＯ判定され、次にＳ６６に移行する。そして、Ｓ６６では、共通文字列テーブルに、共通文字列と座標を保存する処理が行われる（ＣＰＵ７１ａにより行われる）。 In S65, it is determined whether the character string is “numeric”. Since the character string “title” is not “number”, a NO determination is made in S65, and then the process proceeds to S66. In S66, a process of saving the common character string and coordinates in the common character string table is performed (performed by the CPU 71a).

共通文字列テーブルとは、共通文字列と、その共通文字列の座標を表としてまとめたもの（別の言い方をすると、共通文字列と、その座標を関連付けて記憶させたもの）である（図１２参照）。従って、ここでは、共通文字列テーブルの１行目に、文字列「ｔｉｔｌｅ」と、その座標が保存される（書き込まれる）。また、Ｓ６５でＹＥＳ判定された場合には、Ｓ６７に移行して共通文字列テーブルに、共通文字列として「数字」と座標が保存される（書き込まれる）。 The common character string table is a table in which the common character strings and the coordinates of the common character strings are summarized as a table (in other words, the common character strings and the coordinates are stored in association with each other) (see FIG. 12). Therefore, the character string “title” and its coordinates are stored (written) in the first line of the common character string table. If YES is determined in S65, the process proceeds to S67, and “numerals” and coordinates are stored (written) as a common character string in the common character string table.

Ｓ６６又はＳ６７の処理が終了すると、次にＳ６８の処理が行われる。Ｓ６８では、文字列テーブル２の行番号である「ｊ」をインクリメント（＋１加算）する処理が行われる。従って、ここでは、文字列テーブル２の行番号が「１」から「２」にインクリメントされる。 When the process of S66 or S67 is completed, the process of S68 is performed next. In S68, a process of incrementing (adding +1) “j” which is the line number of the character string table 2 is performed. Therefore, here, the line number of the character string table 2 is incremented from “1” to “2”.

次に、Ｓ６９では、文字列テーブル２の最大行数まで検索したか、ＣＰＵ７１ａにより判定される。この段階では、文字列テーブル２の１行目までしか検索されていないので、ＮＯ判定される。Ｓ６９でＮＯ判定されると、処理はＳ６２に戻る。 Next, in S69, the CPU 71a determines whether the maximum number of lines in the character string table 2 has been searched. At this stage, since only the first line of the character string table 2 has been searched, NO is determined. If a NO determination is made in S69, the process returns to S62.

その後、文字列テーブル１の「１」行目の文字列と、文字列テーブル２の「２」行目の文字列を対象にＳ６２〜Ｓ６７の処理が行われる。文字列テーブル１の「１」行目の文字列と、文字列テーブル２の「２」行目の文字列は、文字が不一致であり、また、数字でもないので、Ｓ６２、Ｓ６３でいずれもＮＯ判定され、処理はＳ６８に移行する。 Thereafter, the processes of S62 to S67 are performed on the character string on the “1” line in the character string table 1 and the character string on the “2” line in the character string table 2. Since the character string in the “1” line of the character string table 1 and the character string in the “2” line of the character string table 2 do not match and are not numbers, both NO in S62 and S63. As a result, the process proceeds to S68.

そして、Ｓ６８では文字列テーブル２の行番号が「２」から「３」にインクリメントされ、続く、Ｓ６９で文字列テーブル２の最大行数まで検索したかどうかが判定される。この段階では、文字列テーブル２の２行目までしか検索されていないので、ＮＯ判定される。Ｓ６９でＮＯ判定されると、処理はＳ６２に戻る。 In S68, the line number of the character string table 2 is incremented from “2” to “3”, and it is determined in S69 whether or not the maximum number of lines in the character string table 2 has been searched. At this stage, since only the second line of the character string table 2 has been searched, NO is determined. If a NO determination is made in S69, the process returns to S62.

その後、文字列テーブル１の「１」行目の文字列と、文字列テーブル２の「３」行目の文字列を対象にＳ６２〜Ｓ６７の処理が行われる。このような処理が繰り返し行われ、文字列テーブル２の最大行数（ここでは、３行目）まで検索が完了すると、Ｓ６９でＹＥＳ判定される。 Thereafter, the processes of S62 to S67 are performed on the character string in row “1” of character string table 1 and the character string in row “3” of character string table 2. This processing is repeated, and when the search has been completed up to the last row of character string table 2 (here, the third row), YES is determined in S69.

そして、Ｓ６９でＹＥＳ判定された場合には、Ｓ７０にて、文字列テーブル２の行番号である「ｊ」を１に設定する処理が行われる。また、Ｓ７１で、文字列テーブル１の行番号である「ｉ」をインクリメント（＋１加算）する処理が行われる。従って、ここでは、文字列テーブル１の行番号が「１」から「２」にインクリメントされる。 If YES is determined in S69, the row number “j” of character string table 2 is reset to 1 in S70. Then, in S71, the row number “i” of character string table 1 is incremented (+1). Here, therefore, the row number of character string table 1 is incremented from “1” to “2”.

その後、Ｓ７２では、文字列テーブル１の「ｉ」行の文字列が、原稿の中央領域ＣＴに含まれているか判定する処理が行われる（ＣＰＵ７１ａにより行われる）。この実施形態では、原稿の中央領域ＣＴを図６に示す一点鎖線で示す範囲（具体的には、定型画像が印字される原稿の上端から所定範囲（ヘッダ）と、定型画像が印字される原稿の下端から所定範囲（フッタ）を除外した範囲）に設定してあり、「ｉ」行の文字列が中央領域ＣＴに含まれていれば、ＹＥＳ判定される。一方、中央領域に含まれていなければ、ＮＯ判定される。尚、原稿の中央領域ＣＴが、本発明の「原稿のうち中央を含む一部の領域」に対応している。また、中央領域ＣＴは座標で設定するとよい。そのようにすれば、文字列テーブル１の「ｉ」行のＹ座標（文字列のＹ座標）と、中央領域ＣＴのＹ座標を比較することにより、文字列テーブル１の「ｉ」行の文字列が、原稿の中央領域ＣＴに含まれているか、簡単に判定できる。 Thereafter, in S72, the CPU 71a determines whether the character string in row “i” of character string table 1 is included in the central area CT of the document. In this embodiment, the central area CT is set to the range indicated by the dash-dot line in FIG. 6 (specifically, the range that excludes a predetermined range from the upper end of the document (the header) and a predetermined range from the lower end of the document (the footer), where standard images are printed). If the character string in row “i” is included in the central area CT, YES is determined; otherwise, NO is determined. The central area CT corresponds to the “partial area of the document including the center” of the present invention. The central area CT is preferably set by coordinates. In that case, whether the character string in row “i” of character string table 1 is included in the central area CT can be determined simply by comparing the Y coordinate of that character string with the Y coordinates of the central area CT.
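The Y-coordinate comparison suggested above can be sketched as follows. This is a minimal illustration that assumes Y grows downward from the top edge of the page, that the header and footer heights are given in the same units as the string coordinates, and that “included in the central area” means the string lies entirely inside it; the patent fixes none of these details, and the function names are hypothetical.

```python
from typing import Tuple


def central_area(page_height: int, header: int, footer: int) -> Tuple[int, int]:
    """Return (top, bottom) Y bounds of the central area CT.

    CT is the page minus a header band at the top and a footer band
    at the bottom.
    """
    return header, page_height - footer


def in_central_area(y_min: int, y_max: int, ct_top: int, ct_bottom: int) -> bool:
    """S72-style check: is the string's Y extent entirely inside CT?"""
    return ct_top <= y_min and y_max <= ct_bottom
```

Treating a string that merely overlaps CT as “included” would be an equally plausible reading; only the comparison inside `in_central_area` would change.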

文字列テーブル１の２行目の文字列「ａｂｃ」は、図６に示すように中央領域ＣＴから外れている。そのため、Ｓ７２ではＮＯ判定される。Ｓ７２でＮＯ判定されると、Ｓ７４に移行する。 As shown in FIG. 6, the character string “abc” in the second row of character string table 1 lies outside the central area CT. NO is therefore determined in S72, and the process proceeds to S74.

そして、Ｓ７４では、文字列テーブル１の最大行数まで検索したか判定する処理が行われる。この段階では、文字列テーブル１の１行目までしか検索されていないので、ＮＯ判定される。Ｓ７４でＮＯ判定されると、処理はＳ６２に戻る。 Then, in S74, a process for determining whether or not the maximum number of lines in the character string table 1 has been searched is performed. At this stage, since only the first line of the character string table 1 has been searched, NO is determined. If a NO determination is made in S74, the process returns to S62.

その後、文字列テーブル１の「２」行目の文字列と、文字列テーブル２の「１」行目〜「３」行目の各文字列を対象にＳ６２〜Ｓ６９の処理が行われる。これにて、文字列テーブル１の「２」行目の文字列に対して、それに共通する文字列（すなわち、（３Ａ）と（４）の条件か、（３Ｂ）と（４）の条件を満たす文字列）が、文字列テーブル２の「１」行目〜「３」行目に含まれているか、検索される。ただし、文字列テーブル１の「ｉ」行の文字列が、原稿の中央領域に含まれている場合は除く。 Thereafter, the processes of S62 to S69 are performed on the character string on the “2” line of the character string table 1 and the character strings on the “1” line to the “3” line of the character string table 2. As a result, the character string common to the character string on the “2” line of the character string table 1 (that is, the conditions of (3A) and (4) or the conditions of (3B) and (4) are set. It is searched whether the character string to be satisfied is included in the “1” line to the “3” line of the character string table 2. However, the case where the character string of the “i” line of the character string table 1 is included in the central area of the document is excluded.

そして、（３Ａ）の条件と（４）の条件を満たす場合には、その文字列は座標と共に、共通文字列テーブルに保存される（Ｓ６６）。また、（３Ｂ）の条件と（４）の条件を満たす場合には、共通文字列として「数字」と座標が、共通文字列テーブルに保存される（Ｓ６６、Ｓ６７）。 If conditions (3A) and (4) are satisfied, the character string is stored in the common character string table together with its coordinates (S66). If conditions (3B) and (4) are satisfied, “number” and the coordinates are stored in the common character string table as a common character string (S66, S67).

このような処理が、文字列テーブル１の「行番号ｉを更新」しながら繰り返し行なわれる。そして、文字列テーブル１の５行目について、それに共通する文字列が、文字列テーブル２の「１」行目〜「３」行目に含まれているか検索し終わると、Ｓ７４にてＹＥＳ判定され、サブルーチン２は終了する。 Such processing is repeatedly performed while “updating line number i” in the character string table 1. Then, when the fifth row of the character string table 1 is searched for whether the common character string is included in the “1” to “3” rows of the character string table 2, a YES determination is made in S74. Then, subroutine 2 ends.

この実施形態では、１枚目の原稿と２枚目の原稿には「ｔｉｔｌｅ」、「数字（１、２、３、・・）」、「Ａ．ｄｏｃ」の３つの共通する文字列が含まれているので、共通文字列テーブルに対して、これら３つ文字列とその座標が保存されることになる（図１２参照）。 In this embodiment, the first document and the second document include three common character strings “title”, “number (1, 2, 3,...)”, And “A.doc”. Therefore, these three character strings and their coordinates are stored in the common character string table (see FIG. 12).

尚、上記した３つの共通文字列のうち、「数字（１、２、３・・）」は、文字列テーブル１の４行目の数字「１」と、文字列テーブル２の２行目の数字「２」を共通した文字列と判断したものである。すなわち、これら両文字列テーブルの２つの数字は座標が一致し、数が連続している。そのため、文字列テーブル１の４行目の数字「１」についてサブルーチン２にかけると、Ｓ６２でＮＯ判定された後、Ｓ６３でＹＥＳ判定される。これら数が連続していることから、文字列テーブル１の「ｉ」行の数字＋１の値が文字列テーブル２の行にあるかの条件を満たすからである。 Of the above three common character strings, “number (1, 2, 3,...)” Is the number “1” in the fourth line of the character string table 1 and the second line in the character string table 2. The number “2” is determined as a common character string. That is, the two numbers in both the character string tables have the same coordinates and the numbers are continuous. Therefore, when the number “1” on the fourth line of the character string table 1 is applied to the subroutine 2, a NO determination is made in S62, and a YES determination is made in S63. This is because these numbers are continuous, so that the condition that the value of the number +1 in the “i” row of the character string table 1 is in the row of the character string table 2 is satisfied.

その後、Ｓ６４、Ｓ６５でそれぞれＹＥＳ判定されることから、Ｓ６７にて、共通文字列テーブルに共通文字列として保存されることになる。このように、共通文字列に、座標が一致し連続する数字を含めるようにすれば、原稿にふられたページ数を共通文字列に含めることが可能となる。 Thereafter, YES is determined in both S64 and S65, so the entry is stored in the common character string table as a common character string in S67. By including consecutive numbers whose coordinates match among the common character strings in this way, the page numbers assigned to the documents can be treated as a common character string.

また、このサブルーチン２では、文字列テーブル１の「ｉ」行の文字列が、原稿の中央領域ＣＴに含まれている場合には、「ｉ」行を中央領域外の値に設定する（Ｓ７２、Ｓ７３）。そのため、たとえば、原稿の中央領域ＣＴに含まれる３行目の文字列「いいえ」は検索対象から除外され、２行目の文字列「ａｂｃ」について、それに共通する文字列が文字列テーブル２側に含まれているか検索すると、次は３行目の文字列「いいえ」を飛ばして、４行目の文字列「１」について、それに共通する文字列が文字列テーブル２側に含まれているか検索する。 In subroutine 2, when the character string in row “i” of character string table 1 is included in the central area CT of the document, row “i” is advanced to a row outside the central area (S72, S73). For example, the character string “いいえ” (“No”) in the third row, which lies in the central area CT, is excluded from the search. After character string table 2 has been searched for a match to the second-row character string “abc”, the third-row character string “いいえ” is skipped, and character string table 2 is next searched for a match to the fourth-row character string “1”.

このように原稿の中央領域ＣＴを除外して共通文字列を検出すれば、原稿の全領域を対象に共通文字列を検出する場合に比べて共通文字列を検出する処理を短縮できる。また、共通文字列は例えばページ番号、日付、ファイル名等であり、これらは通常、原稿端に印字されることが多い。そのため、原稿の中央領域ＣＴを予め除外しておけば、共通文字列を誤検出することがなくなる。従って、白紙判断を正確に行うことが可能となる。 Detecting common character strings with the central area CT excluded in this way shortens the detection process compared with detecting them over the entire document. Common character strings are, for example, page numbers, dates, and file names, which are usually printed near the edges of the document. Excluding the central area CT in advance therefore avoids erroneous detection of common character strings, which in turn allows the blank page determination to be made accurately.
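The search performed by subroutine 2 can be summarized as a nested loop over the two character string tables. The following Python sketch is an illustration under stated assumptions rather than the flowchart itself: the `StringEntry` layout, the `NUMBER` marker standing in for the stored “number” entry, and the tolerance `tol` are hypothetical. Unlike the flowchart, which checks the central area only after incrementing “i”, this sketch applies the check to every row of table 1.

```python
from dataclasses import dataclass
from typing import List

NUMBER = "<number>"  # hypothetical marker for a page-number entry


@dataclass(frozen=True)
class StringEntry:
    text: str
    x_min: int
    x_max: int
    y_min: int
    y_max: int


def coords_nearly_equal(a: StringEntry, b: StringEntry, tol: int) -> bool:
    pa = (a.x_min, a.x_max, a.y_min, a.y_max)
    pb = (b.x_min, b.x_max, b.y_min, b.y_max)
    return all(abs(p - q) <= tol for p, q in zip(pa, pb))


def build_common_table(table1: List[StringEntry], table2: List[StringEntry],
                       ct_top: int, ct_bottom: int,
                       tol: int = 2) -> List[StringEntry]:
    common = []
    for a in table1:                                   # row i (S71, S74)
        if ct_top <= a.y_min and a.y_max <= ct_bottom:
            continue                                   # S72/S73: skip central area
        for b in table2:                               # row j (S68, S69)
            same_text = a.text == b.text               # S62
            next_number = (a.text.isdecimal() and b.text.isdecimal()
                           and int(b.text) == int(a.text) + 1)  # S63
            if not (same_text or next_number):
                continue
            if not coords_nearly_equal(a, b, tol):     # S64
                continue
            # S65: numbers are stored generically (S67), other text as-is (S66)
            text = NUMBER if a.text.isdecimal() else a.text
            common.append(StringEntry(text, a.x_min, a.x_max, a.y_min, a.y_max))
    return common
```

With tables resembling those of the first and second documents, the result contains three entries: “title”, the number marker, and “A.doc”.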

尚、ＣＰＵ７１ａにより実行されるＳ１１の処理（サブルーチン２）により本発明の検出部の果たす機能が実現されている。また、サブルーチン２の実行により、１枚目の原稿の文字列テーブルと２枚目の原稿の文字列テーブルから共通文字列テーブルを作成することにより、本発明の「前記検出部は、前記文字列テーブル生成部にて生成された全文字列テーブルのうち一部の文字列テーブルだけを使用して、前記共通文字列を検出する」を実現させている。 The processing of S11 (subroutine 2) executed by the CPU 71a realizes the function of the detection unit of the present invention. In addition, because subroutine 2 creates the common character string table from the character string table of the first document and the character string table of the second document, it realizes the feature of the present invention that “the detection unit detects the common character string using only some of the character string tables among all the character string tables generated by the character string table generation unit”.

そして、サブルーチン２の終了後、図５のメインフローに戻り、Ｓ１２にて１枚目の原稿が、白紙か判断する処理がＣＰＵ７１ａにより実行される。この白紙判断処理はサブルーチン化されており、Ｓ１２では図１３に示すサブルーチン３が読み出される。 After subroutine 2 ends, the process returns to the main flow of FIG. 5, and in S12 the CPU 71a determines whether the first document is blank. This blank page determination is implemented as a subroutine, and in S12 subroutine 3 shown in FIG. 13 is called.

サブルーチン３は、Ｓ１１で作成した「共通文字列テーブル」と白紙判断対象となる原稿の「文字列テーブル」を比較することにより白紙判断を行うものであり、図１３に示すＳ８１〜Ｓ８９の９ステップから構成されている。尚、以下の説明において「共通範囲」とは原稿のうち共通文字列が印刷された範囲（４つの座標Ｘ１、Ｘ２、Ｙ１、Ｙ２で表される範囲）のことである。また、非共通範囲とは原稿のうち共通範囲を除くそれ以外の全範囲を意味する。 Subroutine 3 performs blank page determination by comparing the “common character string table” created in S11 with the “character string table” of the original to be blanked, and includes nine steps S81 to S89 shown in FIG. It is composed of In the following description, the “common range” is a range (a range represented by four coordinates X1, X2, Y1, and Y2) in which a common character string is printed in the document. Further, the non-common range means the entire range other than the common range in the original.

Ｓ８１では、白紙判断対象の原稿について、共通範囲が白紙かどうか判定される。共通範囲が白紙かどうかを判断するには、判断対象となる原稿の文字列テーブルに、共通範囲に対して座標が重なる文字列があるか検索すればよく、重なる文字列がなければ、共通範囲は白紙と判断される（ＹＥＳ）。 In S81, it is determined whether the common range of the document under blank page determination is blank. To make this determination, the character string table of the document is searched for a character string whose coordinates overlap the common range. If there is no overlapping character string, the common range is determined to be blank (YES).

Ｓ８１でＹＥＳ判定されると、処理はＳ８２に移行する。そして、Ｓ８２では、白紙判断対象の原稿について、非共通範囲が白紙かどうか判定される。非共通範囲が白紙か判断するには、判断対象となる原稿の文字列テーブルに、非共通範囲に含まれる文字列があるか検索すればよく、非共通範囲に含まれる文字列がなければ、非共通範囲は白紙と判断される（ＹＥＳ）。 If YES is determined in S81, the process proceeds to S82. In S82, it is determined whether or not the non-common range is blank for the blank page determination target document. To determine whether the non-common range is blank, it is only necessary to search the character string table of the document to be determined for a character string included in the non-common range. If there is no character string included in the non-common range, The non-common range is determined to be blank (YES).

そして、Ｓ８２でＹＥＳ判定された場合には、Ｓ８３に移行して、白紙フラグが立てられる(ＲＡＭ７１ｃに白紙フラグが記憶される）。一方、Ｓ８２でＮＯ判定された場合には、Ｓ８４に移行して、非白紙フラグが立てられる(ＲＡＭ７１ｃに非白紙フラグが記憶される）。 If YES is determined in S82, the process proceeds to S83 and the blank flag is set (the blank flag is stored in the RAM 71c). If NO is determined in S82, the process proceeds to S84 and the non-blank flag is set (the non-blank flag is stored in the RAM 71c).

また、Ｓ８１でＮＯ判定された場合には、Ｓ８５に移行する。Ｓ８５では、白紙判断の対象となる原稿について、共通範囲に印刷された文字列が、共通文字列テーブル側の共通文字列に対して文字が一致しているか判断される。これは、原稿側の文字列テーブルの文字列と、共通文字列テーブル側の共通文字列を比較することにより判断される。 If NO is determined in S81, the process proceeds to S85. In S85, it is determined whether or not the character string printed in the common range for the document that is the target of blank page determination matches the character string in the common character string table side. This is determined by comparing the character string in the character string table on the document side with the common character string on the common character string table side.

そして、Ｓ８５にてＹＥＳ判定された場合（一致する場合）には、次にＳ８６に移行する。そして、Ｓ８６では、非共通範囲が白紙かどうか、判定される。非共通範囲が白紙か判断するには、判断対象の原稿の文字列テーブルに、非共通範囲に含まれる文字列があるか、検索すればよい。そして、Ｓ８６でＹＥＳ判定された場合には、Ｓ８７に移行して、白紙フラグが立てられる。一方、Ｓ８６でＮＯ判定された場合には、Ｓ８８に移行して、非白紙フラグが立てられる。また、Ｓ８５でＮＯ判定された場合（不一致と判断された場合）も、Ｓ８９に移行して、非白紙フラグが立てられる。尚、サブルーチン３のうちＳ８１、Ｓ８５、Ｓ８９の処理により、本発明の「前記判断部は、前記文字列テーブルに、前記共通文字列と座標が重なり、文字が異なる文字列が含まれている場合には、非白紙と判断する」が実現されている。 If YES is determined in S85 (the characters match), the process proceeds to S86. In S86, it is determined whether the non-common range is blank. To make this determination, the character string table of the document is searched for a character string included in the non-common range. If YES is determined in S86, the process proceeds to S87 and the blank flag is set. If NO is determined in S86, the process proceeds to S88 and the non-blank flag is set. If NO is determined in S85 (the characters do not match), the process proceeds to S89 and the non-blank flag is likewise set. The processing of S81, S85, and S89 in subroutine 3 realizes the feature of the present invention that “the determination unit determines that the document is non-blank when the character string table contains a character string whose coordinates overlap the common character string but whose characters differ”.
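The decision made by subroutine 3 can be condensed into one function that returns whether the page should be flagged blank. The sketch below is illustrative only and rests on several assumptions not stated in the patent: a string counts as being in the common range if its rectangle overlaps any common rectangle, a stored number entry (the hypothetical `NUMBER` marker) agrees with any numeric string, and the `StringEntry` layout is invented for the example.

```python
from dataclasses import dataclass
from typing import List

NUMBER = "<number>"  # hypothetical marker for a page-number entry


@dataclass(frozen=True)
class StringEntry:
    text: str
    x_min: int
    x_max: int
    y_min: int
    y_max: int


def overlaps(a: StringEntry, b: StringEntry) -> bool:
    """True when the two rectangles share at least one point."""
    return (a.x_min <= b.x_max and b.x_min <= a.x_max
            and a.y_min <= b.y_max and b.y_min <= a.y_max)


def text_agrees(s: StringEntry, c: StringEntry) -> bool:
    """S85-style comparison of a page string with a common entry."""
    if c.text == NUMBER:
        return s.text.isdecimal()
    return s.text == c.text


def is_blank_page(page_table: List[StringEntry],
                  common_table: List[StringEntry]) -> bool:
    outside_count = 0
    for s in page_table:
        hits = [c for c in common_table if overlaps(s, c)]
        if not hits:
            outside_count += 1          # string lies in the non-common range
        elif not all(text_agrees(s, c) for c in hits):
            return False                # S85 NO -> S89: non-blank
    return outside_count == 0           # S82 / S86: non-common range empty?
```

A page that carries only the common strings is reported as blank, as is a page with no strings at all; any string outside the common range, or a different string inside it, makes the page non-blank.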

次に、上記のサブルーチン３により、１枚目の原稿の白紙判断が如何様になされるか説明する。１枚目の原稿の文字列テーブル１には、共通範囲に座標が重なる文字列（例えば、「ｔｉｔｌｅ」など）がある。従って、Ｓ８１ではＮＯ判定される。また、その文字列（例えば、「ｔｉｔｌｅ」）は、共通文字列テーブルに保存された共通文字列と一致している。そのため、Ｓ８５では、ＹＥＳ判定される。そして、１枚目の原稿の文字列テーブルには、「ａｂｃ」や「いいえ」の文字列（座標が共通範囲外の文字列）が含まれていて、非共通範囲は白紙ではない。そのため、Ｓ８６ではＮＯ判定される。 Next, how subroutine 3 determines whether the first document is blank will be described. Character string table 1 of the first document contains character strings whose coordinates overlap the common range (for example, “title”), so NO is determined in S81. Those character strings (for example, “title”) match the common character strings stored in the common character string table, so YES is determined in S85. Character string table 1 also contains the character strings “abc” and “いいえ” (“No”), whose coordinates lie outside the common range, so the non-common range is not blank. NO is therefore determined in S86.

このように、一枚目の原稿の白紙判断では、Ｓ８１にてＮＯ判定、Ｓ８５にてＹＥＳ判定、Ｓ８６にてＮＯ判定される。そして、Ｓ８８にて非白紙フラグが立てられる。Ｓ８８の処理が終わると、サブルーチン３は終了する。尚、ＣＰＵ７１ａに実行されるＳ１２（サブルーチン３）、Ｓ２１（サブルーチン３）の処理により、本発明の判断部の果たす機能が実現されている。 Thus, in the blank page determination of the first document, a NO determination is made in S81, a YES determination is made in S85, and a NO determination is made in S86. In step S88, a non-blank sheet flag is set. When the process of S88 ends, the subroutine 3 ends. The function performed by the determination unit of the present invention is realized by the processing of S12 (subroutine 3) and S21 (subroutine 3) executed by the CPU 71a.

サブルーチン３の終了後、図５のメインフローに戻り、Ｓ１３の処理が行われる。Ｓ１３では、Ｓ１２の判断結果に応じて、１枚目の原稿を白紙除去する処理がＣＰＵ７１ａにより実行される。この白紙除去処理はサブルーチン化されており、Ｓ１３では、図１４に示すサブルーチン４が読み出される。 After subroutine 3 ends, the process returns to the main flow of FIG. 5 and S13 is performed. In S13, the CPU 71a executes blank page removal for the first document according to the result of S12. This blank page removal is implemented as a subroutine, and in S13 subroutine 4 shown in FIG. 14 is called.

サブルーチン４は、Ｓ９１とＳ９２の２つのステップから構成されていて、Ｓ９１では、白紙フラグが立っているかどうか判定する処理が行われる。そして、白紙フラグが立っている場合には、その原稿を白紙除去（具体的には、ＲＡＭ７１ｃに保存したその原稿の画像データを削除する）。一方、白紙フラグが立っていない場合には、Ｓ９１ではＮＯ判定され、白紙除去する処理をしないまま処理は終了する。そして、１枚目の原稿は、非白紙フラグが立っており、白紙フラグは立っていない。そのため、白紙除去されることなく、サブルーチン４は終了することになる。 Subroutine 4 is composed of two steps S91 and S92. In S91, a process for determining whether or not a blank page flag is set is performed. If the blank sheet flag is set, the original is removed (specifically, the image data of the original stored in the RAM 71c is deleted). On the other hand, if the blank page flag is not set, NO is determined in S91, and the process ends without performing the blank page removal process. The first original has a non-blank flag and no blank flag. Therefore, the subroutine 4 ends without removing the blank paper.
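Subroutine 4 amounts to a conditional delete. In the minimal sketch below, the dictionary `stored_images` is a hypothetical stand-in for the image data held in the RAM 71c, keyed by page number, and the function name is invented for illustration.

```python
from typing import Dict


def remove_if_blank(stored_images: Dict[int, bytes], page_no: int,
                    blank_flag: bool) -> bool:
    """Delete the page's image data when its blank flag is set.

    Returns True if the page was removed.
    """
    if blank_flag:                          # S91: is the blank flag set?
        stored_images.pop(page_no, None)    # S92: delete the stored image data
        return True
    return False                            # S91 NO: leave the data in place
```

For the first document of the walkthrough, the non-blank flag is set, so the call returns False and the stored image is kept.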

サブルーチン４の終了後、図５のメインフローに戻り、Ｓ２１の処理にて、原稿（ここでは、２枚目の原稿）が白紙か判断する処理がＣＰＵ７１ａにより実行される。Ｓ２１の処理はＳ１２と同じ処理であり、Ｓ２１ではＳ１２の場合と同様に図１３に示すサブルーチン３が読み出される。 After subroutine 4 ends, the process returns to the main flow of FIG. 5, and in S21 the CPU 71a determines whether the document (here, the second document) is blank. The process of S21 is the same as that of S12, and subroutine 3 shown in FIG. 13 is called in S21 just as in S12.

サブルーチン３の説明は既に行ったので、ここでは、２枚目の原稿の白紙判断が如何様に行われるかを簡単に説明する。２枚目の原稿の文字列テーブル２には、共通範囲に座標が重なる文字列（例えば、「ｔｉｔｌｅ」など）がある。従って、Ｓ８１ではＮＯ判定される。また、その文字列（例えば、「ｔｉｔｌｅ」）は、共通文字列テーブルに保存された共通文字列と一致している。そのため、Ｓ８５では、ＹＥＳ判定される。そして、２枚目の原稿の文字列テーブル２には、共通文字列テーブルの共通文字列と同じ文字列しか含まれておらず、非共通範囲に含まれる文字列は存在しない。そのため、Ｓ８６ではＹＥＳ判定される。 Since subroutine 3 has already been described, only a brief account of the blank page determination for the second document is given here. Character string table 2 of the second document contains character strings whose coordinates overlap the common range (for example, “title”), so NO is determined in S81. Those character strings (for example, “title”) match the common character strings stored in the common character string table, so YES is determined in S85. Character string table 2 contains only character strings that are the same as the common character strings in the common character string table, and no character string lies in the non-common range. YES is therefore determined in S86.

このように、２枚目の原稿の白紙判断では、Ｓ８１にてＮＯ判定、Ｓ８５にてＹＥＳ判定、Ｓ８６にてＹＥＳ判定される。そして、Ｓ８７にて、白紙フラグが立てられる。そして、Ｓ８７にて白紙フラグを立てる処理が終わると、サブルーチン３は終了する。 Thus, in the blank page determination for the second document, NO is determined in S81, YES in S85, and YES in S86. The blank flag is then set in S87, and when that processing is complete, subroutine 3 ends.

サブルーチン３の終了後、図５のメインフローに戻り、Ｓ２２の処理が行われる。Ｓ２２では、Ｓ２１の判断結果に応じて、原稿（ここでは、２枚目の原稿）を白紙除去する処理がＣＰＵ７１ａにより実行される。このＳ２２の白紙除去処理（Ｓ１３と同じ処理）はサブルーチン化されており、図１４に示すサブルーチン４が読み出される。そして、サブルーチン４の実行により、２枚目の原稿は白紙除去される。すなわち、ＲＡＭ７１ｃに保存した画像データは削除される。 After the subroutine 3 is completed, the process returns to the main flow of FIG. 5 and the process of S22 is performed. In S22, according to the determination result in S21, the CPU 71a executes processing for removing a blank sheet of the original (here, the second original). This blank paper removal process of S22 (the same process as S13) is made into a subroutine, and subroutine 4 shown in FIG. 14 is read out. Then, by executing subroutine 4, the second original is removed. That is, the image data stored in the RAM 71c is deleted.

その後、処理はＳ２３に移行する。Ｓ２３では、原稿センサ４９の出力に基づいて、原稿トレイ４２上に次の原稿があるか判定される。この段階では、２枚目までしか原稿の読み取りを終了しておらず、原稿トレイ４２上には、原稿が残されているため、Ｓ２３ではＹＥＳ判定される。 Thereafter, the process proceeds to S23. In S 23, it is determined whether there is a next document on the document tray 42 based on the output of the document sensor 49. At this stage, only the second page has been read, and the document remains on the document tray 42, so a YES determination is made in S23.

そのため、白紙除去シーケンスはＳ２に戻り、ＡＤＦ４０により、原稿トレイ４２上にセットされた三枚目の原稿が、表面を上に向けた状態で搬送路４３に送りだされる。そして、原稿の先端が読取位置Ｐに達すると、ＣＩＳ３０による原稿の読み取りが行われる。 Therefore, the blank paper removal sequence returns to S2, and the third document set on the document tray 42 is sent to the transport path 43 by the ADF 40 with the front side facing up. When the leading edge of the document reaches the reading position P, the document is read by the CIS 30.

In S5, which follows S4, it is determined whether the document read in S4 is the first document. Because the document read in S4 is the third, S5 yields NO and the process proceeds to S8. In S8, it is determined whether the document read in S4 is the second document. Because it is the third, S8 also yields NO. The process then proceeds to S14, where the CPU 71a stores the image data of the document read in S4 in the RAM 71c. The image data of the third document is thus stored in the RAM 71c.

Once the read image data has been stored in the RAM 71c in S14, S15 is executed. S15 is the same process as S7 and S9: subroutine 1 shown in FIG. 7 is called, and a character table and a character string table are created from the image data read for the third document. This produces character table 3 and the character string table 3 shown in FIG. 10c. Character table 3 is omitted from the drawings.

When S15 is complete, blank page determination is performed for the third document in S21. The process then proceeds to S22, where the document is removed as blank or kept according to the determination result of S21. When S22 ends, the process returns to S23.

This processing is repeated. When blank removal has been completed for all the documents set on the document tray 42 (five in this example), no document remains on the tray, so S23 yields NO and the blank removal sequence ends.

The following briefly describes how blank page determination is performed for each of the third to fifth documents.
The character string table 3 of the third document shown in FIG. 6 contains, as shown in FIG. 10c, a character string whose coordinates overlap the common range (for example, “title”), so S81 yields NO. That character string (for example, “title”) also matches the common character string stored in the common character string table, so S85 yields YES. However, the character string table 3 also contains the character string “Aiueo”, whose coordinates lie outside the common range, so the non-common range is not blank and S86 yields NO. A non-blank flag is therefore set in S88, and the third document is not removed.

Next, the fourth document shown in FIG. 6, in which only the characters “japan” are printed at the left edge, is described. As shown in FIG. 10d, the character string table 4 of the fourth document stores the character string “japan”. Because the coordinates of this character string overlap the common range, S81 yields NO. However, “japan” does not match the common character string “title” recorded for the common range, so S85 yields NO and a non-blank flag is set in S89. The fourth document is therefore not removed.

Next, the fifth document shown in FIG. 6, which is entirely blank, is described. As shown in FIG. 10e, the character string table 5 of the fifth document stores no character strings at all. Consequently, both S81 and S82 yield NO, a blank flag is set in S84, and the fifth document is removed as blank.

As described above, Embodiment 1 can remove an entirely blank document (the fifth document in FIG. 6). In addition, because blank page determination is performed with the common character strings excluded, Embodiment 1 can also remove a quasi-blank document (the second document in FIG. 6, on which only the common character strings are printed and no body image is printed).

Furthermore, in Embodiment 1, even when a character string overlaps the common range, a non-blank flag is set if that character string does not match the common character string (S81, S85, S89). A document such as the fourth document in FIG. 6, on which a different character string is printed at the coordinates of a common character string, is therefore not removed. If an exclusion range were set as in the conventional technique, such a document would be removed as blank even though a different character string is printed at the coordinates of the common character string, because those coordinates fall inside the exclusion range. Because Embodiment 1 does not remove such a document, its blank page determination is more accurate than the conventional method. Not requiring an exclusion range also saves the user the effort of setting one.
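The decision rule described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the flowchart of subroutine 3: the names `Table`, `Coord`, and `is_blank` are hypothetical, "overlapping the common range" is reduced to exact coordinate equality, and the special handling of numerals is left out.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]      # hypothetical (x, y) position of a character string
Table = Dict[Coord, str]     # character string table: coordinates -> string


def is_blank(page: Table, common: Table) -> bool:
    """Return True if the page holds nothing except the common strings."""
    for coord, text in page.items():
        if coord not in common:
            return False     # text outside the common range: body content
        if text != common[coord]:
            return False     # same position, but a different string
    return True              # no strings at all, or common strings only


common = {(10, 5): "title"}
print(is_blank({(10, 5): "title"}, common))   # quasi-blank page
print(is_blank({(10, 5): "japan"}, common))   # different text at the common position
```

Comparing the text as well as the position is what separates this rule from a fixed exclusion area, which would ignore whatever is printed at those coordinates.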

In Embodiment 1, the common character strings are detected from the recognized characters, which allows them to be detected more accurately than detection based on pixel density. Blank page determination can therefore be performed accurately.

In Embodiment 1, the common character strings also include consecutive numerals whose coordinates coincide. This allows the page numbers printed on the documents to be treated as a common character string, so blank page determination can be performed with the page numbers excluded, which improves its accuracy.

In Embodiment 1, the common character strings are detected with the central area CK of the document excluded. This shortens detection compared with searching the entire area of the document. Common character strings are typically page numbers, dates, file names, and the like, which are usually printed near the edges of a document. Excluding the central area CK in advance therefore prevents false detection of common character strings, so blank page determination can be performed accurately.
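As a rough illustration of excluding the central area before searching for common character strings, the sketch below filters a character string table down to the strings located in an outer margin band. The function name `drop_central_area`, the page dimensions, and the 15% margin are assumptions made for the example; the description does not give the size of the central area CK.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]
Table = Dict[Coord, str]


def drop_central_area(table: Table, width: int, height: int,
                      margin: float = 0.15) -> Table:
    """Keep only strings whose coordinates lie in the outer margin band."""
    left, right = width * margin, width * (1 - margin)
    top, bottom = height * margin, height * (1 - margin)
    return {
        (x, y): text
        for (x, y), text in table.items()
        if not (left <= x <= right and top <= y <= bottom)
    }


page = {(10, 5): "title", (100, 150): "Aiueo", (190, 290): "3"}
edge_only = drop_central_area(page, width=200, height=300)
print(edge_only)   # only the strings near the page edges remain
```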

In Embodiment 1, the common character strings are detected using only the character string tables of the first and second documents. Once the common character strings have been detected, reading of the remaining documents and blank page determination can therefore proceed in parallel.

<Embodiment 2>
Embodiment 2 is described with reference to FIGS. 15 to 17. In Embodiment 1, the common character string table is created by comparing the character string tables of the first and second documents, and blank page determination for the third and subsequent documents is based on that table.

Embodiment 2 is the same as Embodiment 1 in that the common character string table is created by comparing the character string tables of the first and second documents. It differs in that the created common character string table is then updated by comparing it with the character string tables of the third and subsequent documents. To support this change, the blank removal sequence of Embodiment 2 adds the process of S16 (common character string table update) to the blank removal sequence of Embodiment 1.

In addition, whereas Embodiment 1 performs blank page determination each time a document image is read, Embodiment 2 first reads the images of all the documents and then performs blank page determination for all of them together. Accordingly, the blank removal sequence of Embodiment 2 adds S24 to S28 to the sequence of Embodiment 1 and omits S12, S13, S21, and S22.

The differences from Embodiment 1 are described below.
<First difference>
The process of S16 is described using, as an example, the update of the common character string table that accompanies reading of the third document. The process of S16 is implemented as a subroutine; in S16, subroutine 5 shown in FIG. 16 is called. Subroutine 5 consists of the eight steps S100 to S107 and is executed by the CPU 71a. In the following description, “m” denotes a row number of the common character string table.

First, in S100, the row number “m” of the common character string table is set to “1”. In S101, the character string table of the third document is searched for the same coordinates as those in row “m” of the common character string table. If the same coordinates are not found, S102 yields NO and row “m” is deleted from the common character string table in S105.

If the same coordinates are found in the character string table of the third document, S102 yields YES and the process proceeds to S103. In S103, it is determined whether the character string found at those coordinates in the character string table of the third document is the same as the common character string.

If the character strings are the same, S103 yields YES. If they are not, the process proceeds to S104, where it is determined whether the character string in row “m” of the common character string table and the corresponding character string in the character string table are both numerals. If both are numerals, S104 yields YES.

If either S103 or S104 yields YES, the process proceeds to S106, where the row number “m” of the common character string table is incremented (+1). Here, therefore, the row number is incremented from “1” to “2”.

If S104 yields NO, row “m” is deleted from the common character string table in S105, just as when S102 yields NO. After S105 the process proceeds to S106, and the row number is incremented from “1” to “2” as described above.

In S107, it is then determined whether all rows of the common character string table have been searched. Here only the first row has been searched, so S107 yields NO. The process therefore returns to S101, and S101 to S106 are executed again in the manner described above. Once all rows of the common character string table have been searched, S107 yields YES and subroutine 5 ends.

Subroutine 5 thus deletes, from the common character strings stored in the common character string table, any common character string that is not contained in the character string table of a newly read third or subsequent document (S105).

As a result, the common character string table is updated, for example, in the order upper → middle → lower as shown in FIG. 17, until finally only the common character strings shared by all the documents, together with their coordinates, remain. A character string table that contains no character strings, such as the one in FIG. 10e, is not used when creating the common character strings. The process of S16 executed by the CPU 71a realizes the feature of the present invention that “the detection unit updates the common character string by deleting a common character string that is not included in the newly generated character string table each time the character string table is generated by the character string table generation unit”.
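The update performed by subroutine 5 can be sketched as follows. This is a minimal illustration under assumptions, not the implementation: the common character string table is modelled as a dictionary from coordinates to strings, `update_common` is a hypothetical name, and `str.isdigit` stands in for the "both are numerals" test of S104.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]
Table = Dict[Coord, str]


def update_common(common: Table, page: Table) -> Table:
    """Return the common table after comparing it with one newly read page."""
    if not page:
        return dict(common)            # a table with no strings is not used
    kept: Table = {}
    for coord, text in common.items():     # row loop (S100, S106, S107)
        other = page.get(coord)            # S101: look up the same coordinates
        if other is None:                  # S102 NO -> S105: drop the row
            continue
        same = other == text                           # S103
        both_numeric = other.isdigit() and text.isdigit()  # S104
        if same or both_numeric:
            kept[coord] = text
    return kept


common = {(10, 5): "title", (90, 95): "1", (10, 95): "A.doc"}
page3 = {(10, 5): "title", (90, 95): "3", (50, 50): "Aiueo"}
common = update_common(common, page3)
print(common)
```

Building a new dictionary instead of deleting rows in place is a choice made for this sketch; it avoids having to adjust a row index when a row is removed during the loop.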

<Second difference>
Next, the processes of S24 to S28, which constitute the second difference, are described with reference to FIG. 15. In the following description, “n” denotes the page number of a read document.

The processes of S24 to S28 use the common character string table updated in S16 to perform blank page determination and blank removal together for all n documents read by the image reading unit 5. They consist of the five steps S24 to S28. In S24, the page number “n” of the document is set to “0”. In S25, the page number “n” is incremented (+1), so here it is incremented from “0” to “1”.

In S26, blank page determination is performed for the n-th document. S26 is the same process as S12 and S21 of Embodiment 1, and subroutine 3 is called. Using the common character string table updated in S16, it is determined whether the n-th document (here, the first document) is blank.

After blank page determination has been performed for the first document in S26, the process proceeds to S27, where blank removal is performed for the n-th document. S27 is the same process as S13 and S22 of Embodiment 1, and subroutine 4 is called. In S27, the document (here, the first document) is removed as blank or kept according to the determination result of S26.

In S28, it is determined whether the number n of documents for which blank page determination has been completed has reached the number of pages read by the image reading unit 5. If it has not, S28 yields NO and the process returns to S25.

In this way, blank page determination and blank removal are performed for each document read by the image reading unit 5. When these processes have been completed for all the documents, S28 yields YES and the blank removal sequence ends.
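The batch loop of S24 to S28 might look like the sketch below. The names `remove_blank_pages` and `only_common` are hypothetical, and the blank test passed in is a simplified stand-in for subroutine 3. Unlike the flow described above, the end-of-batch check is placed at the head of the loop so that an empty batch is handled as well.

```python
from typing import Callable, Dict, List, Tuple

Table = Dict[Tuple[int, int], str]


def remove_blank_pages(pages: List[Table], common: Table,
                       is_blank: Callable[[Table, Table], bool]) -> List[Table]:
    """Judge every stored page against the final common table; keep non-blank ones."""
    kept = []
    n = 0                                   # S24: page counter starts at 0
    while n < len(pages):                   # S28: all read pages judged?
        n += 1                              # S25: move to the n-th page
        page = pages[n - 1]
        if not is_blank(page, common):      # S26: blank page determination
            kept.append(page)               # S27: blank pages are dropped
    return kept


def only_common(page: Table, common: Table) -> bool:
    """Simplified blank test: every string matches a common string in place."""
    return all(common.get(coord) == text for coord, text in page.items())


pages = [{(0, 0): "title", (5, 5): "abc"}, {(0, 0): "title"}, {}]
print(remove_blank_pages(pages, {(0, 0): "title"}, only_common))
```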

In Embodiment 2, the common character strings are detected from the character string tables of all the documents read by the image reading unit 5, so character strings shared by all the documents can be detected as the common character strings. Blank page determination can therefore be performed with only those shared character strings excluded, which makes the determination accurate.

Also in Embodiment 2, the common character string table is updated each time a character string table is generated for a newly read document. Updating the table incrementally in this way requires less effort to search for common character strings than creating them from the character string tables of all the documents after every document has been read. The common character string table can therefore be created, and blank page determination and removal performed, efficiently.

<Embodiment 3>
Embodiment 3 is described next. Embodiment 1 uses subroutine 3 for blank page determination. Embodiment 3 performs blank page determination by a different method: a document is determined to be non-blank when its character string table contains a character string other than the common character strings stored in the common character string table.

As in Embodiment 1, a concrete example is described for the five documents shown in FIG. 6. It is assumed that the character string tables shown in FIGS. 10a to 10e have been generated for the documents using subroutine 1, as in Embodiment 1, and that the common character string table shown in FIG. 12 has been created using subroutine 2.

The common character string table shown in FIG. 12 stores three common character strings: “title”, “number”, and “A.doc”. The character string table 1 shown in FIG. 10a contains the character strings “abc” and “No”, which are not among the common character strings. The character string table 3 shown in FIG. 10c contains the character string “Aiueo”, which is not among the common character strings. The character string table 4 shown in FIG. 10d contains the character string “japan”, which is likewise not among the common character strings.

Accordingly, the first document (character string table 1), the third document (character string table 3), and the fourth document (character string table 4) can be determined to be non-blank.

The remaining character string tables 2 and 5, on the other hand, either contain only common character strings or contain no character strings at all, so the second document (character string table 2) and the fifth document (character string table 5) can both be determined to be blank. Embodiment 3 thus has the advantage that whether a document is blank or non-blank can be determined by the simple process of comparing its character strings with the common character strings.
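The comparison used in Embodiment 3 can be illustrated with a short sketch. The function name and the sample page contents are assumptions chosen to resemble the example, not the actual contents of FIGS. 10a to 10e, and the “number” entry of the common character string table is modelled by accepting any all-digit string.

```python
from typing import Iterable, Set


def is_blank_by_strings(strings: Iterable[str], common: Set[str],
                        digits_are_common: bool = True) -> bool:
    """A page is non-blank as soon as it holds one string that is not common."""
    for s in strings:
        if s in common:
            continue
        if digits_are_common and s.isdigit():
            continue                 # page numbers count as a common string
        return False
    return True


common = {"title", "A.doc"}
print(is_blank_by_strings(["title", "2", "A.doc"], common))   # common strings only
print(is_blank_by_strings(["japan"], common))                 # contains other text
```

Because coordinates are not consulted, this check is simpler than the position-aware rule of Embodiment 1.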

<Other embodiments>
The present invention is not limited to the embodiments described above with reference to the drawings. For example, the following embodiments are also included in the technical scope of the present invention.

(1) In Embodiments 1 and 2, a multifunction peripheral is given as an example of the image reading apparatus, but a printing unit is not strictly required; any configuration that includes at least the control unit 70 and the image reading unit 5 suffices.

(2) Embodiments 1 and 2 show examples in which blank page determination is performed using the character string tables; specifically, a document is determined to be non-blank when its character string table contains a character string other than the common character strings. However, blank page determination need only be performed with the common character strings excluded. For example, the common character strings may be removed from the document image, and whether the document is blank may then be determined by comparing the number of occurrences of pixel densities in the remaining image with a threshold.
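A pixel-based variant of this kind could be sketched as below. Everything concrete here is an assumption made for illustration: the image is a list of rows of 8-bit grayscale values, the common character strings are given as hypothetical bounding boxes `(x0, y0, x1, y1)`, and the density level and dark-pixel limit are arbitrary.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper bounds exclusive


def is_blank_by_density(image: List[List[int]], common_boxes: Sequence[Box],
                        dark_level: int = 128, max_dark: int = 0) -> bool:
    """Count dark pixels outside the common-string boxes and compare to a limit."""
    dark = 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            masked = any(x0 <= x < x1 and y0 <= y < y1
                         for (x0, y0, x1, y1) in common_boxes)
            if not masked and value < dark_level:
                dark += 1
    return dark <= max_dark


image = [[255] * 4 for _ in range(4)]
image[1][1] = 0                                    # one dark pixel
print(is_blank_by_density(image, [(0, 0, 2, 2)]))  # dark pixel is masked out
print(is_blank_by_density(image, []))              # nothing masked
```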

(3) In Embodiment 2, a common character string table shared by all the documents is created by repeatedly comparing the common character string table with the character string table of each newly read document and updating it. The method of creating a common character string table shared by all the documents (that is, the method of detecting common character strings from all the character string tables created by the CPU 71a for all the documents) is not limited to the method of Embodiment 2. For example, the common character strings may be created from the character string tables of all the documents after all the documents have been read.

(4) In Embodiment 1, numerals are included among the character strings of the character string table; graphic symbols may additionally be included.

1 ... Multifunction peripheral (an example of the “image reading apparatus” and “image forming apparatus” of the present invention)
3 ... Printing unit (an example of the “printing unit” of the present invention)
5 ... Image reading unit
30 ... CIS (an example of the “reading unit” of the present invention)
40 ... ADF
43 ... Conveying path
70 ... Control unit
71a ... CPU (an example of the “detection unit”, “determination unit”, “character reading unit”, “character string recognition unit”, and “character string table generation unit” of the present invention)
71b ... ROM
71c ... RAM

Claims

An image reading apparatus comprising:
a reading unit that reads an image of a document;
a character reading unit that recognizes characters from the image of each page read by the reading unit and outputs the characters in association with their coordinates;
a character string recognition unit that recognizes a character string printed on each page from the recognized characters and the coordinates of the characters;
a character string table generation unit that generates, for each document whose image has been read, a character string table in which the character string and its coordinates are stored in association with each other;
a detection unit that examines the character string table of each document and detects a common character string, that is, a character string formed at the same coordinates on a plurality of pages and having the same characters; and
a determination unit that determines whether a document is blank or non-blank with the common character string detected by the detection unit excluded.

The image reading apparatus according to claim 1, wherein the characters include numerals, and the common character string includes consecutive numerals whose coordinates coincide on the plurality of pages of the document.

The image reading apparatus according to claim 1 or claim 2, wherein the detection unit detects the common character string with a partial area including the center of the document excluded.

The image reading apparatus according to any one of claims 1 to 3, wherein the detection unit detects the common character string from all the character string tables generated by the character string table generation unit.

The image reading apparatus according to claim 4, wherein the detection unit updates the common character string by deleting a common character string that is not included in a newly generated character string table each time a character string table is generated by the character string table generation unit.

The image reading apparatus according to any one of claims 1 to 5, wherein the determination unit determines that the document is non-blank when the character string table contains a character string other than the common character string.

The image reading apparatus according to any one of claims 1 to 4, wherein the determination unit determines that the document is non-blank when the character string table contains a character string whose coordinates overlap those of the common character string but whose characters differ from it.

An image forming apparatus comprising:
the image reading apparatus according to any one of claims 1 to 7; and
a printing unit that prints an image read by the image reading apparatus on a recording medium.