JP5962234B2

JP5962234B2 - Image processing apparatus, image processing method, and program

Info

Publication number: JP5962234B2
Application number: JP2012130085A
Authority: JP
Inventors: 大黒　慶久; 慶久大黒
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 2012-06-07
Filing date: 2012-06-07
Publication date: 2016-08-03
Anticipated expiration: 2032-06-07
Also published as: JP2013255118A

Description

The present invention relates to an image processing apparatus, an image processing method, and a program.

Conventionally, a technique is known in which, when an image reading apparatus reads an original containing a plurality of documents, partition sheets are inserted between the documents in advance so that the document separation positions can be identified and the subsequent per-document processing can be automated.

For example, Patent Document 1 discloses a data processing apparatus that generates a unique management number associated with bibliographic information and prints a document partition sheet bearing the management number and the bibliographic information. By scanning each document with the partition sheet thus obtained attached to its first page, a plurality of documents can be read in one batch while still being managed at accurate separation positions.

In the prior art, dedicated paper, entirely blank paper, single-color paper, and the like are used as partition sheets. Partition sheets on which predetermined information is printed, such as the partition sheet of Patent Document 1, are also known.

However, there is a demand to increase the utility value of partition sheets rather than using them merely to identify document separation positions. In the technique of Patent Document 1, predetermined information is printed on the partition sheet, but printing that information requires user operations, which places a heavy burden on the user.

The present invention has been made in view of the above, and an object of the present invention is to provide an image processing apparatus, an image processing method, and a program that can increase the utility value of a partition sheet with a simple operation.

To solve the above problem and achieve the object, the present invention comprises: a reading unit that reads, page by page, an original including original pages and a plurality of partition pages that separate the original pages; a page determination unit that determines, based on the page read data of each page read by the reading unit, whether the target page corresponding to that page read data is an original page or a partition page; a related information generation unit that, for each of a plurality of original page groups made up of pages determined by the page determination unit to be original pages and delimited by the plurality of partition pages, extracts part of the information contained in the page read data of each original page in the group and generates related information about the group; a data editing unit that, for each piece of page read data corresponding to a page determined by the page determination unit to be a partition page, adds the related information of the original page group immediately preceding that partition page to the page read data corresponding to the partition page, thereby obtaining edited page read data; and an output unit that replaces the page read data corresponding to each of the plurality of partition pages with the corresponding edited page read data and outputs original read data corresponding to the original.

The present invention is also an image processing method executed by an image processing apparatus, the method comprising: a reading step of reading, page by page, an original including original pages and a plurality of partition pages that separate the original pages; a page determination step of determining, based on the page read data of each page read in the reading step, whether the target page corresponding to that page read data is an original page or a partition page; a related information generation step of, for each of a plurality of original page groups made up of pages determined in the page determination step to be original pages and delimited by the plurality of partition pages, extracting part of the information contained in the page read data of each original page in the group and generating related information about the group; a data editing step of, for each piece of page read data corresponding to a page determined in the page determination step to be a partition page, adding the related information of the original page group immediately preceding that partition page to the page read data corresponding to the partition page, thereby obtaining edited page read data; and an output step of replacing the page read data corresponding to each of the plurality of partition pages with the corresponding edited page read data and outputting original read data corresponding to the original.

The present invention is also a program that causes a computer to function as: a reading unit that reads, page by page, an original including original pages and a plurality of partition pages that separate the original pages; a page determination unit that determines, based on the page read data of each page read by the reading unit, whether the target page corresponding to that page read data is an original page or a partition page; a related information generation unit that, for each of a plurality of original page groups made up of pages determined by the page determination unit to be original pages and delimited by the plurality of partition pages, extracts part of the information contained in the page read data of each original page in the group and generates related information about the group; a data editing unit that, for each piece of page read data corresponding to a page determined by the page determination unit to be a partition page, adds the related information of the original page group immediately preceding that partition page to the page read data corresponding to the partition page, thereby obtaining edited page read data; and an output unit that replaces the page read data corresponding to each of the plurality of partition pages with the corresponding edited page read data and outputs original read data corresponding to the original.

According to the present invention, the utility value of a partition sheet can be increased with a simple operation.

FIG. 1 is a block diagram showing the configuration of the image processing apparatus.
FIG. 2 is a diagram showing an example of an original to be read by the image processing apparatus.
FIG. 3 is a diagram showing an example of an original to be read by the image processing apparatus.
FIG. 4 is a diagram schematically showing the data structure of the representative character string storage unit.
FIG. 5-1 is a diagram showing an example of edited page read data.
FIG. 5-2 is a diagram showing an example of edited page read data.
FIG. 6 is a diagram showing an example of original read data obtained for an original read by the reading unit.
FIG. 7 is a flowchart showing processing by the image processing apparatus.
FIG. 8 is a diagram showing original read data according to a first modification.
FIG. 9 is a diagram showing edited page read data according to a second modification.
FIG. 10 is a diagram showing edited page read data according to a third modification.
FIG. 11 is a diagram showing the relationship between original read data and printing paper on which an image has been formed.
FIG. 12 is a diagram showing the relationship between original read data and printing paper on which an image has been formed.
FIG. 13 is a flowchart showing processing of the image processing apparatus according to a fourth modification.
FIG. 14 is a block diagram showing the hardware configuration of the image processing apparatus.
FIG. 15 is a diagram showing the overall configuration of an image processing system.

Embodiments of an image processing apparatus, an image processing method, and a program are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus 1 according to the embodiment. The image processing apparatus 1 includes a reading unit 100, a page read data storage unit 101, an output unit 102, a page number counter 103, a page determination unit 104, an OCR (Optical Character Recognition) unit 105, a representative character string extraction unit 106, a representative character string storage unit 107, a related information generation unit 108, and a data editing unit 109.

The reading unit 100 reads an original page by page. Specifically, an original placed on an ADF (Auto Document Feeder) or the like is read, in the order in which the sheets are fed, by the head of the reading unit 100, which has an optical sensor, to obtain image data for each page, that is, page read data. The reading unit 100 writes the obtained page read data to the page read data storage unit 101. The page read data storage unit 101 stores the page read data of the plurality of pages in the original as one file, that is, as original read data for the whole original. In the present embodiment, only one side (the front side) of each sheet is read.

FIGS. 2 and 3 are diagrams showing an example of an original to be read by the image processing apparatus 1 according to the present embodiment. The original to be read includes a plurality of sheets. More specifically, the original includes original sheets (original pages) on which the content of the original appears and partition sheets (partition pages) for separating the documents in the original.

In the original shown in FIGS. 2 and 3, partition sheet a (partition page 1) is inserted between original page 2 and original page 3 to separate original pages 1 to 2 from original pages 3 to 4. Partition sheet b (partition page 2) is inserted between original page 4 and original page 5 to separate original pages 3 to 4 from original pages 5 to 6. The original pages are thereby classified into three groups: original page groups A, B, and C. A user who wishes to manage the pages of an original divided into a plurality of original page groups inserts partition sheets in advance at the boundaries between the original page groups in the original to be read by the reading unit 100.

Here, a partition sheet is a sheet different from the sheets of the original pages. Specifically, a partition sheet may be any sheet that allows the page determination unit 104, described later, to distinguish partition pages from original pages. Examples include blank paper, entirely single-color paper, and tab paper suited to partitioning. The present embodiment describes an example in which blank paper or entirely single-color paper is used as the partition sheet.

Returning to FIG. 1, the page number counter 103 counts the number of pages read by the reading unit 100. The page number counter 103 counts each original page and each partition page, which corresponds to a partition sheet, as one page. That is, for the original shown in FIGS. 2 and 3, it counts page number 1 for original page 1, page number 2 for original page 2, and page number 3 for partition page 1.

The page determination unit 104 determines, based on the page read data obtained by the reading unit 100, whether the page read by the reading unit 100 is an original page or a partition page. Specifically, the page determination unit 104 obtains the luminance distribution of the page read data and determines that the page is a partition page when the frequency of a specific luminance value is greater than a preset threshold.
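The luminance-based determination described above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the function name `is_partition_page`, the representation of a page as rows of 8-bit grayscale values, the use of the most frequent luminance value as the "specific luminance value", and the threshold of 0.95 (expressed as a fraction of all pixels) are all hypothetical and are not specified in this document.

```python
from collections import Counter


def is_partition_page(pixels, threshold=0.95):
    """Return True if one luminance value dominates the page.

    pixels: list of rows, each row a list of grayscale values (0-255).
    threshold: assumed fraction of pixels that must share one value.
    """
    # Build the luminance distribution (histogram) of the page.
    counts = Counter(value for row in pixels for value in row)
    total = sum(counts.values())
    if total == 0:
        return False
    # Assumption: the "specific luminance value" is the most frequent one.
    _, peak_count = counts.most_common(1)[0]
    return peak_count / total > threshold


blank_sheet = [[255] * 10 for _ in range(10)]
text_sheet = [[255] * 10 if r % 2 else [0] * 10 for r in range(10)]
print(is_partition_page(blank_sheet), is_partition_page(text_sheet))
```

A real scan contains sensor noise, so a practical variant would probably bin nearby luminance values together before comparing against the threshold.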

As another example, when the partition sheet is tabbed paper, the page determination unit 104 may detect the presence or absence of a tab and determine from the detection result whether the page is a partition page or an original page.

The OCR unit 105 extracts character data from the page read data stored in the page read data storage unit 101 by OCR processing. OCR processing analyzes the features of image data, judges how character-like they are, and converts them into character codes. In addition to character codes, OCR processing yields various kinds of information such as character positions, character sizes, and character recognition confidence.

Based on the result of the OCR processing by the OCR unit 105, the representative character string extraction unit 106 extracts, from the character data of each original page, a representative character string that represents the content of that page. Specifically, the representative character string extraction unit 106 extracts the character string located at the top of the page as the representative character string. For example, in original page 1 shown in FIGS. 2 and 3, the character string "Monthly Report" (月次報告書) located on the top line is extracted as the representative character string. The representative character string extraction unit 106 then writes the extracted representative character string to the representative character string storage unit 107 in association with the page number counted by the page number counter 103.

The method of extracting the representative character string is not limited to that of the embodiment, and various methods can be adopted. As other examples, the character string with the largest character size, or the character string whose first character is positioned furthest to the left on the page, may be extracted as the representative character string. As yet another example, the characters obtained by the OCR unit 105 may be analyzed grammatically, and the character string with the highest importance according to the analysis result may be extracted as the representative character string. These extraction methods may also be combined.
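As a rough illustration of the position-based and size-based selection rules above, the following sketch picks a representative character string from OCR output. The `OcrLine` record, its fields, the strategy names, and the convention that a smaller `top` value means higher on the page are assumptions made for this example, not details taken from this document.

```python
from dataclasses import dataclass


@dataclass
class OcrLine:
    text: str    # recognized character string
    top: float   # vertical position; smaller means higher on the page (assumed)
    left: float  # horizontal position of the first character
    size: float  # character size


def extract_representative(lines, strategy="top"):
    """Pick one representative string from the OCR lines of a page."""
    if not lines:
        return None
    if strategy == "top":         # string located at the top of the page
        chosen = min(lines, key=lambda line: line.top)
    elif strategy == "largest":   # string with the largest character size
        chosen = max(lines, key=lambda line: line.size)
    elif strategy == "leftmost":  # string whose first character is leftmost
        chosen = min(lines, key=lambda line: line.left)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return chosen.text


page = [
    OcrLine("Monthly Report", top=10, left=40, size=18),
    OcrLine("Sales grew in May", top=60, left=20, size=24),
    OcrLine("Details follow", top=90, left=30, size=10),
]
print(extract_representative(page))
```

Combining strategies, as the text allows, could be done by scoring each line on several criteria and choosing the highest total.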

FIG. 4 is a diagram schematically showing the data structure of the representative character string storage unit 107. The representative character string storage unit 107 stores each representative character string in association with the page number that the page number counter 103 counted for the page from which the string was extracted.

When the page determination unit 104 determines that a page is a partition page, the related information generation unit 108 generates the related information to be added to the page read data of that partition page, based on the page information and count values stored in the representative character string storage unit 107. The data editing unit 109 adds the related information obtained by the related information generation unit 108 to the page read data corresponding to the partition page being processed. Hereinafter, page read data to which related information has been added is referred to as edited page read data.

Here, the related information is information about the original page group immediately preceding the partition page being processed. FIGS. 5-1 and 5-2 are diagrams showing examples of edited page read data to which related information has been added. The related information includes information that associates the page number of each original page in the original page group immediately preceding the partition sheet being processed with its representative character string. The related information further includes the display text "Partition sheet" (仕切り紙) as a partition image indicating that the page is a partition page. In the edited page read data corresponding to partition page 1 and partition page 2 shown in FIGS. 5-1 and 5-2, the "Partition sheet" text is placed at the same position on the page, and below it the pairs of page number and representative character string are arranged from top to bottom in page order.

In this way, the related information generation unit 108 according to the present embodiment generates related information such that the same partition image is placed at the same position in every piece of edited page read data. As described above, the related information generation unit 108 also generates the related information such that the pairs of page number and representative character string are laid out in the edited page read data according to the same layout rule.

Placing a fixed character string at the same position and arranging the representative character strings according to the same layout rule gives the partition pages a consistent layout, so the user can easily recognize a partition page by sight.

The related information generated by the related information generation unit 108 may be any information related to the original page group immediately preceding the partition sheet, and its type is not limited to that of the embodiment. For example, the partition image is not limited to that of the embodiment and may be a picture instead of a character string. The related information may also consist only of the number of data items or only of the representative character strings. In the present embodiment, the representative character strings of all original pages are included in the related information. Instead, only the representative character string of a predetermined original page, such as the first original page of the original page group, may be included.
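The layout rule for related information (a fixed label first, then page number and representative character string pairs in page order) might be sketched as below. The function name `build_related_info`, the label text, and the `"number: text"` line format are hypothetical choices for illustration; the actual rendering onto the partition page image is not specified here.

```python
def build_related_info(entries, label="Partition sheet"):
    """Lay out related information as text lines for a partition page.

    entries: iterable of (page_number, representative_string) pairs for
    the original page group immediately preceding the partition page.
    """
    # The fixed label always occupies the first line, so every
    # partition page has the same layout.
    lines = [label]
    # Pairs are listed from top to bottom in page order.
    for page_number, text in sorted(entries):
        lines.append(f"{page_number}: {text}")
    return lines


for line in build_related_info([(2, "Sales summary"), (1, "Monthly Report")]):
    print(line)
```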

FIG. 6 is a diagram showing an example of the original read data obtained for an original read by the reading unit 100. For each partition page, the original read data contains the edited page read data in place of the page read data obtained by the reading unit 100.

The output unit 102 transmits the original read data stored in the page read data storage unit 101 to another apparatus. The original read data in the page read data storage unit 101 contains the edited page read data corresponding to the partition pages. As another example, the output unit 102 may be an image forming apparatus that forms an image corresponding to the original read data on printing paper.

FIG. 7 is a flowchart showing processing by the image processing apparatus 1. First, the page number counter 103 clears its count (step S1). Next, the reading unit 100 reads one page of the original and writes the obtained page read data to the page read data storage unit 101 (step S2). The page number counter 103 then increments its value by 1 (step S3).

Next, the page determination unit 104 performs page determination. If the page is an original page (No in step S4), the OCR unit 105 performs OCR processing on the page read data obtained by the reading unit 100 (step S5). The representative character string extraction unit 106 then extracts a representative character string from the page read data based on the result of the OCR processing (step S6), and writes the extracted representative character string to the representative character string storage unit 107 in association with the page number of the page read data from which it was extracted (step S7).

When reading of the original is finished (Yes in step S8), the output unit 102 outputs the page read data stored in the page read data storage unit 101 as original read data (step S9). If any page has not yet been read (No in step S8), the process returns to step S2, and the processing from step S2 onward is applied to the unprocessed page.

If the page read by the reading unit 100 is a partition page in step S4 (Yes in step S4), the related information generation unit 108 generates related information based on the representative character strings and page numbers stored in the representative character string storage unit 107 (step S11).

At this point, the representative character string storage unit 107 holds the representative character string of each original page in the original page group immediately preceding the partition page being processed by the related information generation unit 108. The related information generation unit 108 can therefore obtain the representative character strings of the immediately preceding original page group by referring to the representative character string storage unit 107.

Next, the data editing unit 109 adds the related information to the page read data corresponding to the partition page to obtain edited page read data (step S12).

When editing of the page read data of the partition page is complete, the data editing unit 109 deletes the representative character strings stored in the representative character string storage unit 107 (step S13). The process then proceeds to step S8. This completes the reading process performed by the image processing apparatus 1.
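The flow of steps S1 to S13 can be summarized in a short sketch. This is a simplified model under stated assumptions, not the apparatus itself: each scanned page is a dictionary whose `is_partition` flag stands in for the page determination unit 104 and whose `title` field stands in for the OCR and representative character string extraction; the function name `process_original` and the output record layout are likewise hypothetical.

```python
def process_original(scanned_pages):
    """Model of the FIG. 7 flow over a list of already scanned pages."""
    page_count = 0        # S1: clear the page number counter
    stored = []           # stands in for page read data storage unit 101
    representative = []   # stands in for storage unit 107: (page no., string)

    for page in scanned_pages:           # S2: read one page
        page_count += 1                  # S3: increment the counter
        if page["is_partition"]:         # S4: Yes, partition page
            related = list(representative)                  # S11
            stored.append({"kind": "partition",
                           "related": related})             # S12
            representative.clear()                          # S13
        else:                            # S4: No, original page
            # S5-S7: OCR, extract, and store with the page number
            representative.append((page_count, page["title"]))
            stored.append({"kind": "original", "title": page["title"]})
    return stored                        # S8 Yes, then S9: output


def orig(title):
    return {"is_partition": False, "title": title}


PARTITION = {"is_partition": True}
result = process_original([orig("A1"), orig("A2"), PARTITION,
                           orig("B1"), orig("B2"), PARTITION,
                           orig("C1"), orig("C2")])
print(result[2]["related"])
print(result[5]["related"])
```

Because the counter counts partition pages as well, the pages after the first partition sheet carry page numbers 4 and 5 in this model, and the last group has no partition page following it and so produces no related information.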

Thus, with the image processing apparatus 1 according to the present embodiment, the user merely inserts predetermined partition sheets at the document separation positions, and the content of the document delimited by each partition sheet is added to the page read data corresponding to that partition sheet. By referring to the page read data corresponding to a partition page, the user can therefore grasp the characteristics and page numbers of each original page group and can also locate the next partition page.

Furthermore, when the output unit 102 is an image forming unit and an image corresponding to the original read data is formed, the user can not only identify the separation positions of the original pages from the partition sheets inserted between them, but can also grasp the content of the original page group immediately preceding each partition sheet by reading what is printed on it, and can locate the next partition sheet. Thus, merely by inserting partition sheets, edited page read data from which partition sheets of higher utility value can be printed is obtained. That is, the utility value of a partition sheet can be increased with a simple operation.

The present invention has been described above using the embodiment, but various changes or improvements can be made to the embodiment.

As a first modification, as shown in FIG. 8, the data editing unit 109 may not only edit the page read data corresponding to a partition page but also move the edited page read data to the page position immediately before the preceding original page group. In the example shown in FIG. 8, the edited page read data corresponding to partition page 1 moves to page 1, and the page read data corresponding to original page 1 and original page 2 move to pages 2 and 3, respectively. Similarly, the edited page read data corresponding to partition page 2 moves to page 4, and the page read data corresponding to original page 3 and original page 4 move to pages 5 and 6, respectively.

Depending on the contents of the document, it may be more convenient for the user to have the partition sheet page, which shows the contents of a document page group, inserted immediately before those document pages. In this example, changing the page position of the edited page read data corresponding to a partition sheet page to the position immediately before the document page group lets the user view the document page contents shown in the edited page read data of the partition sheet ahead of the actual document pages. In addition, the position of the next partition sheet page can be identified from the page count printed on the partition sheet page alone, without looking at the contents of the document pages.

From the viewpoint of changing page positions, the page read data is preferably in a document format that handles multiple pages, such as PDF (Portable Document Format) or TIFF (Tagged Image File Format). With PDF, for example, the page order at viewing time can be changed by changing only the page numbers, without changing the information in the document page bodies.

As yet another example, the edited page read data corresponding to the partition sheet pages may be gathered at the beginning or the end of the document pages. For example, the edited page read data corresponding to partition sheet page 1 and partition sheet page 2 may be placed on the first and second pages, respectively, and the page read data of document pages 1 to 4 may be placed on the third to sixth pages, respectively.
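Gathering the partition pages at the beginning or the end can be sketched as a stable partition of the page list. The page tuples and the function name `collect_partitions` are hypothetical and used only for illustration.

```python
from typing import List, Tuple

Page = Tuple[str, int]  # (kind, number); kind is "doc" or "partition"


def collect_partitions(pages: List[Page], at_front: bool = True) -> List[Page]:
    """Gather all partition pages at the front (or the end) while keeping
    the relative order inside both categories unchanged."""
    partitions = [p for p in pages if p[0] == "partition"]
    documents = [p for p in pages if p[0] != "partition"]
    return partitions + documents if at_front else documents + partitions


scanned = [("doc", 1), ("doc", 2), ("partition", 1),
           ("doc", 3), ("doc", 4), ("partition", 2)]
front = collect_partitions(scanned)                  # partitions on pages 1-2
back = collect_partitions(scanned, at_front=False)   # partitions on pages 5-6
print(front)
print(back)
```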

As a second modification, as shown in FIG. 9, the edited page read data of a partition sheet page may display a plurality of representative character strings for a single document page. A representative character string is preferably determined comprehensively by evaluating its suitability from a plurality of viewpoints and quantifying the result. In this modification, therefore, the representative character string extraction unit extracts a plurality of representative character string candidates from a plurality of viewpoints. A maximum information amount for the representative character strings is also set in advance. The representative character string extraction unit then selects candidates from the extracted candidates in descending rank order until the maximum information amount is reached, and uses the selected strings as the representative character strings of the target document page.
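The selection step can be sketched as follows. The text does not define how the information amount is measured or how candidates are scored, so this sketch assumes that the information amount is the character count and that each candidate already carries a numeric score. It also assumes that selection stops at the first candidate that would exceed the limit. The function name `select_representative_strings` is hypothetical.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (text, score); the scoring is assumed


def select_representative_strings(candidates: List[Candidate],
                                  max_chars: int) -> List[str]:
    """Pick candidates in descending score order until adding the next
    one would exceed the preset maximum (measured here in characters)."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    selected: List[str] = []
    used = 0
    for text, _score in ranked:
        if used + len(text) > max_chars:
            break  # assumed stopping rule: stop at the first overflow
        selected.append(text)
        used += len(text)
    return selected


candidates = [("Sales by region", 0.7), ("Quarterly report", 0.9),
              ("Appendix", 0.4), ("Page 3", 0.1)]
chosen = select_representative_strings(candidates, max_chars=32)
print(chosen)
```

Making `max_chars` a parameter mirrors the statement that the displayable information amount may be set by the user.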

Using a plurality of representative character strings in this way makes it possible to display, on the partition sheet page, content that reflects the characteristics of each document page and to express the differences in content between document pages. The information amount of the representative character strings that can be displayed may also be made settable by the user.

As a third modification, the related information generation unit 108 may generate related information that assigns a serial number to each partition sheet page. In the example shown in FIG. 10, the serial numbers "No. 1" and "No. 2" are displayed to the right of the partition image reading "Partition sheet". Assigning serial numbers specific to the partition sheet pages in this way helps prevent a partition sheet page from being overlooked.
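A small sketch of how such related information might be assembled is shown below. The label wording, the list-of-kinds input, and the function name `label_partition_pages` are assumptions for illustration; the patent specifies only that a serial number is assigned and that, elsewhere, the page count of the preceding group may be indicated.

```python
from typing import List


def label_partition_pages(kinds: List[str]) -> List[str]:
    """Return one label per partition page, carrying its serial number
    and the page count of the document page group just before it."""
    labels: List[str] = []
    serial = 0        # serial number specific to partition pages
    group_pages = 0   # document pages counted since the last partition
    for kind in kinds:
        if kind == "partition":
            serial += 1
            labels.append(f"Partition sheet No. {serial}: {group_pages} pages")
            group_pages = 0
        else:
            group_pages += 1
    return labels


labels = label_partition_pages(
    ["doc", "doc", "partition", "doc", "doc", "doc", "partition"])
print(labels)
```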

As a fourth modification, when the output unit 102 is an image forming unit and forms the image of the document read data on printing paper, the image forming unit may form the image of each partition sheet page on a sheet of printing paper different from those of the document pages, rather than on the same sheet as a document page.

When document read data that includes partition sheet pages is printed on both sides, and the images of a document page and a partition sheet page are formed on the same sheet of printing paper, the partition sheet may lose much of its purpose of separating documents. In this modification, therefore, the image of each partition sheet page is formed on a sheet of printing paper different from those of the document pages, as described above.

FIGS. 11 and 12 show the relationship between the document read data and the printing paper on which images are formed. As shown in FIG. 11, when a partition sheet page falls on an odd-numbered page of the document read data, its image is formed on the front side of a sheet of printing paper. In this case, the image forming unit ejects the sheet whose front side carries the partition sheet page image without forming an image on its back side. The image forming unit then forms the image of the document page following the partition sheet page on the next sheet fed.

In the example shown in FIG. 11, partition sheet page 1 is formed on the front side of printing paper 2, and no image is formed on the back side of printing paper 2. The image of document page 3, which follows partition sheet page 1, is then formed on the front side of printing paper 3.

As shown in FIG. 12, when a partition sheet page falls on an even-numbered page of the document read data, duplex printing in page order would place its image on the back side of a sheet of printing paper. In this case, after printing the document page immediately preceding the partition sheet page on the front side of a sheet, the image forming unit ejects that sheet without forming an image on its back side and forms the image of the partition sheet page on the front side of the next sheet fed. The image forming unit then ejects the sheet carrying the partition sheet page image without forming an image on its back side, and forms the image of the document page following the partition sheet page on the front side of the next sheet fed.

In the example shown in FIG. 12, no image is formed on the back side of printing paper 2, and the image of partition sheet page 1 is formed on the front side of printing paper 3. No image is formed on the back side of printing paper 3, and the image of document page 4, which follows partition sheet page 1, is formed on the front side of printing paper 4.
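The sheet assignment of FIGS. 11 and 12 can be sketched as a simple planner that maps a page sequence to sheets with a front and a back side. It illustrates the rule and is not the control logic of the image forming unit; the page labels, the `[front, back]` sheet model, and the function name `plan_duplex` are assumptions.

```python
from typing import List, Optional, Tuple

Page = Tuple[str, str]        # (kind, label); kind is "doc" or "partition"
Sheet = List[Optional[str]]   # [front, back]; None means no image formed


def plan_duplex(pages: List[Page]) -> List[Sheet]:
    """Assign pages to sheet sides so that every partition page sits alone
    on the front of its own sheet, with the back left blank."""
    sheets: List[Sheet] = []
    open_sheet: Optional[Sheet] = None  # sheet with a front image, blank back
    for kind, label in pages:
        if kind == "partition":
            open_sheet = None            # eject any half-printed sheet as is
            sheets.append([label, None])  # partition page on a fresh sheet
        elif open_sheet is None:
            open_sheet = [label, None]   # document page on a new front side
            sheets.append(open_sheet)
        else:
            open_sheet[1] = label        # document page on the back side
            open_sheet = None
    return sheets


# FIG. 11 case: the partition page is the 3rd (odd) page.
fig11 = plan_duplex([("doc", "doc1"), ("doc", "doc2"),
                     ("partition", "part1"), ("doc", "doc3")])
# FIG. 12 case: the partition page is the 4th (even) page.
fig12 = plan_duplex([("doc", "doc1"), ("doc", "doc2"), ("doc", "doc3"),
                     ("partition", "part1"), ("doc", "doc4")])
print(fig11)
print(fig12)
```

In the FIG. 12 case the planner leaves the back of sheet 2 and the back of sheet 3 blank, so document page 4 starts on the front of sheet 4, as in the example above.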

FIG. 13 is a flowchart showing the processing of the image processing apparatus 1 according to the fourth modification. In the image processing apparatus 1 according to the fourth modification, the page number counter 103 is cleared (step S20). When the reading unit 100 reads one page of the document (step S21), printing paper is fed by the ADF function (step S22). The page number counter 103 then increments its value by 1 (step S23).

If the page determination finds that the target page is a document page (step S24, No), representative character string extraction processing is performed (step S25). This processing corresponds to steps S5 to S7 described with reference to FIG. 7. Next, the output unit 102 forms the image of the page read data on the front side of the printing paper (step S26).

If reading of the document has not finished (step S27, No), the reading unit 100 reads the next page (step S28). The page number counter 103 then increments its value by 1 (step S29). If the page being read is a document page (step S30, No), representative character string extraction processing is performed (step S31). Like the processing in step S25, the representative character string extraction processing in step S31 corresponds to steps S5 to S7.

Next, the output unit 102 forms the image of the page read data on the back side of the printing paper (step S32). The printing paper used in step S32 is the sheet on whose front side an image was formed in step S26.

Next, the printing paper on which the images have been formed is ejected (step S33). If reading of the document has not finished (step S34, No), the process returns to step S21. If reading of the document has finished (step S34, Yes), the process ends. If reading of the document has finished at step S27 (step S27, Yes), the process proceeds to step S33.

On the other hand, if the target page is a partition sheet page in step S24 (step S24, Yes), or if the page being read is a partition sheet page in step S30 (step S30, Yes), the process proceeds to step S41, where data editing processing is performed (step S41). This processing corresponds to steps S11 to S13 described with reference to FIG. 7. Next, the image of the edited page read data corresponding to the partition sheet page is formed on the front side of the printing paper (step S42), and the process proceeds to step S33. This completes the processing.

FIG. 14 is a block diagram showing the hardware configuration of a multifunction peripheral as an example of the image processing apparatus 1. As shown in the figure, the image processing apparatus 1 has a configuration in which a controller 10 and an engine unit 60 are connected by a PCI (Peripheral Component Interconnect) bus. The controller 10 controls the entire image processing apparatus 1 as well as drawing, communication, and input from an operation unit (not shown). The engine unit 60 is a printer engine or the like connectable to the PCI bus, for example a monochrome plotter, a one-drum color plotter, a four-drum color plotter, a scanner, or a fax unit. In addition to the so-called engine portion such as a plotter, the engine unit 60 includes an image processing portion for error diffusion, gamma conversion, and the like.

The controller 10 has a CPU 11, a north bridge (NB) 13, a system memory (MEM-P) 12, a south bridge (SB) 14, a local memory (MEM-C) 17, an ASIC (Application Specific Integrated Circuit) 16, and a hard disk drive (HDD) 18, with the north bridge (NB) 13 and the ASIC 16 connected by an AGP (Accelerated Graphics Port) bus 15. The MEM-P 12 further has a ROM (Read Only Memory) 12a and a RAM (Random Access Memory) 12b.

The CPU 11 performs overall control of the image processing apparatus 1. It has a chipset consisting of the NB 13, the MEM-P 12, and the SB 14, and is connected to other devices via this chipset.

The NB 13 is a bridge for connecting the CPU 11 to the MEM-P 12, the SB 14, and the AGP bus 15. It has a memory controller that controls reads from and writes to the MEM-P 12, a PCI master, and an AGP target.

The MEM-P 12 is a system memory used as a memory for storing programs and data, a memory for expanding programs and data, a drawing memory for the printer, and so on, and consists of the ROM 12a and the RAM 12b. The ROM 12a is a read-only memory used for storing programs and data, and the RAM 12b is a writable and readable memory used for expanding programs and data, as a drawing memory for the printer, and so on.

The SB 14 is a bridge for connecting the NB 13 to PCI devices and peripheral devices. The SB 14 is connected to the NB 13 via the PCI bus, to which a network interface (I/F) unit and the like are also connected.

The ASIC 16 is an IC (Integrated Circuit) for image processing applications that has hardware elements for image processing, and serves as a bridge connecting the AGP bus 15, the PCI bus, the HDD 18, and the MEM-C 17. The ASIC 16 consists of a PCI target and an AGP master, an arbiter (ARB) that forms the core of the ASIC 16, a memory controller that controls the MEM-C 17, a plurality of DMACs (Direct Memory Access Controllers) that perform operations such as image data rotation by hardware logic, and a PCI unit that transfers data to and from the engine unit 60 via the PCI bus. An FCU (Facsimile Control Unit) 30, a USB (Universal Serial Bus) 40, and an IEEE 1394 (Institute of Electrical and Electronics Engineers 1394) interface 50 are connected to the ASIC 16 via the PCI bus. The operation display unit 20 is connected directly to the ASIC 16.

The MEM-C 17 is a local memory used as a copy image buffer and a code buffer. The HDD (Hard Disk Drive) 18 is storage for accumulating image data, programs, font data, and forms.

The AGP bus 15 is a bus interface for graphics accelerator cards proposed to speed up graphics processing. It speeds up the graphics accelerator card by directly accessing the MEM-P 12 with high throughput.

The program executed by the image processing apparatus 1 of the present embodiment is provided by being incorporated in advance in a ROM or the like.

The program executed by the image processing apparatus 1 of the present embodiment may instead be provided by being recorded, as a file in an installable or executable format, on a computer-readable recording medium such as a CD-ROM, a flexible disk (FD), a CD-R, or a DVD (Digital Versatile Disk).

Furthermore, the program executed by the image processing apparatus 1 of the present embodiment may be stored on a computer connected to a network such as the Internet and provided by being downloaded via the network. The program may also be provided or distributed via a network such as the Internet.

The program executed by the image processing apparatus 1 of the present embodiment has a module configuration that includes the units described above (the page number counter, page determination unit, OCR unit, representative character string extraction unit, related information generation unit, and data editing unit). In terms of actual hardware, the CPU (processor) reads the program from the ROM and executes it, whereby the above units are loaded onto, and generated on, the main storage device.

The image processing apparatus may be an image forming apparatus such as a copier, a printer, a scanner, or a facsimile machine.

As shown in FIG. 15, an image processing system 2 equivalent to the image processing apparatus described above may be realized by a plurality of apparatuses 201 to 203 connected via a network 200.

DESCRIPTION OF SYMBOLS
1 Image processing apparatus
100 Reading unit
101 Page read data storage unit
102 Output unit
103 Page number counter
104 Page determination unit
105 OCR unit
106 Representative character string extraction unit
107 Representative character string storage unit
108 Related information generation unit
109 Data editing unit

Japanese Patent No. 4298287

Claims

An image processing apparatus comprising:
a reading unit that reads, page by page, a document including document pages and a plurality of partition sheet pages that separate the document pages;
a page determination unit that determines, based on the page-by-page page read data read by the reading unit, whether the target page corresponding to the page read data is a document page or a partition sheet page;
a related information generation unit that, for each of a plurality of document page groups which consist of pages determined by the page determination unit to be document pages and which are separated by the plurality of partition sheet pages, extracts a part of the information included in each item of page read data of the document pages included in the document page group and generates related information about the document page group;
a data editing unit that, for each item of page read data corresponding to a page determined by the page determination unit to be a partition sheet page, adds the related information of the document page group immediately preceding the partition sheet page to the page read data corresponding to the partition sheet page, thereby obtaining edited page read data; and
an output unit that outputs document read data corresponding to the document, in which the page read data corresponding to each of the plurality of partition sheet pages has been replaced with the corresponding edited page read data.

The image processing apparatus according to claim 1, further comprising a counting unit that counts the number of pages of the page read data obtained by the reading unit,
wherein the related information generation unit generates the related information indicating the number of pages of each of the plurality of document page groups.

The image processing apparatus according to claim 1, further comprising a character recognition processing unit that performs character recognition processing on the page read data corresponding to the pages determined by the page determination unit to be document pages,
wherein the related information generation unit extracts, based on a result of the character recognition processing, a representative character string representing the contents of the document pages and generates the related information indicating the extracted representative character string.

The image processing apparatus according to any one of claims 1 to 3, wherein the data editing unit further adds, to the page read data of each of the plurality of partition sheet pages, a partition image indicating that the page is a partition sheet page, thereby obtaining the edited page read data.

The image processing apparatus according to claim 1, wherein the data editing unit changes the page position, in the document read data of the document, of the edited page read data for each of the plurality of partition sheet pages to the page position immediately before the page read data of the document page group immediately preceding that partition sheet page.

The image processing apparatus according to claim 1, wherein the data editing unit further adds, to the page read data corresponding to each of the plurality of partition sheet pages, a serial number among the plurality of partition sheet pages included in the document, thereby obtaining the edited page read data.

The image processing apparatus according to any one of claims 1 to 6, wherein the output unit is an image forming unit that forms images on printing paper, and
when forming images of the document read data on the front and back sides of the printing paper, the image forming unit ejects the printing paper without forming an image on the side of the printing paper immediately before or immediately after each of the plurality of partition sheet pages, thereby forming the image of the edited page read data corresponding to each of the plurality of partition sheet pages on the front side of a sheet of printing paper.

An image processing method executed by an image processing apparatus, the method comprising:
a reading step of reading, page by page, a document including document pages and a plurality of partition sheet pages that separate the document pages;
a page determination step of determining, based on the page-by-page page read data read in the reading step, whether the target page corresponding to the page read data is a document page or a partition sheet page;
a related information generation step of, for each of a plurality of document page groups which consist of pages determined in the page determination step to be document pages and which are separated by the plurality of partition sheet pages, extracting a part of the information included in each item of page read data of the document pages included in the document page group and generating related information about the document page group;
a data editing step of, for each item of page read data corresponding to a page determined in the page determination step to be a partition sheet page, adding the related information of the document page group immediately preceding the partition sheet page to the page read data corresponding to the partition sheet page, thereby obtaining edited page read data; and
an output step of outputting document read data corresponding to the document, in which the page read data corresponding to each of the plurality of partition sheet pages has been replaced with the corresponding edited page read data.

A program for causing a computer to function as:
a reading unit that reads, page by page, a document including document pages and a plurality of partition sheet pages that separate the document pages;
a page determination unit that determines, based on the page-by-page page read data read by the reading unit, whether the target page corresponding to the page read data is a document page or a partition sheet page;
a related information generation unit that, for each of a plurality of document page groups which consist of pages determined by the page determination unit to be document pages and which are separated by the plurality of partition sheet pages, extracts a part of the information included in each item of page read data of the document pages included in the document page group and generates related information about the document page group;
a data editing unit that, for each item of page read data corresponding to a page determined by the page determination unit to be a partition sheet page, adds the related information of the document page group immediately preceding the partition sheet page to the page read data corresponding to the partition sheet page, thereby obtaining edited page read data; and
an output unit that outputs document read data corresponding to the document, in which the page read data corresponding to each of the plurality of partition sheet pages has been replaced with the corresponding edited page read data.