
JP2010092141A - Image processing system, image reader, image processor, and image processing program

Info

Publication number: JP2010092141A
Application number: JP2008259305A
Authority: JP
Inventors: Toshihiro Mori; 俊浩森
Original assignee: Konica Minolta Business Technologies Inc
Current assignee: Konica Minolta Business Technologies Inc
Priority date: 2008-10-06
Filing date: 2008-10-06
Publication date: 2010-04-22

Abstract

PROBLEM TO BE SOLVED: To improve the operability of editing operations and to make editing work efficient in cases such as post-editing text portions included in an input image.

SOLUTION: The image reader 1 included in an image processing system discriminates text regions in an input image from a background region containing non-text objects, generates outline data by applying an outline conversion to text images contained in the text regions and grouping the text images character by character, and generates an editable data file DF by combining the outline data with the background data. The server computer 3 stores the data file DF and, in response to a request from a client computer 4, transmits the data file DF to the client computer 4. At that time, the server computer 3 reads the data file DF and changes the grouping of the outline data included in the data file DF to a specified grouping.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a technique for converting character images contained in an input image into electronically reusable data.

In recent years, the digitization of information has advanced, and techniques for storing or transmitting documents that mix text, figures, photographs, and the like as electronic information rather than on paper have become widespread. Information processing machines known as multifunction peripherals (MFPs) can now also read a color original and generate a color image. However, when an A4-size color original is read at 300 dpi, for example, the resulting color image occupies about 25 MB, which is too large for the image to be reused electronically. Color images obtained by reading color originals are therefore generally compressed. If the whole color image is compressed uniformly, however, compression strong enough to make the data sufficiently small renders the characters in the original illegible. Conversely, compression mild enough to keep the characters legible cannot make the data sufficiently small.
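The figure of about 25 MB can be checked with a short calculation. The sketch below is only an illustration: the A4 dimensions (210 mm x 297 mm) and the pixel format (24-bit RGB, no padding or file headers) are assumptions that the source does not state.

```python
# Rough size of an uncompressed A4 colour scan at 300 dpi.
# Assumptions (not stated in the source): A4 = 210 mm x 297 mm,
# 24-bit RGB (3 bytes per pixel), no row padding or file headers.

MM_PER_INCH = 25.4


def raw_scan_bytes(width_mm: float, height_mm: float, dpi: int,
                   bytes_per_pixel: int = 3) -> int:
    """Return the uncompressed raster size in bytes."""
    width_px = round(width_mm / MM_PER_INCH * dpi)    # 2480 for A4 at 300 dpi
    height_px = round(height_mm / MM_PER_INCH * dpi)  # 3508 for A4 at 300 dpi
    return width_px * height_px * bytes_per_pixel


size = raw_scan_bytes(210, 297, 300)
print(f"{size} bytes = {size / 2**20:.1f} MiB")
```

Under these assumptions the raster is 2480 x 3508 pixels, a little over 26 million bytes, which is consistent with the "about 25 MB" quoted in the text.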

For this reason, it has been proposed to discriminate between the character areas and the background area of an input image obtained by reading an original, and to process the two differently. For example, a known technique generates multi-layer electronic data by converting the foreground image contained in the input image into vector data, storing that vector data in a first layer, and storing the background image of the input image in a second layer (see, for example, Patent Document 1). With this technique, the character images in the input image are converted into outlined vector data, so the characters remain legible and the characters and text contained in the electronic data can be reused electronically.

Image reading apparatuses such as scanners have also been able to create a so-called outline PDF when reading an original and creating a PDF (Portable Document Format) file. The outline PDF function applies outline conversion (vector data conversion) to the character areas of the input image obtained by reading the original, and creates a PDF file by combining the resulting outline data (vector data) with the background area of the input image. In an outline PDF, outline conversion is performed character by character on the characters contained in the character areas.

Document management programs that run on a computer are also known. Such a program is software for managing document files such as PDF files; running it on a computer allows document files to be edited, for example. Thus, if image data obtained by reading an original with a scanner or the like is input to a computer and the document management program is started on that computer, the arrangement of the characters contained in the input image can be changed.

Patent Document 1: JP 2007-272601 A

In the conventional techniques, however, the characters contained in the character areas of the input image are outline-converted character by character. Consequently, when a data file such as a PDF file input from a scanner is edited on a computer in order to, for example, rearrange the characters in a character area line by line or paragraph by paragraph, the user must either move the characters one at a time or first select multiple characters and then move them. Editing is therefore laborious and cannot be performed efficiently.

The present invention has been made to solve the above problem. It provides an image processing system, an image reading apparatus, an image processing apparatus, and an image processing program that improve operability when characters and the like contained in an input image are edited using a computer or the like, so that editing can be performed efficiently.

To achieve the above object, the invention according to claim 1 is an image processing system comprising: area determination means for discriminating between a character area contained in an input image and a background area other than the character area; outline conversion means for performing outline conversion on the character images contained in the character area determined by the area determination means and generating outline data grouped in character units; background data generation means for generating background data from a background image consisting of the background area determined by the area determination means; file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means with the background data generated by the background data generation means; storage means for storing the data file generated by the file generation means; instruction means for instructing a change in the grouping of the outline data contained in the data file stored in the storage means; grouping change means for reading the data file from the storage means and changing the grouping of the outline data contained in the data file based on the instruction from the instruction means; and file output means for outputting a data file containing the outline data whose grouping has been changed by the grouping change means.

With this configuration, when the image processing system outputs a data file, the character-unit grouping of the outline data is changed to grouping in another unit, so editing with the data file output from the image processing system no longer has to be performed character by character.

The invention according to claim 2 is the image processing system according to claim 1, further comprising file input means for inputting an edited version of the data file output by the file output means, wherein the grouping change means returns the grouping of the outline data contained in the edited data file input to the file input means to character-unit grouping and then stores the data file in the storage means.

With this configuration, the image processing system manages the edited data file with its outline data returned to character-unit grouping, which improves convenience when multiple users share the data file.
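Returning an edited file to character-unit grouping amounts to flattening whatever groups the editor used back into one group per character. The following is a minimal sketch of that idea only; the `Glyph` and `DataFile` types and the `restore_character_grouping` function are hypothetical names introduced for illustration, and the source does not specify how the data file is actually laid out.

```python
from dataclasses import dataclass, field


@dataclass
class Glyph:
    """One outlined character (hypothetical representation)."""
    char_id: int
    path: str  # placeholder for the vector outline of the character


@dataclass
class DataFile:
    """Hypothetical data file: grouped outline data plus background data."""
    groups: list[list[Glyph]] = field(default_factory=list)
    background: bytes = b""


def restore_character_grouping(edited: DataFile) -> DataFile:
    """Return a copy in which every group holds exactly one character."""
    glyphs = [glyph for group in edited.groups for glyph in group]
    return DataFile(groups=[[glyph] for glyph in glyphs],
                    background=edited.background)


# Usage: a file edited with line-unit groups goes back to one group per character.
edited = DataFile(groups=[[Glyph(0, "M0"), Glyph(1, "M1")], [Glyph(2, "M2")]])
stored = restore_character_grouping(edited)
print([len(group) for group in stored.groups])
```

Storing every file in the same character-unit form means each later request can be regrouped to whatever unit that user asks for, independent of how the previous editor grouped it.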

The invention according to claim 3 is the image processing system according to claim 1 or 2, further comprising character structure determination means for determining the character structure contained in the character area determined by the area determination means and generating character structure determination information, wherein the file generation means generates the data file by integrating the outline data, the background data, and the character structure determination information generated by the character structure determination means, and the grouping change means changes the grouping of the outline data from character-unit grouping to grouping in another unit based on the character structure determination information.

With this configuration, the grouping change means can change the grouping of the outline data based on the character structure determination information, so the grouping can be changed efficiently.

The invention according to claim 4 is the image processing system according to claim 3, wherein the character structure determination means determines the characters contained in the character area in character units and also in line units, paragraph units, or page units, and the character structure determination information includes information on lines, paragraphs, or pages.

With this configuration, the grouping change means can efficiently change the grouping of the outline data to line units, paragraph units, or page units.
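One way to picture the regrouping is to treat the character structure determination information as a per-character record of the page, paragraph, and line the character belongs to, and to group characters that share the same key. This is a sketch under that assumption; the `CharInfo` record and the `regroup` function are hypothetical and are not taken from the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharInfo:
    """Hypothetical structure record for one outlined character."""
    char_id: int
    page: int
    paragraph: int
    line: int


# Grouping key for each supported unit.
_KEYS = {
    "character": lambda c: (c.page, c.paragraph, c.line, c.char_id),
    "line": lambda c: (c.page, c.paragraph, c.line),
    "paragraph": lambda c: (c.page, c.paragraph),
    "page": lambda c: (c.page,),
}


def regroup(chars: list[CharInfo], unit: str) -> list[list[int]]:
    """Group character ids by the requested unit, keeping reading order."""
    key = _KEYS[unit]
    groups: dict[tuple, list[int]] = {}
    for char in chars:  # dicts preserve insertion order
        groups.setdefault(key(char), []).append(char.char_id)
    return list(groups.values())


chars = [
    CharInfo(0, page=1, paragraph=1, line=1),
    CharInfo(1, page=1, paragraph=1, line=1),
    CharInfo(2, page=1, paragraph=1, line=2),
    CharInfo(3, page=1, paragraph=2, line=3),
    CharInfo(4, page=2, paragraph=3, line=4),
]
print(regroup(chars, "line"))       # one group per line
print(regroup(chars, "paragraph"))  # one group per paragraph
```

Because the structure record travels with the file, the regrouping is a single pass over the characters and needs no fresh layout analysis at request time.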

The invention according to claim 5 is an image reading apparatus comprising: image reading means for reading an original and generating an input image; area determination means for discriminating between a character area contained in the input image input to the image input means and a background area other than the character area; character structure determination means for determining the character structure contained in the character area determined by the area determination means and generating character structure determination information; outline conversion means for performing outline conversion on the character images contained in the character area determined by the area determination means and generating outline data grouped in character units; background data generation means for generating background data from a background image consisting of the background area determined by the area determination means; and file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means, the background data generated by the background data generation means, and the character structure determination information generated by the character structure determination means.

With this configuration, referring to the character structure determination information contained in the data file makes it possible to change the grouping of the outline data from character units to another unit when the data file is used.

The invention according to claim 6 is an image processing apparatus comprising: storage means for storing an editable data file in which outline data grouped in character units and background data are integrated; instruction means for instructing a change in the grouping of the outline data contained in the data file stored in the storage means; grouping change means for reading the data file from the storage means and changing the grouping of the outline data contained in the data file based on the instruction from the instruction means; and file output means for outputting a data file containing the outline data whose grouping has been changed by the grouping change means.

With this configuration, when the image processing apparatus outputs a data file, the character-unit grouping of the outline data is changed to grouping in another unit, so editing with the data file output from the image processing apparatus no longer has to be performed character by character.

The invention according to claim 7 is an image processing program that causes a computer to function as: area determination means for discriminating between a character area contained in an input image and a background area other than the character area; character structure determination means for determining the character structure contained in the character area determined by the area determination means and generating character structure determination information; outline conversion means for performing outline conversion on the character images contained in the character area determined by the area determination means and generating outline data grouped in character units; background data generation means for generating background data from a background image consisting of the background area determined by the area determination means; and file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means, the background data generated by the background data generation means, and the character structure determination information generated by the character structure determination means.

The invention according to claim 8 is an image processing program that causes a computer to function as: means for reading a data file from predetermined storage means that stores an editable data file in which outline data grouped in character units and background data are integrated; instruction means for instructing a change in the grouping of the outline data contained in the data file; grouping change means for changing the grouping of the outline data contained in the data file based on the instruction from the instruction means; and file output means for outputting a data file containing the outline data whose grouping has been changed by the grouping change means.

According to the present invention, when a data file that integrates background data with outline data grouped in character units is used, the character-unit grouping of the outline data is changed to grouping in another unit, so editing with this data file no longer has to be performed character by character. Operability is therefore improved, particularly when the character portions of an input image are edited, and editing can be performed efficiently.

Preferred embodiments of the present invention are described in detail below with reference to the drawings. Members common to the drawings are given the same reference numerals, and redundant descriptions of them are omitted.

FIG. 1 shows a configuration example of the image processing system according to this embodiment. In this image processing system, an image reading apparatus 1, a server computer 3, and a client computer 4 are connected through a network 2 so that they can communicate with one another. FIG. 2 shows the hardware configuration of each part of the image processing system.

As shown in FIG. 1, the image reading apparatus 1 is configured as an information processing machine of the kind called a multifunction peripheral (MFP), and has multiple functions such as a copy function, a scanner function, a FAX function, and a printer function. The image reading apparatus 1 includes an image reading unit 10 that reads an original and generates an input image, and an automatic document feeder 10a that conveys multiple originals one sheet at a time is provided on top of the image reading unit 10. The image reading unit 10 reads originals in synchronization with the conveying operation of the automatic document feeder 10a, so that multiple originals set in the automatic document feeder 10a can be read automatically one sheet at a time. The automatic document feeder 10a is not essential, however. An operation panel 14 that the user can operate is provided on the front of the image reading apparatus 1; when, for example, a start button on the operation panel 14 is pressed, the image reading apparatus 1 starts reading the original. A paper feed unit 15 and an image forming unit 16 are provided in the lower part of the image reading apparatus 1. When the copy function, the printer function, or the like is used, the paper feed unit 15 supplies paper to the image forming unit 16 one sheet at a time, and the image forming unit 16 forms an image on the paper and outputs it. Although the illustrated example shows the image reading apparatus 1 as an information processing machine with multiple functions, the image reading apparatus 1 of this embodiment only needs to have at least a scanner function that reads an original and generates an input image.

As shown in FIG. 2, the image reading apparatus 1 has a configuration in which the following are connected to a data bus 19: an input image processing unit 11 that processes the input image generated by the image reading unit 10 reading an original, a CPU 12, a storage unit 13 such as a memory or hard disk, the operation panel 14, the paper feed unit 15, the image forming unit 16, a communication control unit 17, and a network interface 18. The input image processing unit 11 discriminates between the character areas and the background area in the input image and performs outline conversion on the characters in the character areas. The CPU 12 outputs the input image read by the image reading unit 10 as a predetermined data file. For example, it applies compression suited to each of the character-area data and the background-area data, generates a data file that integrates the character areas and the background area in the manner of an outline PDF, and outputs it to the server computer 3. In this case, the data file is sent out to the network 2 through the network interface 18 and delivered to the server computer 3. The communication control unit 17 is the communication control means by which the image reading apparatus 1 sends and receives FAX data over a telephone line 9.

As shown in FIG. 2, the server computer 3 includes a CPU 30, a memory 31, a storage unit 32 such as a hard disk, and a network interface 33, all connected to a data bus 34. The storage unit 32 is storage means that accumulates the data files generated by the image reading apparatus 1 reading originals. That is, when the CPU 30 receives a data file output by the image reading apparatus 1 through the network interface 33, it saves that data file in the storage unit 32. A server image processing program is installed on the server computer 3 in advance. By executing this program, the CPU 30 manages the data files stored in the storage unit 32. In addition, when the client computer 4 requests transmission of a data file over the network 2, the CPU 30 reads the specified data file from the storage unit 32, performs processing described later, such as changing the grouping, as necessary, and then outputs the data file to the client computer 4. As shown in FIG. 1, the server image processing program is installed on the server computer 3 from a recording medium 3a such as a CD-ROM.

As shown in FIG. 2, the client computer 4 includes a CPU 40, a memory 41, a display unit 42 such as a liquid crystal display device or CRT, an operation input unit 43 such as a keyboard and mouse, and a network interface 44, all connected to a data bus 45. In other words, the client computer 4 is an ordinary personal computer (PC). A client image processing program is installed on the client computer 4 in advance. By executing this program, the CPU 40 of the client computer 4 obtains the data file to be edited from the server computer 3, displays its image, accepts the user's editing operations on that image, and executes the editing. As shown in FIG. 1, the client image processing program is installed on the client computer 4 from a recording medium 4a such as a CD-ROM.

In the above configuration, suppose the user wants to convert the character portions of an original into electronically reusable (re-editable) data. When the user sets the original on the image reading apparatus 1, operates the operation panel 14 to select, for example, "Outline PDF" from the selectable data file formats, and instructs the apparatus to start reading, the image reading apparatus 1 performs the processing shown in FIG. 3.

FIG. 3 shows the processing procedure and functional configuration used when the image reading apparatus 1 reads an original and generates a data file in a format whose character portions can be reused. As shown in FIG. 3, the image reading unit 10, the input image processing unit 11, and the CPU 12 described above operate when the image reading apparatus 1 reads an original and generates a data file for output to the server computer 3. The image reading unit 10 reads the original, generates an input image D1, and outputs the input image D1 to the input image processing unit 11. The input image D1 is raster data.

On receiving the input image D1 from the image reading unit 10, the input image processing unit 11 causes a preprocessing unit 51, an area determination unit 52, and an outline conversion unit 55 to operate in sequence. The preprocessing unit 51 performs preprocessing that allows the area determination unit 52 to carry out area determination properly; it is configured to perform, for example, resolution conversion and base removal. Resolution conversion changes the resolution of the entire input image and is performed as needed. Base removal makes the difference between the background and the base (the ground of the original) clear; for example, dots whose density is at or below a predetermined value are regarded as the base and their density is set to 0. After the preprocessing unit 51 has performed this preprocessing, the area determination unit 52 operates.
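The removal step described above (dots at or below a predetermined density are treated as the page ground and set to 0) can be sketched in a few lines. The representation is an assumption made for illustration: the image is taken to be a list of rows of integer density values where larger means darker, and the function name and threshold value are hypothetical.

```python
def remove_base(density: list[list[int]], threshold: int) -> list[list[int]]:
    """Set every dot whose density is at or below the threshold to 0.

    Assumes `density` holds one integer per dot, larger meaning darker.
    Dots above the threshold are left unchanged.
    """
    return [[0 if value <= threshold else value for value in row]
            for row in density]


page = [
    [5, 40, 200],
    [30, 31, 0],
]
print(remove_base(page, threshold=30))
```

Only the faint dots change, so characters and photographs keep their original density while slight paper tint or show-through is flattened to a uniform ground.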

The region determination unit 52 distinguishes the character regions contained in the preprocessed input image D2 from the background region, which is everything other than the character regions. The region determination unit 52 has a character region determination unit 53, which identifies the character regions in the input image D2 after background removal. A known method can be used for this identification. As one example, the character region determination unit 53 converts the RGB data of the input image D2 into lightness data to generate a grayscale image, and performs a labeling process that extracts objects while scanning the grayscale image in a predetermined direction. This labeling process extracts every character, photograph, and the like in the input image D2 as an object. The character region determination unit 53 then compares the size, density, arrangement, and so on of every extracted object against predetermined conditions, and extracts all objects that can be regarded as characters. In this process each character is extracted as a single object. The character region determination unit 53 identifies an area in which multiple such character objects are clustered as one character region. Any area not identified as a character region is identified as the background region.
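The labeling process can be illustrated with a small connected-component pass. The patent only says that a known method may be used, so the following is one possible sketch under stated assumptions: the image is already binarized, objects are 4-connected, and the size rule in `looks_like_character` (a hypothetical helper) stands in for the unspecified "predetermined conditions".

```python
from collections import deque


def label_objects(binary):
    """Return one bounding box (top, left, bottom, right) per 4-connected blob of 1s."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def looks_like_character(box, max_side=3):
    """Assumed rule: an object is a character if it is small in both directions."""
    top, left, bottom, right = box
    return (bottom - top + 1) <= max_side and (right - left + 1) <= max_side


# A small "L"-shaped blob on the left and a large block (a photograph) on the right.
image = [
    [1, 0, 0, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 1],
]
boxes = label_objects(image)
characters = [box for box in boxes if looks_like_character(box)]
```

Objects that fail the character test would end up in the background region together with photographs and diagrams.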

FIG. 4 outlines the processing performed by the region determination unit 52. Suppose, as shown in FIG. 4, that the preprocessed input image D2 contains a photograph M1 and a diagram M2 as well as character portions M3, M4, and M5, each of which is a cluster of characters. By performing the region determination described above, the region determination unit 52 generates a character region image DA consisting of the character regions and a background region image DB consisting of the background region other than the character regions. The photograph M1 and the diagram M2 contained in the background region image DB are removed from the character region image DA, and the character portions M3, M4, and M5 contained in the character region image DA are removed from the background region image DB.

As shown in FIG. 3, the character region determination unit 53 in the region determination unit 52 also functions as a character structure determination unit 54. The character structure determination unit 54 is configured to identify the characters in a character region character by character, and also line by line, paragraph by paragraph, and page by page. For example, because the character region determination unit 53 extracts every object that can be regarded as a character, all characters in the character region are identified character by character. The character structure determination unit 54 then evaluates factors such as the spacing between adjacent characters to determine whether the text is written vertically or horizontally, and identifies the arrangement of the characters by line, paragraph, and page.

FIG. 5 outlines the processing performed by the character structure determination unit 54. The character structure determination unit 54 takes, for example, each object (character) in the character region image DA extracted by the character region determination unit 53 as a processing target, and judges that the lines of text run in the direction in which objects of the same size as the target lie close to it. In the case of FIG. 5, each of lines L1 to L17 in the character region image DA is therefore identified as one line, and the characters in each line are identified. When another line made up of objects of the same size lies close by in the direction orthogonal to a line, the character structure determination unit 54 judges that those lines together form one paragraph. In the case of FIG. 5, paragraphs J1 and J2 in the character region image DA are therefore each identified as one paragraph, and the characters and lines in each paragraph are identified. The character structure determination unit 54 also judges, for example, that all characters in the input image D2 form one page. In the case of FIG. 5, page P1 in the character region image DA is therefore identified as one page, and the characters, lines, and paragraphs in that page are identified. The character structure determination unit 54 then generates the information on characters at the character, line, paragraph, and page levels identified in this way as character structure determination information DC.
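The line identification described above can be sketched for the horizontal-writing case. This is a simplified illustration only: each character is reduced to an assumed `(x, y, size)` tuple, "close" is modelled by a hypothetical `tolerance` on the y coordinate, and vertical writing and paragraph detection are left out.

```python
def group_into_lines(chars, tolerance=2):
    """Group characters into lines.

    chars: list of (x, y, size) tuples, one per character object.
    Returns a list of lines, each a list of character indices ordered by x.
    """
    lines = []
    order = sorted(range(len(chars)), key=lambda i: (chars[i][1], chars[i][0]))
    for index in order:
        _, y, size = chars[index]
        for line in lines:
            _, line_y, line_size = chars[line[-1]]
            # Same size and nearly the same height: treat as the same line.
            if abs(line_y - y) <= tolerance and line_size == size:
                line.append(index)
                break
        else:
            lines.append([index])
    for line in lines:
        line.sort(key=lambda i: chars[i][0])
    return lines


# Three characters on one row (slightly uneven) and two on the next row.
chars = [(0, 0, 10), (12, 1, 10), (24, 0, 10), (0, 20, 10), (12, 20, 10)]
lines = group_into_lines(chars)
```

The resulting index lists play the role of the line-level part of the character structure determination information DC.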

After region determination has been performed as described above, the character region determination unit 53 outputs the character region image DA to the outline conversion unit 55 and outputs the background region image DB to the CPU 12. The character structure determination information DC, which describes the characters belonging to each character, line, paragraph, and page unit identified by the character structure determination unit 54, is also output to the CPU 12. The outline conversion unit 55 then operates in the input image processing unit 11.

The outline conversion unit 55 is a processing unit that converts the characters in the character region image DA into outlines character by character. It comprises a vertex detection unit 56, a straight line approximation processing unit 57, a curve approximation processing unit 58, and a data generation unit 59, which operate in sequence to generate outline data (vector data) grouped character by character. The vertex detection unit 56 extracts vertices such as the corners and ends of characters by scanning the character region image DA in a predetermined direction and extracting the points at which the density changes. The straight line approximation processing unit 57 identifies the pairs of vertices extracted by the vertex detection unit 56 that can be joined by a straight line, and performs the straight line approximation for them. A known method may be used for this straight line approximation. The curve approximation processing unit 58 performs curve approximation so that pairs of vertices that were not approximated by a straight line are joined smoothly by a curve. A known method, such as Bezier approximation, may also be used for this curve approximation. The data generation unit 59 then generates outline data grouped character by character based on the results of the straight line approximation processing unit 57 and the curve approximation processing unit 58.
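As a rough illustration of the straight-line case, the sketch below removes contour points that lie exactly on the segment between their neighbours, so that each remaining pair of adjacent vertices is joined by one straight line. This is a plain collinearity test chosen for brevity, not the known method the patent refers to, and the function name `merge_collinear` is an assumption.

```python
def merge_collinear(points):
    """Drop points of a closed polygon that are collinear with both neighbours."""
    kept = []
    count = len(points)
    for i in range(count):
        ax, ay = points[i - 1]
        bx, by = points[i]
        cx, cy = points[(i + 1) % count]
        # The cross product of (b - a) and (c - b) is zero when a, b, c are collinear.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        if cross != 0:
            kept.append(points[i])
    return kept


# A square whose bottom edge carries one redundant midpoint.
contour = [(0, 0), (5, 0), (10, 0), (10, 10), (0, 10)]
vertices = merge_collinear(contour)
```

A practical implementation would tolerate small deviations rather than require an exact zero, and would hand the remaining curved spans to a curve fit such as a Bezier approximation.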

FIGS. 6 and 7 show an example of the outline data generated by the outline conversion unit 55. FIG. 6 shows an example of characters contained in the character region image DA, and FIG. 7 shows the outline data generated from the characters in FIG. 6. For convenience of explanation, the example shows a case in which all of the outline data can be generated by straight line approximation. As shown in FIG. 6, the character region image DA contains a character M8, "L", and a character M9, "I". When the character region image DA is scanned from the document edge in the two directions X and Y, vertices A, B, C, D, E, and F are extracted from the character M8, and vertices G, H, I, J, K, L, M, N, O, P, Q, and R are extracted from the character M9. The XY coordinates of each vertex are as shown in FIG. 6. When the outline conversion unit 55 converts each character in the character region image DA into data, the characters M8 and M9 are each represented by outline data, one character at a time, as shown in FIG. 7. In the outline data for each of the characters M8 and M9, data indicating the magnification and precision and data indicating the coordinates of a reference point are written first, followed in order by the coordinates of the vertices to be joined by straight lines starting from the reference point. Such outline data allows each of the characters M8 and M9 to be reproduced smoothly. The lowercase q shown in FIG. 7 marks the start of a group, and the uppercase Q marks the end of a group. The outline data generated by the outline conversion unit 55 is therefore data in which the characters M8 and M9 each form their own group.
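The per-character layout described above can be mimicked with a small serializer. FIG. 7 is not reproduced in the text, so the exact notation is unknown; the `m` (move to reference point) and `l` (line to vertex) operators, the `scale` line, and the vertex coordinates below are all assumptions made for illustration. Only the q and Q group markers come from the description.

```python
def outline_group(vertices, scale=1.0):
    """Serialize one character outline as a single q ... Q group.

    vertices: list of (x, y); the first entry is used as the reference point.
    """
    (x0, y0), rest = vertices[0], vertices[1:]
    lines = ["q", f"{scale:g} scale", f"{x0} {y0} m"]   # start of group, header data
    lines += [f"{x} {y} l" for x, y in rest]            # straight lines to each vertex
    lines.append("Q")                                    # end of group
    return "\n".join(lines)


# Hypothetical coordinates for the six vertices of an "L" shape.
letter_l = [(0, 0), (0, 50), (10, 50), (10, 10), (30, 10), (30, 0)]
text = outline_group(letter_l)
```

Emitting one such block per character yields data in which every character is its own group, matching the default "character unit" grouping.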

Returning to FIG. 3, once the outline data grouped character by character has been generated as described above, the outline conversion unit 55 outputs that outline data to the CPU 12.

Next, the CPU 12 will be described. When generating the data file DF to be output to the server computer 3, the CPU 12 functions as a resolution reduction unit 61, a lossy compression unit 62, a lossless compression unit 63, and a file generation unit 64. The resolution reduction unit 61 is a processing unit that receives the background region image DB output from the input image processing unit 11 and converts its resolution to a predetermined resolution to generate low-resolution background data. Lowering the resolution of the background region image DB reduces the amount of background data. The resolution reduction unit 61 is not essential, however; if it is not provided, the background region image DB is used as the background data as it is. The background data is passed to the lossy compression unit 62, which applies lossy compression such as JPEG compression. In other words, this embodiment gives priority to reducing the amount of data over preserving the image quality of the background data, so the background data is subjected to resolution reduction and lossy, high-ratio compression. The compressed background data is output to the file generation unit 64.

The lossless compression unit 63, on the other hand, is a processing unit that receives the outline data grouped character by character from the input image processing unit 11 and compresses it. Because the outline data must be reproduced exactly later, a lossless compression process such as FLATE compression is used here. The compressed data is output to the file generation unit 64.
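The point of choosing a lossless method can be shown with Python's standard `zlib` module, which implements the DEFLATE algorithm that FLATE compression refers to. The sample outline text is made up for the illustration.

```python
import zlib

# Made-up outline text, repeated to imitate many similar character outlines.
outline = b"q\n0 0 m\n0 50 l\n10 50 l\n10 10 l\n30 10 l\n30 0 l\nQ\n" * 20

compressed = zlib.compress(outline)      # DEFLATE-based, lossless
restored = zlib.decompress(compressed)   # byte-for-byte identical to the input
```

Because the round trip is exact, the q and Q markers and every coordinate survive compression, which the later regrouping step depends on. JPEG compression of the background offers no such guarantee, which is acceptable there because the background is not edited at the vector level.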

The file generation unit 64 combines the compressed background data output from the lossy compression unit 62, the compressed outline data output from the lossless compression unit 63, and the character structure determination information DC output from the character structure determination unit 54 of the input image processing unit 11 to generate an editable data file DF. For example, the data file DF contains entity data, which is the target of viewing and editing, and additional information such as header information and comment information. The entity data has a hierarchical structure consisting of first-layer entity data and second-layer entity data. The outline data is stored in the first-layer entity data, and the background data is stored in the second-layer entity data. The character structure determination information DC is stored in the additional information, such as the header information or comment information. The data file DF generated in this way is output from the image reading device 1 to the server computer 3, and the server computer 3 saves the data file DF in the storage unit 32.

Next, the processing performed when the data file DF output from the image reading device 1 as described above is reused on the server computer 3 and the client computer 4 will be described. By executing their respective image processing programs, the server computer 3 and the client computer 4 function as image processing devices for viewing and editing an image based on the data file DF that the image reading device 1 generated by reading a document.

FIG. 8 is a block diagram showing the processing units realized when the CPU 30 of the server computer 3 executes the server image processing program and the processing units realized when the CPU 40 of the client computer 4 executes the client image processing program. The CPU 30 of the server computer 3 functions as an input/output processing unit 71, a file management unit 72, and a grouping change unit 74. The input/output processing unit 71 sends and receives the data file DF and other data via the network 2. The file management unit 72 reads the data file DF stored in the storage unit 32 and saves in the storage unit 32 the data file DF that the input/output processing unit 71 has received via the network 2. The file management unit 72 also functions as a grouping instruction unit 73. The grouping instruction unit 73 is an instruction means that instructs the grouping change unit 74 to change the grouping when a file transmission request from the client computer 4 includes a grouping designation. The grouping change unit 74 is a processing unit that changes the grouping of the outline data contained in the data file DF in accordance with the grouping change instruction from the grouping instruction unit 73.

The CPU 40 of the client computer 4, on the other hand, functions as a file designation unit 81, a grouping designation unit 82, a transmission request processing unit 83, a file editing processing unit 84, and a save processing unit 85. The file designation unit 81 is a processing unit that designates the file to be viewed and edited. When the file designation unit 81 operates, it accesses the server computer 3 via the network 2 and obtains list information for the data files DF stored in the storage unit 32 of the server computer 3. It displays this list information on the display unit 42 so that the user can select one file from it to view and edit. When the user designates a file, the file designation unit 81 temporarily stores the designated file name and related information in the memory 41.

The grouping designation unit 82 is a processing unit that designates the grouping of outline data to be used when the data file DF is obtained from the server computer 3. For example, when the client image processing program is started on the client computer 4, an initial screen 42a such as that shown in FIG. 9 is displayed on the display unit 42. While the initial screen 42a is displayed, if the user operates the operation input unit 43 to move the pointer image 90 to the menu bar 91 and selects the "Settings" item there, a pull-down menu 92 is displayed. Selecting the "Grouping" item from the pull-down menu 92 displays a further list menu 93, which lists the groupings the user can choose. When the user selects the desired grouping from this list, the grouping designation unit 82 designates the selected grouping. The illustrated example shows the state in which the "line unit" grouping is selected. If the user does not perform this grouping designation operation, the grouping of outline data used when the data file DF is obtained from the server computer 3 remains at the default setting, "character unit".

Returning to FIG. 8, the transmission request processing unit 83 is a processing unit that requests the server computer 3 to transmit the data file DF. The transmission request processing unit 83 reads the file name that the file designation unit 81 stored in the memory 41 and designates it as the file name of the data file DF. If a grouping has been designated by the grouping designation unit 82, it includes information on that grouping when generating the transmission request command, and sends the command to the server computer 3.

In the server computer 3, when the input/output processing unit 71 receives the transmission request command, the file management unit 72 operates and analyzes it. The file management unit 72 identifies the data file DF stored in the storage unit 32 from the file name contained in the transmission request command. If the transmission request command contains grouping information, the grouping instruction unit 73 operates, identifies the grouping designated by the client computer 4, and instructs the grouping change unit 74 to change the grouping. The grouping change unit 74 then operates and changes the grouping of the outline data contained in the data file DF read from the storage unit 32 in accordance with the instruction from the grouping instruction unit 73. For example, if the client computer 4 designated "line unit" grouping as shown in FIG. 9, the grouping change unit 74 changes the outline data contained in the data file DF from "character unit" grouping to "line unit" grouping.
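The server-side handling of a transmission request can be sketched as follows. The patent does not define the format of the transmission request command, so the dictionary keys `file_name` and `grouping`, the grouping names, and the function `resolve_request` are all hypothetical; the only behaviour taken from the description is that the file is located by name and that a request without a grouping designation leaves the character-unit default in place.

```python
VALID_GROUPINGS = ("character", "line", "paragraph", "page")


def resolve_request(command, stored_files):
    """Return (file contents, grouping to apply) for one transmission request."""
    name = command["file_name"]
    if name not in stored_files:
        raise KeyError(f"no such data file: {name}")
    # No designation in the request means the default character-unit grouping.
    grouping = command.get("grouping", "character")
    if grouping not in VALID_GROUPINGS:
        raise ValueError(f"unknown grouping: {grouping}")
    return stored_files[name], grouping


stored = {"scan001": "outline data placeholder"}
with_designation = resolve_request({"file_name": "scan001", "grouping": "line"}, stored)
without_designation = resolve_request({"file_name": "scan001"}, stored)
```

Only when the resolved grouping differs from "character" would the grouping change step need to rewrite the outline data before the file is sent.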

The specific processing performed by the grouping change unit 74 is as follows. As described above, the data file DF contains the character structure determination information DC as additional information, and the grouping change unit 74 changes the grouping based on this information. Because the character structure determination information DC contains information on characters at the character, line, paragraph, and page levels, when a change to "line unit" grouping is instructed, for example, the grouping change unit 74 refers to the character structure determination information DC to determine which line each character of the outline data belongs to, and changes the grouping of the outline data from "character unit" to "line unit". As stored in the storage unit 32, the outline data contained in the data file DF is grouped character by character as shown in FIG. 7, with a description marking the start and end of a group (lowercase q and uppercase Q) attached to the beginning and end of the outline data for each character. The grouping change unit 74 therefore changes the grouping by moving these start and end markers (lowercase q and uppercase Q), based on the character structure determination information DC, so that they match the grouping instructed by the grouping instruction unit 73. For example, to change the two characters M8 and M9 shown in FIG. 7 into a single group, the grouping change unit 74 rewrites the outline data from the description shown in FIG. 7 to the description shown in FIG. 10, producing one group that contains both characters M8 and M9. The same applies when the grouping is changed to other units such as paragraphs or pages: the change is made by moving the start and end markers (lowercase q and uppercase Q) attached to the beginning and end of the outline data for each character.
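The marker-moving operation can be sketched as below. The sketch assumes that each character's outline is available as a text fragment without its own markers, that the characters are stored in reading order (so members of the same line, paragraph, or page are contiguous), and that the character structure determination information DC can be read as one dictionary per character; the function `regroup` and these data shapes are illustrative assumptions rather than the format used in FIG. 7 or FIG. 10.

```python
def regroup(char_bodies, structure, unit):
    """Wrap character outlines in q ... Q groups at the requested unit.

    char_bodies: per-character outline text, without group markers.
    structure: per-character dicts holding 'line', 'paragraph', and 'page' ids.
    unit: 'character', 'line', 'paragraph', or 'page'.
    """
    out = []
    previous_key = object()  # sentinel that never equals a real key
    for index, body in enumerate(char_bodies):
        key = index if unit == "character" else structure[index][unit]
        if key != previous_key:
            if out:
                out.append("Q")   # close the previous group
            out.append("q")       # open a new group
            previous_key = key
        out.append(body)
    if out:
        out.append("Q")
    return "\n".join(out)


bodies = ["<outline of L>", "<outline of I>", "<outline of X>"]
structure = [
    {"line": 1, "paragraph": 1, "page": 1},
    {"line": 1, "paragraph": 1, "page": 1},
    {"line": 2, "paragraph": 1, "page": 1},
]
by_character = regroup(bodies, structure, "character")
by_line = regroup(bodies, structure, "line")
by_paragraph = regroup(bodies, structure, "paragraph")
```

Calling the same function with `unit="character"` restores the default grouping, which corresponds to what the server does before saving an edited file.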

When the grouping change unit 74 has finished changing the grouping of the outline data in accordance with the instruction from the grouping instruction unit 73, it outputs the data file DF reflecting the changed grouping to the file management unit 72. The file management unit 72 outputs this data file DF to the input/output processing unit 71, and the input/output processing unit 71 transmits it to the client computer 4.

When the client computer 4 receives the data file DF with the changed grouping from the server computer 3, it reads the outline data and the background data contained in the entity data of the data file DF and displays an image combining them on the display unit 42. The user can then view the contents of the data file DF and edit the displayed image by operating the operation input unit 43. In the CPU 40, the file editing processing unit 84 operates and accepts the user's editing operations. If the user wants to move several characters as a line, for example, the grouping change described above has already made the characters of each line into one group, so the user can move all the characters in a line at once simply by selecting and moving that line. In other words, a user who wants to edit line by line can designate "line unit" grouping in advance, and can then edit the character portion of the combined character and background image displayed on the display unit 42 in line units. Similarly, a user who wants to edit paragraph by paragraph can designate "paragraph unit" grouping in advance to enable editing in paragraph units, and a user who wants to edit page by page can designate "page unit" grouping in advance to enable editing in page units. In this embodiment, therefore, there is no need to move characters one at a time or to select multiple characters beforehand as in the prior art, and editing work can be carried out efficiently.

When the user finishes editing and wants to save the edited contents to the server computer 3, the save processing unit 85 operates in the CPU 40. The save processing unit 85 rewrites the data file DF so that it reflects the user's edits, and transmits the rewritten data file DF to the server computer 3.

When the input/output processing unit 71 of the server computer 3 receives the edited and updated data file DF, it outputs the file to the file management unit 72. On receiving the edited data file DF, the file management unit 72 activates the grouping instruction unit 73, which instructs that the grouping of the outline data contained in the data file DF be returned to "character unit". The grouping change unit 74 accordingly returns the grouping of the outline data in the edited data file DF to character-unit grouping. The file management unit 72 then overwrites the stored file in the storage unit 32 with the edited data file DF, which now contains outline data in character-unit grouping. Because the outline data in every data file DF managed by the server computer 3 is thus unified to character-unit grouping, character-unit grouping can serve as a default common to all users when multiple users access and reuse the data file DF via the network 2. This removes the need for complicated file management and allows the files to be managed efficiently.

As described above, in the image processing system of this embodiment, the image reading apparatus 1 distinguishes the character areas included in the input image from the background area other than the character areas, performs outline conversion of the character images included in the character areas, and generates outline data grouped in character units. It also generates background data from the background image consisting of the background area. It then integrates the outline data grouped in character units with the background data to generate and output an editable data file DF. The data file DF includes character structure determination information DC, which is generated by determining the character structure included in the character areas. The server computer 3, for its part, is configured to store the data file DF generated by the image reading apparatus 1 in the storage unit 32, to change the grouping of the outline data included in the stored data file DF, and to output a data file DF containing the regrouped outline data. Consequently, a user of the data file DF output from the server computer 3 can work not only with outline data grouped in character units but also with outline data grouped in various other units, such as lines or paragraphs, and can obtain outline data grouped in the unit best suited to the intended editing operation. This makes the user's editing operations easier and allows editing work to be carried out efficiently.
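As an illustration of how character-unit outline data might be regrouped into a coarser unit, the following sketch assumes that the character structure determination information DC can be modeled as a mapping from each character's identifier to its page, paragraph, and line numbers. That layout and the function name `regroup` are assumptions made for this example, not the format defined in the specification.

```python
from itertools import groupby
from typing import Dict, List

# Units from coarsest to finest; a finer unit is only unique within
# the coarser units that contain it.
HIERARCHY = ["page", "paragraph", "line"]


def regroup(char_ids: List[int], dc: Dict[int, Dict[str, int]], unit: str) -> List[List[int]]:
    """Group character ids (given in reading order) by the requested unit."""
    if unit == "character":
        return [[c] for c in char_ids]
    depth = HIERARCHY.index(unit) + 1

    def key(c: int):
        # For example, a line is identified by (page, paragraph, line).
        return tuple(dc[c][k] for k in HIERARCHY[:depth])

    return [list(g) for _, g in groupby(char_ids, key=key)]


dc = {
    0: {"page": 1, "paragraph": 1, "line": 1},
    1: {"page": 1, "paragraph": 1, "line": 1},
    2: {"page": 1, "paragraph": 1, "line": 2},
    3: {"page": 1, "paragraph": 2, "line": 1},
}
ids = [0, 1, 2, 3]
print(regroup(ids, dc, "line"))       # [[0, 1], [2], [3]]
print(regroup(ids, dc, "paragraph"))  # [[0, 1, 2], [3]]
```

Because `groupby` merges only consecutive items, the sketch relies on the characters being supplied in reading order.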

An embodiment of the present invention has been described above, but the present invention is not limited to what is described above. Various modifications are applicable to the present invention.

For example, the embodiment above describes changing the grouping in units of characters, lines, paragraphs, or pages, but the grouping can be changed in various other ways. For instance, if the character structure determination unit 54 of the area determination unit 52 distinguishes vertically written text from horizontally written text and includes that information in the character structure determination information DC, the vertically written and horizontally written portions can be grouped separately when the grouping is changed. Likewise, if the character structure determination unit 54 determines the color, size, and type of each character (hiragana, katakana, kanji, numerals, and so on) as well as the font type, and includes that information in the character structure determination information DC, characters can be grouped by identical color, identical size, identical character type, or identical font type when the grouping is changed. Furthermore, when figures, graphs, tables, and the like contain characters, identifying those characters as well allows them to be included in the grouping. Making grouping available in these various forms lets the user select the grouping appropriate to the intended use of the data file DF, which further improves operability during editing.
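Attribute-based grouping of this kind can be sketched as follows, on the assumption that the character structure determination information DC holds one record per character with fields such as color, size, and writing direction. The record layout and the function name `group_by_attribute` are hypothetical.

```python
from typing import Dict, Hashable, List


def group_by_attribute(dc: List[dict], attribute: str) -> Dict[Hashable, List[int]]:
    """Collect character indices that share the same value of one attribute."""
    groups: Dict[Hashable, List[int]] = {}
    for char_id, attrs in enumerate(dc):
        groups.setdefault(attrs[attribute], []).append(char_id)
    return groups


# Hypothetical per-character records, indexed by character id.
dc = [
    {"color": "black", "size": 10, "direction": "horizontal"},
    {"color": "red", "size": 10, "direction": "horizontal"},
    {"color": "black", "size": 14, "direction": "vertical"},
]
print(group_by_attribute(dc, "color"))  # {'black': [0, 2], 'red': [1]}
```

Unlike line or paragraph grouping, the members of an attribute group need not be adjacent in the document, so the sketch collects them in a dictionary instead of merging consecutive runs.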

The embodiment above also illustrates the case where the server computer 3 functions as the image processing apparatus that changes the grouping of the outline data. However, the grouping does not necessarily have to be changed on the server computer 3; it may instead be changed on the client computer 4. In particular, in the present invention the image processing apparatus may be constituted by a single computer that integrates the functions of the server computer 3 and the functions of the client computer 4 described above.

The embodiment above further illustrates the case where the server computer 3 changes the grouping of the outline data on the basis of the character structure determination information DC included in the data file DF. Even when the data file DF does not include the character structure determination information DC, however, the server computer 3 can obtain the information needed to change the grouping by performing the same processing as the character structure determination unit 54 described above. The server computer 3 may therefore change the grouping of the outline data by performing the same processing as the character structure determination unit 54 of the image reading apparatus 1. In that case, however, changing the grouping takes longer, so the configuration of the embodiment above, in which the grouping is changed on the basis of the character structure determination information DC included in the data file DF, is preferable.
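The trade-off between the two paths can be sketched as below. The fast path reads line numbers from a stored DC, and the fallback re-derives lines from character bounding boxes. The bounding-box heuristic, which assumes horizontal writing and merges characters whose vertical extents overlap, is a stand-in chosen for illustration and is not the processing of the character structure determination unit 54. The dictionary keys and function names are likewise hypothetical.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int]  # (x, y_top, y_bottom) of one character


def infer_lines(boxes: List[Box]) -> List[List[int]]:
    """Fallback: cluster characters into lines by vertical overlap."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][1])
    lines: List[dict] = []
    for i in order:
        _, top, bottom = boxes[i]
        if lines and top < lines[-1]["bottom"]:
            # Overlaps the current line vertically, so it joins that line.
            lines[-1]["ids"].append(i)
            lines[-1]["bottom"] = max(lines[-1]["bottom"], bottom)
        else:
            lines.append({"bottom": bottom, "ids": [i]})
    # Within each line, order characters left to right.
    return [sorted(line["ids"], key=lambda i: boxes[i][0]) for line in lines]


def line_groups(boxes: List[Box], dc: Optional[List[int]]) -> List[List[int]]:
    """Use stored line numbers when DC is present; otherwise re-analyze."""
    if dc is not None:
        groups: dict = {}
        for char_id, line_no in enumerate(dc):
            groups.setdefault(line_no, []).append(char_id)
        return [groups[k] for k in sorted(groups)]
    return infer_lines(boxes)


boxes = [(0, 0, 10), (12, 1, 9), (0, 20, 30), (12, 21, 29)]
print(line_groups(boxes, dc=[0, 0, 1, 1]))  # [[0, 1], [2, 3]]
print(line_groups(boxes, dc=None))          # [[0, 1], [2, 3]]
```

The fast path is a single pass over precomputed labels, whereas the fallback has to sort and compare geometry, which is one way to see why reusing the stored information is the preferred configuration.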

The embodiment above also illustrates the case where, as shown in FIG. 3, the area determination unit 52, the outline conversion unit 55, the resolution reduction unit 61, the lossy compression unit 62, the lossless compression unit 63, and the file generation unit 64 are provided in the image reading apparatus 1. These functions may instead be implemented by another computer connected to the network 2 rather than by the image reading apparatus 1. In that case, an image processing program that functions as the area determination unit 52, the outline conversion unit 55, the resolution reduction unit 61, the lossy compression unit 62, the lossless compression unit 63, and the file generation unit 64 shown in FIG. 3 is installed on that computer in advance, and the computer executes the image processing program to perform the same processing as described above and generate the data file DF from the input image.

FIG. 1 is a diagram showing a configuration example of the image processing system.
FIG. 2 is a diagram showing the hardware configuration of each part of the image processing system.
FIG. 3 is a diagram showing the processing procedure and functional configuration used when the image reading apparatus reads a document and generates a data file in a format in which the character portions are reusable.
FIG. 4 is a diagram showing an overview of the processing performed by the area determination unit.
FIG. 5 is a diagram showing an overview of the processing performed by the character structure determination unit.
FIG. 6 is a diagram showing an example of characters included in a character area image.
FIG. 7 is a diagram showing an example of the outline data generated by the outline conversion unit from the characters shown in FIG. 6.
FIG. 8 is a block diagram showing the processing units that function when the CPU of the server computer executes the server image processing program and the processing units that function when the CPU of the client computer executes the client image processing program.
FIG. 9 is a diagram showing an example of a screen displayed on the display unit of the client computer.
FIG. 10 is a diagram showing an example in which the grouping of the outline data of FIG. 7 has been changed by the grouping changing unit.

Explanation of symbols

1 Image reading apparatus
2 Network
3 Server computer (image processing apparatus)
4 Client computer (image processing apparatus)
10 Image reading unit (image reading means)
32 Storage unit (storage means)
52 Area determination unit (area determination means, background data generation means)
54 Character structure determination unit (character structure determination means)
55 Outline conversion unit (outline conversion means)
61 Resolution reduction unit (background data generation means)
64 File generation unit (file generation means)
71 Input/output processing unit (file output means)
73 Grouping instruction unit (grouping instruction means)
74 Grouping changing unit (grouping change means)
DA Character area image
DB Background area image (background data)
DC Character structure determination information
DF Data file

Claims

An image processing system comprising:
area determination means for determining a character area included in an input image and a background area other than the character area;
outline conversion means for performing outline conversion of character images included in the character area determined by the area determination means, and generating outline data grouped in character units;
background data generation means for generating background data from a background image consisting of the background area determined by the area determination means;
file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means and the background data generated by the background data generation means;
storage means for storing the data file generated by the file generation means;
instruction means for instructing a change in grouping of the outline data included in the data file stored in the storage means;
grouping change means for reading the data file from the storage means and changing the grouping of the outline data included in the data file based on an instruction from the instruction means; and
file output means for outputting a data file including the outline data whose grouping has been changed by the grouping change means.

The image processing system according to claim 1, further comprising file input means for inputting an edited version of the data file output by the file output means,
wherein the grouping change means returns the grouping of the outline data included in the edited data file input to the file input means to character-unit grouping and then stores the edited data file in the storage means.

The image processing system according to claim 1, further comprising character structure determination means for generating character structure determination information by determining a character structure included in the character area determined by the area determination means,
wherein the file generation means generates the data file by integrating the outline data, the background data, and the character structure determination information generated by the character structure determination means, and
the grouping change means changes the grouping of the outline data from grouping in character units to grouping in another unit based on the character structure determination information.

The image processing system according to claim 3, wherein the character structure determination means determines the characters included in the character area in units of characters and in units of lines, paragraphs, or pages, and the character structure determination information includes information on the lines, paragraphs, or pages.

An image reading apparatus comprising:
image reading means for reading an original and generating an input image;
area determination means for determining a character area included in the input image input to the image input means and a background area other than the character area;
character structure determination means for determining a character structure included in the character area determined by the area determination means and generating character structure determination information;
outline conversion means for performing outline conversion of character images included in the character area determined by the area determination means, and generating outline data grouped in character units;
background data generation means for generating background data from a background image consisting of the background area determined by the area determination means; and
file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means, the background data generated by the background data generation means, and the character structure determination information generated by the character structure determination means.

An image processing apparatus comprising:
storage means for storing an editable data file in which outline data grouped in character units and background data are integrated;
instruction means for instructing a change in grouping of the outline data included in the data file stored in the storage means;
grouping change means for reading the data file from the storage means and changing the grouping of the outline data included in the data file based on an instruction from the instruction means; and
file output means for outputting a data file including the outline data whose grouping has been changed by the grouping change means.

An image processing program that causes a computer to function as:
area determination means for determining a character area included in an input image and a background area other than the character area;
character structure determination means for determining a character structure included in the character area determined by the area determination means and generating character structure determination information;
outline conversion means for performing outline conversion of character images included in the character area determined by the area determination means, and generating outline data grouped in character units;
background data generation means for generating background data from a background image consisting of the background area determined by the area determination means; and
file generation means for generating an editable data file by integrating the outline data generated by the outline conversion means, the background data generated by the background data generation means, and the character structure determination information generated by the character structure determination means.

An image processing program that causes a computer to function as:
means for reading out a data file from predetermined storage means storing an editable data file in which outline data grouped in character units and background data are integrated;
instruction means for instructing a change in grouping of the outline data included in the data file;
grouping change means for changing the grouping of the outline data included in the data file based on an instruction from the instruction means; and
file output means for outputting a data file including the outline data whose grouping has been changed by the grouping change means.