JP2006345314A

JP2006345314A - Image processing apparatus and image processing method

Info

Publication number: JP2006345314A
Application number: JP2005170041A
Authority: JP
Inventors: Mitsuru Uzawa; 充鵜沢
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2005-06-09
Filing date: 2005-06-09
Publication date: 2006-12-21

Abstract

PROBLEM TO BE SOLVED: To edit image data obtained by scanning a paper document so that it is easier to read, while maintaining the content of the image data.

SOLUTION: An image processing method for processing image data obtained by scanning a document that includes a plurality of objects having different attributes comprises: a division step of dividing the image data into blocks, each made up of objects having a different attribute; a character processing step of vectorizing an object whose attribute is determined to be a character; and a character enlargement step (step S1603) of, when the vectorized object is located in a cell constituting a table block among the blocks obtained by the division step, enlarging (step S1602) the character block made up of the object so that it is inscribed in the cell, and enlarging the object included in the character block in accordance with the enlarged character block.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a technique for processing image data generated by reading a paper document with a reading device such as a scanner.

In recent years, with the digitization and networking of office equipment, copiers have come to function not merely as devices for making copies of paper documents but also as devices for faxing, digitizing paper documents, and sending image data over a network.

Such a multifunctional copier is generally called an MFP (Multi Function Peripheral) and is becoming a central support tool for office work that handles paper documents. Correspondingly, MFPs themselves are being equipped with various functions so that a paper document can be handled as easily as if it were image data.

For example, Patent Document 1 describes a technique for converting a scanned paper document into editable image data.
Patent Document 1: JP 2004-265384 A

The applicant envisions that, using image data converted by a technique such as that described in Patent Document 1, a user can edit the image data displayed on the touch panel of an MFP with a single touch and output the edited image data, thereby easily obtaining an edited paper document.

To convert a paper document into editable image data, the MFP executes processing in the following procedure. First, the image data generated by scanning the paper document is subjected to image area separation and divided into rectangular regions (blocks) by attribute, such as text, table, photograph, and line drawing.

Next, the object of each block obtained by the region division is processed. For a text region (text block), for example, OCR is performed for character recognition, and the character objects are then cut out of the whole image and converted into outline vectors. Outline vectorization allows a character object to be handled as resolution-independent image data whose image quality does not change when it is enlarged or reduced.

For a table object, the pixels representing the table ruled lines are cut out of the whole image, converted into outline vectors to obtain resolution-independent image data, and then subjected to figure recognition to recognize the table structure. Photographs, line drawings, and the like are likewise cut out of the whole image and converted individually.

Image data whose objects have been cut out of the whole image in this way and individually given optimal processing can be edited object by object, for example by changing the layout. The result can be output as a paper document, or extracted image data can be arbitrarily selected and saved or sent as image data.

For an MFP equipped with such an editing function for scanned paper documents, there is a further demand for a so-called “original reissue function”, which easily enlarges the characters in a paper document to an arbitrary size so that material can be reissued for people with reduced vision, such as the elderly.

One simple way to realize the “original reissue function” is, for example, to scan a paper document, arbitrarily enlarge each object obtained by image area separation on the touch panel, and output the result. Such a simple, PC-less method performed on the MFP is well suited to elderly users who dislike complicated PC operations. Furthermore, automating the series of operations of enlarging text to an arbitrary size and laying it out would make the function even more helpful to people with reduced vision.

However, in realizing the original reissue function as described above, if an object cut out by image area separation (for example, a character object in a text region) is subjected to character recognition and enlargement is then performed using information such as the recognized text codes, any misrecognition in the character recognition may cause the erroneous information to be enlarged. For this reason, it is desirable to perform image-based enlargement using shape information such as character outlines.

When performing such image-based enlargement using the shape information of an object, it is considered desirable to enlarge the rectangular region (block) containing the object so that it does not overlap the rectangular regions containing other objects on the paper document and so that the gaps between rectangular regions are eliminated.

With such enlargement, however, if the objects to be enlarged are nested (that is, rectangular regions containing objects with different attributes overlap each other), the rectangular regions overlap from the outset, so the objects cannot be enlarged unless both are handled as a unit. If each object were handled separately, the objects could become misaligned with each other and the content of the image data could change.

To give a concrete example, in an object such as a table, the table ruled line object and the character objects are nested. Consequently, if the table ruled line object and the character objects are enlarged at different magnifications, the content of the table may change, for example because characters protrude from the table cells.

It is therefore conceivable to enlarge the table ruled line object and the character objects at the same magnification. In general, however, a table ruled line object is often already fairly large within the document image and frequently cannot be enlarged much further. As a result, if the character objects in the cells are enlarged at the same magnification, they cannot be enlarged sufficiently.

The present invention has been made in view of the above problems, and its object is to edit image data obtained by scanning a paper document so that it is easier to view while maintaining the content of the image data.

In order to achieve the above object, an image processing apparatus according to the present invention has the following configuration. That is,
an image processing apparatus that processes image data obtained by scanning a document including a plurality of objects having different attributes comprises:
dividing means for dividing the image data into blocks, each made up of objects having a different attribute;
character processing means for vectorizing an object whose attribute is determined to be a character by the dividing means; and
character enlargement means for, when the object vectorized by the character processing means is located in a cell constituting a table block among the blocks divided by the dividing means, enlarging the character block made up of the object so that it is inscribed in the cell, and enlarging the object included in the character block in accordance with the enlarged character block.
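As a rough illustration of the character enlargement means, the following Python sketch computes the largest uniform scale at which a character block still fits inside a table cell and returns the enlarged block. The `Rect` type, the `fit_block_in_cell` name, the optional `margin`, the aspect-preserving scale, and the centered placement are all assumptions made for this example; the text only requires that the enlarged block be inscribed in the cell.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    """Axis-aligned rectangle: top-left corner (x, y), width w, height h."""
    x: float
    y: float
    w: float
    h: float


def fit_block_in_cell(block: Rect, cell: Rect, margin: float = 0.0):
    """Return (scale, enlarged_block) so that the block is inscribed in the cell.

    Assumption: the block is scaled uniformly (aspect ratio preserved) and
    centered in the cell; `margin` is a hypothetical padding on every side.
    """
    avail_w = cell.w - 2 * margin
    avail_h = cell.h - 2 * margin
    if block.w <= 0 or block.h <= 0 or avail_w <= 0 or avail_h <= 0:
        raise ValueError("block and usable cell area must have positive size")

    # The limiting dimension decides the scale; the block touches the cell
    # boundary in that dimension and stays inside it in the other.
    scale = min(avail_w / block.w, avail_h / block.h)
    new_w, new_h = block.w * scale, block.h * scale
    new_x = cell.x + (cell.w - new_w) / 2
    new_y = cell.y + (cell.h - new_h) / 2
    return scale, Rect(new_x, new_y, new_w, new_h)
```

The returned scale would then be applied to the vectorized character outlines inside the block, so the characters grow independently of the table ruled lines.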

According to the present invention, image data obtained by scanning a paper document can be edited so that it is easier to view while the content of the image data is maintained.

Embodiments of the present invention are described in detail below, with reference to the accompanying drawings as necessary.

[First Embodiment]
1. Configuration of Document Management System
A first embodiment of the present invention will be described. FIG. 1 is a diagram showing the configuration of a document management system that includes an image processing apparatus (MFP 100) according to the first embodiment of the present invention. As shown in the figure, the document management system is realized in an environment in which an office 120 and an office 130 are connected via the Internet 104.

Connected to the LAN 107 constructed in the office 120 are the MFP 100, a management PC 101 that controls the MFP 100, a client PC 102, a document management server 106-1, and a database 105-1 of the document management server 106-1; these are connected to a proxy server 103-1.

Similarly, a document management server 106-2 and its database 105-2 are connected to the LAN 108 constructed in the office 130. The LAN 107 in the office 120 and the LAN 108 in the office 130 are connected to the Internet 104 via the proxy servers 103-1 and 103-2.

As described above, the image processing apparatus (MFP 100) according to this embodiment can function as a device constituting the document management system. Specifically, the MFP 100 handles the reading of paper document images and part of the image processing applied to the read image data; the processed image data is input to the management PC 101 via the LAN 109. The management PC 101 is an ordinary PC that internally has image storage means, image processing means, display means, and input means.

2. Configuration of Image Processing Apparatus (MFP 100)
FIG. 2 is a diagram showing the functional configuration of the image processing apparatus (MFP 100) according to the first embodiment of the present invention. FIG. 2 is described below with reference to FIG. 1.

In FIG. 2, an image reading unit 201 including an auto document feeder (hereinafter, ADF) irradiates a paper document (original image) with a light source (not shown), forms the reflected image of the original on a solid-state image sensor through a lens, and obtains from the solid-state image sensor a raster image reading signal as an image signal with a density of 600 DPI.

In the normal copying function, this image signal is processed into a recording signal by the data processing device 206. When multiple copies are made, one page of recording data is temporarily stored and held in the storage device 202 and then output sequentially to the recording device 203 to form images on paper.

Print data output from the client PC 102, on the other hand, passes from the LAN 107 through the network IF 205, is converted by the data processing device 206 into recordable raster data, and is then formed as an image on paper by the recording device 203.

Thus, the MFP 100 according to this embodiment has the copy and output functions of an ordinary MFP and, in addition, has the editing and output functions described next.

For example, when display of a vector image on the display/input device 207, which includes a large touch panel, has been instructed (user instructions are given via the display/input device 207; the same applies hereinafter), the image signal read by the image reading unit 201 is processed by the data processing device 206 into an editable image signal, converted into vector data, and then displayed on the display/input device 207.

The vector data displayed on the display/input device 207 undergoes the desired editing in the data processing device 206 in accordance with user instructions. The edited vector data is then displayed again on the display/input device 207.

In accordance with an output instruction from the user, the data processing device 206 processes the vector data into a recording signal and outputs it sequentially to the recording device 203 to form an image on paper. In accordance with a transfer instruction from the user, the vector data is sent from the network IF 208 via the LAN 107 to the document management server 106-1, the client PC 102, the document management server 106-2, and so on. The present invention relates to a technique for processing image data generated by reading a paper document with the image reading unit, and an outline of that processing is described in detail below.

3. Processing in Image Processing Apparatus
3.1 Overall Processing
An overview of the overall processing for the editing/output function of the MFP 100 is given with reference to FIG. 3. As shown in FIG. 3, in step S301 the image reading unit 201 is first operated to scan one original in raster fashion and input it as image information, yielding a 600 DPI, 8-bit image signal.

In step S302, the image signal is preprocessed by the data processing device 206 and stored in the storage device 202 as one page of image data. The stored image data is then divided into character/line-drawing portions and halftone image portions. The character portions are further separated into blocks, each grouped as a paragraph; the line-drawing portions are separated into tables made up of lines and into line drawings and pictures; each is segmented. Halftone image portions, meanwhile, are decomposed into independent objects for each desired block, such as image portions (photographs) of blocks separated into rectangular regions and background portions.

In step S303, the object of each block divided in step S302 is converted into vector data, object by object. The conversion into vector data is performed as follows.

First, for a block extracted as characters (a TEXT block), the character shapes obtained by binarization are outlined to obtain resolution-independent shape information. In addition, as internal analysis information for the block, OCR (character recognition) is performed to obtain the text information, and the character size, style, and typeface are recognized, so that font data reproducible from the image information obtained by scanning the original is extracted.

For table (TABLE), line drawing (LINE), and picture (PICTURE) blocks made up of lines, outlining is performed to extract resolution-independent graphics information, and where a figure shape can be recognized, its shape information is obtained by recognition processing. A table in particular can be recognized as a set of rectangles.

Photograph (PHOTO) blocks are processed as image data, each as an individual JPEG file. Through the above vectorization, each object can be handled individually, and its layout can be freely changed and it can be enlarged or reduced on a per-object basis.

In step S304, the information obtained by the vectorization of step S303 is stored in the storage device 202 as vector data, and the objects are combined and laid out so as to reproduce the single paper original and are then displayed on the display/input device 207.

The configuration, size, and arrangement of the objects in the vector data displayed on the display/input device 207 in step S304 can be freely changed in accordance with user input on the display/input device 207 (step S305). The results of these user changes are displayed on the display/input device 207 as they are made, so the layout can be changed interactively.

In step S306, the vector data edited as desired by the user in step S305 is output on paper or delivered to the document management servers 106-1 and 106-2, the client PC 102, and the like, in accordance with the user's instruction. Each processing step is described in detail below.

3.2 Block Selection (Region Division) Processing (Step S302)
Block selection processing (step S302) takes the image data shown in FIG. 4(a) (the image data read in step S301), recognizes it as clusters, one per rectangular region (block) containing an object, as shown in FIG. 4(b), determines an attribute such as character (TEXT), picture (PICTURE), photograph (PHOTO), line drawing (LINE), or table (TABLE) for each block, and divides the data into blocks having different attributes. A specific example of block selection processing is described below.

First, the image data is binarized into black and white, and contour tracing is performed to extract clusters of pixels enclosed by black pixel contours. For black pixel clusters with a large area, contour tracing is also performed on the white pixels inside them to extract white pixel clusters, and black pixel clusters are then extracted recursively from inside white pixel clusters of at least a certain area.

The black pixel clusters obtained in this way are classified by size and shape into blocks having different attributes. For example, a cluster whose aspect ratio is close to 1 and whose size falls within a certain range is treated as a pixel cluster corresponding to a character; a portion where neighboring characters can be grouped in good alignment is treated as a character (TEXT) block; a flat pixel cluster as a line (LINE) block; the range occupied by a black pixel cluster that is at least a certain size and contains well-aligned rectangular white pixel clusters as a table (TABLE) block; a region where irregularly shaped pixel clusters are scattered as a photograph (PHOTO) block; and any other arbitrarily shaped pixel cluster as a picture (PICTURE) block. A region that is not assigned any of these attributes is extracted as background (BACKGROUND).
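The first two rules above (character-like clusters and flat clusters) can be sketched from a cluster's bounding box alone. The function name, the labels, and every numeric threshold below are hypothetical; the text gives no concrete values, and the TABLE, PHOTO, and PICTURE rules need information about the cluster's interior that this sketch does not model.

```python
def classify_pixel_cluster(width: int, height: int,
                           min_char: int = 8, max_char: int = 64,
                           flat_ratio: float = 10.0) -> str:
    """Classify a black pixel cluster from its bounding box only.

    Returns "CHAR" for a character-like cluster (aspect ratio near 1, size
    within a fixed range), "LINE" for a flat cluster, and "OTHER" otherwise.
    All thresholds are illustrative assumptions in pixels.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    aspect = width / height
    # Character candidate: roughly square and neither tiny nor huge.
    if 0.5 <= aspect <= 2.0 and min_char <= max(width, height) <= max_char:
        return "CHAR"
    # Flat cluster: much longer in one direction than the other.
    if aspect >= flat_ratio or aspect <= 1.0 / flat_ratio:
        return "LINE"
    return "OTHER"
```

Character-like clusters would still have to be grouped with well-aligned neighbors before a TEXT block is formed; that grouping step is left out here.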

FIG. 5 shows an example of the block information for each block obtained by block selection processing. As shown in the figure, the block information records, for each block, its “attribute”, its position in the paper document (“coordinate X”, “coordinate Y”), its shape (“width W”, “height H”), and whether “OCR information” is present. This per-block information is used in the vectorization processing described below.
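One possible in-memory form of this block information is sketched below in Python. The field names mirror the items listed for FIG. 5, but the class names, the `contains` helper, and the `text_blocks_in_table` function are assumptions added for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Attribute(Enum):
    TEXT = "TEXT"
    PICTURE = "PICTURE"
    PHOTO = "PHOTO"
    LINE = "LINE"
    TABLE = "TABLE"
    BACKGROUND = "BACKGROUND"


@dataclass(frozen=True)
class BlockInfo:
    attribute: Attribute  # "attribute"
    x: int                # "coordinate X" of the top-left corner
    y: int                # "coordinate Y" of the top-left corner
    w: int                # "width W"
    h: int                # "height H"
    has_ocr: bool         # whether "OCR information" is present

    def contains(self, other: "BlockInfo") -> bool:
        """True if `other` lies entirely inside this block's rectangle."""
        return (self.x <= other.x and self.y <= other.y
                and other.x + other.w <= self.x + self.w
                and other.y + other.h <= self.y + self.h)


def text_blocks_in_table(table: BlockInfo, blocks: List[BlockInfo]) -> List[BlockInfo]:
    """Return the TEXT blocks nested inside a TABLE block."""
    return [b for b in blocks
            if b.attribute is Attribute.TEXT and table.contains(b)]
```

A lookup of this kind is useful wherever TEXT blocks nested inside a TABLE block must be treated separately from the table frame.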

3.3 Vectorization Processing (Step S303)
The vectorization processing performed on each block obtained by block selection processing (step S302) is described with reference to FIG. 6. Vectorization processing adapts its processing to the attribute of each block and converts each block into highly compressed, high-quality data while keeping the block reusable.

As shown in FIG. 6, among the blocks obtained by block selection processing (step S302), a TEXT block is binarized by the binarization processing unit 601, and the extracted character objects undergo character recognition in the character recognition unit 602 to extract the text code of each character. Each character shape is also outlined by the outline creation unit 603 and converted into resolution-independent data expressed by straight lines and smooth curves.

A TABLE block is binarized by the binarization processing unit 604 to extract a binary image of the table frame. The extracted binary image is outlined by the outline creation unit 605 and then processed by the table processing unit 606, which expresses the table frame as ruled lines. When a TABLE block is binarized, the TEXT blocks within it that were extracted in block selection processing are excluded, so that a binary image of the table frame alone is obtained.

A LINE block, like TEXT and TABLE blocks, is binarized by the binarization processing unit 607 to extract a LINE binary image, which is outlined by the outline creation unit 608 and expressed by smooth curves and straight lines. The figure recognition unit 609 then performs figure recognition and extracts information such as ruled lines, circles, ellipses, and polygons.

PHOTO and BACKGROUND blocks are extracted as image information and individually compressed by the adaptive compression unit 610.

Each process is described in detail below. The binarization performed by the binarization processing units 601, 604, and 607 is the same in each unit, as is the outlining performed by the outline creation units 603, 605, and 608.

3.3.1 Binarization Processing
The binarization processing units (601, 604, 607) extract luminance information from the image data and create a histogram of the luminance values. A plurality of thresholds are set from the histogram, the binary image obtained with each threshold is analyzed, for example for the connectivity of its black pixels, to derive the optimal threshold, and the binary image for that threshold is obtained.
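The evaluation of candidate thresholds can be sketched as follows in Python. For each candidate, the luminance image is binarized and the connected black regions are counted. The function names, the use of 4-connectivity, and the choice of component count as the connectivity measure are assumptions; the text does not state how the optimal threshold is chosen from these measurements, so the sketch stops at reporting them.

```python
from collections import deque
from typing import Dict, List


def binarize(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Pixels darker than the threshold become black (1), others white (0)."""
    return [[1 if v < threshold else 0 for v in row] for row in gray]


def count_black_components(binary: List[List[int]]) -> int:
    """Count 4-connected black regions with a breadth-first flood fill."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            count += 1
            seen[y][x] = True
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def component_counts(gray: List[List[int]], candidates: List[int]) -> Dict[int, int]:
    """Map each candidate threshold to the number of black components it yields."""
    return {t: count_black_components(binarize(gray, t)) for t in candidates}
```

A selection rule, whatever form it takes, would then pick one threshold from the reported measurements and keep the corresponding binary image.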

3.3.2 Character Recognition Processing
The character recognition unit 602 recognizes images cut out character by character using a pattern matching technique and obtains the corresponding text codes. This recognition processing compares an observed feature vector, obtained by converting features of the character object into a numerical sequence of several tens of dimensions, with dictionary feature vectors obtained in advance for each character type, and takes the character type with the smallest distance as the recognition result. Various known methods exist for extracting feature vectors; one example divides a character into a mesh and uses as the feature a vector, with as many dimensions as there are mesh cells, obtained by counting the character strokes in each mesh cell as line elements by direction.
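The comparison against dictionary feature vectors amounts to a nearest-neighbor search, sketched below in Python. The `recognize` name and the use of Euclidean distance are assumptions; the text only says that the character type with the closest distance is taken as the result. The feature vectors in the usage note are toy values, not real mesh features.

```python
import math
from typing import Dict, Sequence, Tuple


def recognize(observed: Sequence[float],
              dictionary: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return (character, distance) for the dictionary entry nearest to `observed`.

    Assumption: distance is Euclidean and all vectors have the same dimension.
    """
    if not dictionary:
        raise ValueError("dictionary must not be empty")

    best_char, best_dist = "", math.inf
    for char, reference in dictionary.items():
        if len(reference) != len(observed):
            raise ValueError("feature vectors must have equal dimensions")
        dist = math.dist(observed, reference)
        if dist < best_dist:
            best_char, best_dist = char, dist
    return best_char, best_dist
```

For instance, with a two-entry dictionary `{"A": [1, 0, 0, 1], "B": [0, 1, 1, 0]}`, the observed vector `[0.9, 0.1, 0.2, 0.8]` is matched to `"A"`.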

When character recognition is performed on a TEXT block extracted by block selection processing (step S302), the block is first judged to be horizontally or vertically written, lines are cut out in the corresponding direction, and character objects are then cut out to obtain character images. To judge the writing direction, horizontal and vertical projections of the pixel values within the block are taken; the block is judged to be horizontally written if the variance of the horizontal projection is larger, and vertically written if the variance of the vertical projection is larger.
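The direction test can be sketched as below in Python. The sketch assumes that the "horizontal projection" is the count of black pixels in each row and the "vertical projection" is the count in each column, and that a tie is resolved as horizontal; neither detail is stated in the text, and the function name is hypothetical.

```python
from statistics import pvariance
from typing import List


def writing_direction(binary: List[List[int]]) -> str:
    """Judge a binarized TEXT block (1 = black) as "horizontal" or "vertical"."""
    if not binary or not binary[0]:
        raise ValueError("block must contain at least one pixel")

    row_profile = [sum(row) for row in binary]        # horizontal projection
    col_profile = [sum(col) for col in zip(*binary)]  # vertical projection

    # Horizontal text alternates between inked rows and blank line gaps, so
    # its row profile varies strongly; vertical text does the same by column.
    if pvariance(row_profile) >= pvariance(col_profile):
        return "horizontal"
    return "vertical"
```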

Decomposition into character strings and characters is performed, for horizontal writing, by cutting out lines using the horizontal projection and then cutting out characters from the vertical projection of each cut-out line. For a vertically written TEXT block, horizontal and vertical are simply interchanged. The character size can also be detected at this point.
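For a horizontally written block, the two-stage cut can be sketched as follows in Python. The function names are hypothetical, and the sketch assumes that any row or column without black pixels is a gap, which real scans with noise or touching characters would not always satisfy.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # half-open index range (start, end)


def nonzero_runs(profile: List[int]) -> List[Span]:
    """Return the half-open spans of consecutive non-zero profile entries."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment_horizontal_text(binary: List[List[int]]) -> List[Tuple[Span, List[Span]]]:
    """Cut a binarized block (1 = black) into text lines, then characters.

    Each result item is (row span of the line, column spans of its characters).
    """
    result = []
    for top, bottom in nonzero_runs([sum(row) for row in binary]):
        band = binary[top:bottom]
        chars = nonzero_runs([sum(col) for col in zip(*band)])
        result.append(((top, bottom), chars))
    return result
```

The height of each line span and the width of each character span give an estimate of the character size, which matches the remark that the size can be detected at this stage.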

3.3.3 Outline Creation Processing
The outline creation unit 603 takes the blocks classified as picture (PICTURE), line drawing (LINE), or table (TABLE) blocks in block selection processing (step S302) and converts the contours of the pixel clusters extracted in each block into outline data expressed by straight lines and curves.

Specifically, the point sequence of pixels forming a contour is divided at points regarded as corners, and each section is approximated by a partial straight line or curve. A corner is a point at which the curvature is a local maximum. As shown in FIG. 7, such a point is found as a point at which, when a chord is drawn between the points Pi-k and Pi+k located k points away on either side of an arbitrary point Pi, the distance between the chord and Pi is a local maximum. Furthermore, letting R be the ratio (chord length)/(arc length) between Pi-k and Pi+k, a point at which R is at or below a threshold can be regarded as a corner. Each section obtained by dividing at the corners can be vectorized using, for example, the least-squares method on the point sequence for a straight line, or a cubic spline function for a curve.
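The chord-to-arc test can be sketched for a closed contour as below in Python. The function names and the default values of `k` and `threshold` are assumptions chosen for the example, since the text gives no numbers; the arc length is approximated by summing the distances between consecutive contour points.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def chord_arc_ratio(points: Sequence[Point], i: int, k: int) -> float:
    """R = chord length / arc length between P(i-k) and P(i+k) on a closed contour."""
    n = len(points)
    chord = math.dist(points[(i - k) % n], points[(i + k) % n])
    arc = sum(math.dist(points[j % n], points[(j + 1) % n])
              for j in range(i - k, i + k))
    return chord / arc if arc > 0 else 1.0


def find_corners(points: Sequence[Point], k: int = 2,
                 threshold: float = 0.75) -> List[int]:
    """Return indices of points whose ratio R is at or below the threshold."""
    if len(points) <= 2 * k:
        raise ValueError("contour needs more than 2*k points")
    return [i for i in range(len(points))
            if chord_arc_ratio(points, i, k) <= threshold]
```

On a straight run R is 1, and it drops as the contour bends, so a lower threshold keeps only sharper corners. The contour sections between the returned indices would then be fitted with straight lines or curves.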

If the target has inner contours, they are likewise approximated by partial straight lines or curves, using the point sequences of the white-pixel contours extracted in the block selection process (step S302).

FIG. 8 shows an example of image data before outlining and the resulting outlined vector data. FIG. 8(a) is the image data before outlining, and FIG. 8(b) is the outlined data. Character shapes, table frames, and line drawings approximated in this way by straight lines or curves are vector data that can be scaled up or down without loss of image quality; that is, they are resolution independent.

3.3.4 Table Processing
The table processing unit 606 recognizes the cells in a table and their arrangement, and expresses the table frame as a set of ruled lines. FIG. 9(a) shows vector data obtained by outlining a binary image. Outlining yields outer and inner contours; cell corner points such as 911 and 912 are then derived from the structural relationship between, for example, the outer contour 901 and the inner contours 902, 903, and 904. Once the corner points have been found, their arrangement is extracted as shown in FIG. 9(b), which gives the cell structure and the ruled-line information. Next, the position of each ruled line is adjusted so that it passes between the corresponding contour lines in FIG. 9(a), and its thickness is determined from its positional relationship to those contour lines.

With the above processing, a table can be expressed by ruled lines that have thickness. When the table structure is recognized from a binary image in this way, a region filled in black may be extracted; region A in FIG. 10 shows an example. Such a region may actually be a cell that was not extracted because of the binarization threshold used by the binarization processing unit 604. In that case, examining the edge information within the block, or repeating the binarization and outline creation processing to extract the cell, allows the table to be expressed as vectors more faithfully.

3.3.5 Graphic Recognition
The graphic recognition unit 609 extracts parts of the binary image that can be expressed as graphic primitives, such as ruled lines, circles, ellipses, and polygons. An example is described with reference to FIG. 11, which shows image data obtained by outlining a binary image. For each closed curve, circle, ellipse, and polygon information is extracted from the curvature and corner point information of the outline. A circle can be identified by checking whether the curvature is constant, an ellipse by using the way the curvature changes along the curve, and a polygon from its corner points and the curvature between them. In FIG. 11, for example, graphic information such as a circle and a rectangle is extracted.
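As an illustration of the "constant curvature" test for circles, the sketch below estimates the curvature at each contour point from three samples and checks that all estimates stay close to their mean. The three-point (Menger) curvature estimate, the function names, and the 5% tolerance are assumptions made for this example; the source does not state how curvature is measured.

```python
import math


def menger_curvature(a, b, c):
    """Curvature of the circle through a, b, c (0.0 if the points are collinear)."""
    twice_area = abs((b[0] - a[0]) * (c[1] - a[1])
                     - (c[0] - a[0]) * (b[1] - a[1]))
    denom = math.dist(a, b) * math.dist(b, c) * math.dist(c, a)
    # Menger curvature = 4 * area / (|ab| * |bc| * |ca|).
    return 0.0 if denom == 0 else 2.0 * twice_area / denom


def is_circle(points, k=3, tolerance=0.05):
    """True if the curvature along a closed contour is (nearly) constant."""
    n = len(points)
    curvatures = [menger_curvature(points[(i - k) % n], points[i],
                                   points[(i + k) % n])
                  for i in range(n)]
    mean = sum(curvatures) / n
    if mean == 0:
        return False                       # a straight line, not a circle
    return all(abs(c - mean) <= tolerance * mean for c in curvatures)
```

An ellipse test would look at how the same curvature sequence varies around the contour rather than requiring it to be constant, and a polygon test would combine corner points with near-zero curvature between them.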

Next, the relationship between each extracted graphic outline and the surrounding outlines is examined. Taking the circle outline 1103 as an example, the outline 1101 is expressed as a circular curve with thickness, based on the relationship between the curvature between the corner points 1111 to 1116 of the outer outline 1104 of the circle outline 1103 and the distances between those corner points.

At this point, the parts that do not belong to the circle, namely those between corner points 1111 and 1112, 1113 and 1114, and 1115 and 1116, are split off as separate outlines. If the binary image is thinned beforehand, the junctions 1121 to 1123 between the circle and the lines can also be extracted efficiently. Finally, the straight portions of each outline are detected, and it is determined whether each extracted straight line has a paired straight line; if it does, the pair is replaced with a ruled line that has thickness.

Graphic shapes are extracted from the outlines as described above. Anything not recognized as a graphic is described as an outline as it is.

3.3.6 Adaptive Compression
The adaptive compression unit 610 processes the blocks determined to be PHOTO blocks by the block selection process (step S302), and the BACKGROUND, as raster data in separate JPEG files. Here the background means the parts of the paper document that are not needed, and its image data may be created at low resolution or with high compression as appropriate. This reduces the amount of data when vector data is finally created as application data.

3.4 Vector Data Display Processing (Step S304)
As described above, performing the block selection process (step S302) and the vectorization process (step S303) on one page of image data generates a file in an intermediate data format such as that shown in FIG. 12. This data format is called the Document Analysis Output Format (DAOF). FIG. 12 shows an example of the DAOF data structure.

In FIG. 12, 1201 is a header, which holds information about the image data to be processed. The layout description data section 1202 holds the attribute and position information of each block recognized in the image data by attribute, such as TEXT (characters), LINE (line drawing), PICTURE (drawing), TABLE (table), and PHOTO (photograph).

The character recognition description data section 1203 holds the character recognition results obtained by recognizing the characters in TEXT blocks. The table description data section 1204 stores the details of the structure of TABLE blocks. The image description data section 1205 holds the image data of blocks such as PICTURE and LINE, cut out from the image data of the paper document.

Meanwhile, a document structure tree for the paper document is created from the block information. The document structure tree is described with reference to FIG. 13. FIG. 13(a) shows an example of the blocks obtained from the block information, their attributes, and the grouping blocks. Blocks are grouped by judging how they are related, for example whether the distance between them is small or whether their widths are nearly the same. For example, grouping block V1 is generated from T3, T4, and T5, and grouping block V2 from T6 and T7. Repeating this grouping turns FIG. 13(a) into the document structure tree shown in FIG. 13(b).

V0 is the top level of the hierarchy and represents the entire page. The actual data of each block in the document structure tree is stored in the DAOF and associated with the tree. For a TEXT block, for example, the data may be an outline shape or a text code, and various images can be generated by pouring the actual DAOF data into the document structure tree. In step S304, the vector data created as above is displayed. TEXT blocks are displayed as character shape outlines in order to avoid image degradation caused by character misrecognition.

3.5 Vector Editing Processing (Step S305)
In step S305, the displayed vector data is edited according to user instructions. This embodiment describes a function for editing the vector data in TEXT blocks, intended for visually impaired users, but the purpose of the invention is not limited to this. The editing in step S305 allows any editing the user wants, such as changing the layout of the blocks or changing the data inside a block; the editing of character object vector data described below is one example.

Vector data is edited by changing the DAOF data when editing inside a block, and by changing both the document structure tree and the DAOF when changing the layout.

Switching the information inside a block, for example between the character shape and the text code in a TEXT block, is done by switching the DAOF data poured into the document structure tree.

The TEXT block editing function is described below. FIG. 14 shows an example of a UI for performing simple character enlargement on a given block in a paper document.

In FIG. 14, 1401 is the scanned document display area, which automatically displays the vector data obtained by scanning and vectorizing the document. The vectorized data is displayed so that the attribute obtained by the block selection process is visible for each block, and blocks can be selected individually on the UI.

When the "Enlarge characters" button 1402 is pressed, the selected block is redisplayed with a layout in which the characters are enlarged. If the user judges the displayed result to be acceptable, the user presses the "Output" button 1404 to execute the output process. If the result is not what was wanted, the user presses the "Undo" button 1403, which cancels the enlargement and restores the original state.

The layout method used for character enlargement is now described in detail. FIG. 15(a) shows the vectorized paper document used in the description of this embodiment. To simplify the later explanation, each character shape to be enlarged is indicated by a dotted rectangle; the outlined character shapes are assumed to lie inside the dotted lines. FIG. 15(b) shows the information obtained by applying the block selection process to FIG. 15(a). In the figure, 1511 and 1512 are blocks determined to be TABLE (solid rectangles), and all the others are blocks determined to be TEXT (dash-dot rectangles).

FIG. 16 is a flowchart for creating vector data in which the characters in a paper document are enlarged.

In step S1601, the character (TEXT) and table (TABLE) blocks are enlarged. Where blocks overlap, for example a character block inside a table block, the outer table block and the inner character objects are treated as a set, and the character objects are mapped into the table block at the same magnification as the outer table block. Accordingly, this step considers the enlarged layout of those blocks, among the objects in FIG. 15(b), that are shown with thick frames in FIG. 17(a).

The enlargement method is as follows. Each block is expanded with its aspect ratio held constant, and the enlargement ends when the block overlaps another block or another object that is not a text block; the resulting area becomes the enlarged area of the block. As a result, the areas in FIG. 17(a) are enlarged as shown in FIG. 17(b). The position of a block frame may shift slightly, as long as its positional relationship to the surrounding blocks and objects does not change. As the block is enlarged, the character shapes of each character object, the ruled lines of table objects, and so on are also enlarged by a linear transformation.
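The aspect-preserving enlargement can be sketched as a simple search for the largest scale that does not collide with a neighbor. The sketch assumes axis-aligned rectangles given as (x0, y0, x1, y1), growth about the block center, a fixed scale increment, and a page boundary as an additional limit; none of these details are specified in the source, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if rectangles a and b share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def scaled(rect, s):
    """Scale rect by factor s about its center, keeping the aspect ratio."""
    cx, cy = (rect[0] + rect[2]) / 2, (rect[1] + rect[3]) / 2
    half_w = (rect[2] - rect[0]) * s / 2
    half_h = (rect[3] - rect[1]) * s / 2
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def enlarge_block(block, others, page, step=0.01, max_scale=10.0):
    """Grow block in small steps until the next step would hit another block
    or leave the page. Returns (scale, enlarged rectangle)."""
    n = 0
    while 1.0 + (n + 1) * step <= max_scale:
        cand = scaled(block, 1.0 + (n + 1) * step)
        inside = (cand[0] >= page[0] and cand[1] >= page[1]
                  and cand[2] <= page[2] and cand[3] <= page[3])
        if not inside or any(overlaps(cand, o) for o in others):
            break
        n += 1
    scale = 1.0 + n * step
    return scale, scaled(block, scale)
```

The returned scale is the factor that would then be applied, as a linear transformation, to the character outlines and ruled lines inside the block.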

Next, in step S1602, the character blocks are enlarged. Here only the frame of each character block obtained by the block selection process is enlarged; the character shapes inside the block are not enlarged along with the frame. The aspect ratio is not held constant in this step: each of the four sides of the block is checked to see whether it can be extended, and the frame is widened as far as possible. For a character block inside a table object, the cell structure and the faithful positions of the table ruled lines were extracted in step S606, so the area of each cell of the table frame enlarged in step S1601 becomes the enlarged area of the character block frame as it is. As a result, the result shown in FIG. 18 is obtained for the paper document of FIG. 15(a).

In step S1603, the character shapes in each character block enlarged in step S1602 are enlarged. The character recognition processing of step S602 has already cut out each character shape character by character, so the characters can be handled as a character string. This enlargement is described with reference to FIG. 19.

FIG. 19(a) shows the frame of a character block resulting from step S1602 and the character shapes inside it; 1901 is the character block frame obtained in step S1602. FIG. 19(b) shows the result of enlarging the width of each character in FIG. 19(a) by Δα. Because of this enlargement, the leftmost character 2300 protrudes beyond the character block frame, but because the characters are handled as a string, the protruding character shape can be carried over to the next line as shown in FIG. 19(c). When a new line is created by this carry-over, the line spacing is determined according to the character size. The line spacing may also be changed according to the character size. If adding the character to the next line causes that line to protrude beyond the frame, the overflow is again carried over to the following line. This Δα enlargement is repeated as long as the character string fits within the character block frame.
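The repeated Δα enlargement with line wrapping can be sketched as follows. This is a simplified model under several assumptions: Δα is treated as an increment of a scale factor rather than of the absolute character width, all characters share one height, the line pitch is the scaled character height plus a fixed proportional gap, and the text is assumed to fit at scale 1.0. The function and parameter names are hypothetical.

```python
def wrap(widths, scale, frame_w):
    """Greedily break scaled characters into lines no wider than frame_w.

    Returns a list of lines (each a list of character indices), or None if a
    single character is wider than the frame.
    """
    lines, current, current_w = [], [], 0.0
    for i, w in enumerate(widths):
        w *= scale
        if w > frame_w:
            return None
        if current and current_w + w > frame_w:
            lines.append(current)          # carry the overflow to the next line
            current, current_w = [], 0.0
        current.append(i)
        current_w += w
    if current:
        lines.append(current)
    return lines


def fit_text(widths, char_h, frame_w, frame_h, delta=0.1, line_gap=0.2):
    """Increase the scale by delta while the wrapped text still fits the frame.

    Returns (scale, lines) for the last scale that fits.
    """
    steps = 0
    best = wrap(widths, 1.0, frame_w)
    while True:
        scale = 1.0 + (steps + 1) * delta
        lines = wrap(widths, scale, frame_w)
        # Line pitch grows with the character size, as the text requires.
        if lines is None or len(lines) * char_h * scale * (1 + line_gap) > frame_h:
            break
        best, steps = lines, steps + 1
    return 1.0 + steps * delta, best
```

For example, six characters of width 10 and height 10 in a 60 x 50 frame with delta = 0.5 end at scale 2.0 on two lines of three characters; the next step (2.5) would need three lines and exceed the frame height.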

By executing the above processing, an image with enlarged characters such as that shown in FIG. 20 can be generated from the paper document of FIG. 15(a).

This embodiment has described an example in which one touch of the "Enlarge characters" button 1402 enlarges the characters as far as possible, but the invention is not limited to this. For example, the enlargement ratios applied in steps S1601 to S1603 for one touch of the button 1402 may be set in advance, so that the user reaches the desired result by touching the button several times.

3.6 Vector Output
In step S306, the vector data is output. On output, the vector data is first converted into application data. The application data can be generated using the DAOF created in step S304 and the document tree structure. Note that the DAOF and the document tree structure have been modified by the vector editing process in step S305.

The application data generation process is described for the document tree structure of FIG. 13. In FIG. 18, H1 has two blocks, T1 and T2, side by side in the horizontal direction, so it is output as two columns: the internal information of T1 (text of the character recognition result, images, and so on, with reference to the DAOF) is output, the column is switched, the internal information of T2 is output, and then S1 is output. H2 also has two blocks, V1 and V2, side by side, so it is output as two columns: V1 outputs the internal information of T3, T4, and T5 in that order, the column is switched, and the internal information of T6 and T7 in V2 is output. Application data is generated in this way and is then output on paper or over the network.
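The output order described above amounts to a depth-first traversal of the document structure tree, with a column switch between the children of a horizontal group. The sketch below illustrates this. The tuple-based tree representation, the "<column break>" marker, and the placement of S1 directly under V0 are assumptions made for the example, since the figure itself is not reproduced here.

```python
def emit(node, out):
    """Append the output order for node to out.

    node is (kind, name, children): kind "H" is a horizontal group, "V" a
    vertical group, and any node with no children is a leaf block.
    """
    kind, name, children = node
    if not children:
        out.append(name)                   # output the block's internal information
        return
    for index, child in enumerate(children):
        if kind == "H" and index > 0:
            out.append("<column break>")   # switch columns between siblings
        emit(child, out)


def leaf(name):
    return ("leaf", name, [])


# Tree assumed from the description: V0 -> H1(T1, T2), S1, H2(V1, V2).
tree = ("V", "V0", [
    ("H", "H1", [leaf("T1"), leaf("T2")]),
    leaf("S1"),
    ("H", "H2", [
        ("V", "V1", [leaf("T3"), leaf("T4"), leaf("T5")]),
        ("V", "V2", [leaf("T6"), leaf("T7")]),
    ]),
])

order = []
emit(tree, order)
```

In a real converter each leaf would be replaced by the corresponding DAOF data (outline, text code, or image) rather than by its name.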

As is clear from the above description, this embodiment extracts the table objects and the ruled-line positions of each cell from the paper document and enlarges each character shape accordingly. This makes it possible to greatly enlarge the characters inside tables, which are often small in paper documents to begin with, without breaking the layout of the document. Because the character shapes have been outlined by the vectorization process, they do not depend on resolution and can be enlarged without the image degradation that occurs with image-based enlargement.

Although the characters in a table can be enlarged considerably, enlarging multiple character objects in the same table at different ratios may instead spoil the appearance of the table as a whole. Because the table structure has been recognized in advance, various measures are possible, such as using the same enlargement ratio for all characters in the table.

[Second Embodiment]
In the first embodiment, the enlargement ratio is made as large as possible, but the invention is not limited to this. In a table, even if the characters within one cell are enlarged by the same ratio, ratios that differ from cell to cell often make the characters harder to read. Therefore, when characters are enlarged, the same ratio may be used for all characters in a table extracted by the block selection process. Since it is also acceptable to make the ratio uniform per item, means for setting it per column or per row may be provided.
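One way to make the ratios uniform is to take, for each group of cells, the smallest of the per-cell maximum ratios, so that no cell overflows. The sketch below illustrates this for a whole table, per row, or per column. Choosing the minimum is an assumption of this example; the source only states that the ratios may be made uniform. The function name and the dictionary layout are hypothetical.

```python
def uniform_scales(cell_scales, mode="table"):
    """Make enlargement ratios uniform within a table, a row, or a column.

    cell_scales maps (row, col) to the largest ratio that fits that cell.
    mode is "table", "row", or "column".
    """
    def group(cell):
        row, col = cell
        return {"table": 0, "row": row, "column": col}[mode]

    limits = {}
    for cell, scale in cell_scales.items():
        key = group(cell)
        # The smallest ratio in the group is the one every cell can accommodate.
        limits[key] = min(scale, limits.get(key, scale))
    return {cell: limits[group(cell)] for cell in cell_scales}
```

With per-cell maxima of 2.0, 3.0, 1.5, and 4.0 in a 2 x 2 table, "table" mode yields 1.5 everywhere, while "column" mode yields 1.5 for the first column and 3.0 for the second.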

[Third Embodiment]
In the first embodiment, outlined character shapes are used for the enlarged character layout, but the invention is not limited to this. Outlined shapes are used to avoid image degradation due to character misrecognition. If the accuracy of the character recognition unit 602, and of font recognition, is sufficient, the character shapes may be replaced with font information based on the recognized text codes.

Font recognition is performed after character recognition in the character recognition unit 602. Multiple sets of the dictionary feature vectors used for character recognition, one vector per character type, may be prepared for the different character shape types, that is, font types, so that the font type is output together with the character code at matching time. This realizes character font recognition.

[Other Embodiments]
The present invention may be applied to a system composed of multiple devices (for example, a host computer, an interface device, a reader, and a printer) or to an apparatus consisting of a single device (for example, a copying machine or a facsimile machine).

Needless to say, the object of the present invention is also achieved by supplying a system or apparatus with a storage medium on which the program code of software implementing the functions of the embodiments described above is recorded, and having a computer (or CPU or MPU) of that system or apparatus read and execute the program code stored on the storage medium.

In this case, the program code read from the storage medium itself implements the functions of the embodiments described above, and the storage medium storing that program code constitutes the present invention.

Storage media that can be used to supply the program code include, for example, a floppy (registered trademark) disk, hard disk, optical disk, magneto-optical disk, CD-ROM, CD-R, magnetic tape, nonvolatile memory card, and ROM.

Needless to say, the invention covers not only the case where the functions of the embodiments described above are implemented by the computer executing the program code it has read, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing implements the functions of the embodiments.

Needless to say, the invention also covers the case where the program code read from the storage medium is written to a memory provided on a function expansion board inserted into the computer or in a function expansion unit connected to the computer, after which a CPU or the like provided on that board or in that unit performs part or all of the actual processing based on the instructions of the program code, and that processing implements the functions of the embodiments described above.

FIG. 1 is a block diagram showing a configuration example of a document management system including the image processing apparatus (MFP 100) according to the first embodiment of the present invention.
FIG. 2 is a diagram showing the functional configuration of the image processing apparatus (MFP 100) according to the first embodiment of the present invention.
FIG. 3 is a diagram showing an overview of the overall processing in the MFP 100.
FIG. 4 is a diagram for explaining an overview of the block selection process.
FIG. 5 is a diagram showing an example of the block information for each block obtained by the block selection process.
FIG. 6 is a flowchart showing the flow of the vectorization process performed on each block obtained by the block selection process (step S302).
FIG. 7 is a diagram for explaining an overview of the processing in the outline creation unit.
FIG. 8 is a diagram showing an example of image data before outlining and the outlined vector data.
FIG. 9 is a diagram for explaining an overview of the processing in the table processing unit.
FIG. 10 is a diagram for explaining an overview of the processing in the table processing unit.
FIG. 11 is a diagram for explaining an overview of the processing in the graphic recognition unit.
FIG. 12 is a diagram showing an example of the DAOF data structure.
FIG. 13 is a diagram for explaining the document structure tree.
FIG. 14 is a diagram showing an example of a UI for performing simple character enlargement in a paper document.
FIG. 15 is a diagram for explaining the layout method used for character enlargement.
FIG. 16 is a flowchart for creating a vector image in which the characters in a paper document are enlarged.
FIG. 17 is a diagram showing an example of enlarging object areas in a document.
FIG. 18 is a diagram showing an example of enlarging character areas using cell ruled lines.
FIG. 19 is a diagram showing an example of enlarging and mapping each character within a character object area.
FIG. 20 is a diagram showing an example of the result of the character enlargement layout.

Claims

An image processing apparatus that processes image data obtained by scanning a document including a plurality of objects having different attributes, comprising:
dividing means for dividing the image data into blocks, each made up of objects having a different attribute;
character processing means for vectorizing an object whose attribute is determined to be a character by the dividing means; and
character enlarging means for, when the object vectorized by the character processing means is located in a cell constituting a table block among the blocks divided by the dividing means, enlarging the character block made up of the object so that it is inscribed in the cell, and enlarging the object included in the character block in correspondence with the enlarged character block.

Table processing means for vectorizing an object whose attribute is determined to be a table by the dividing means;
The table block enlarging means for enlarging the object by enlarging the table block made of the object vectorized by the table processing means until it circumscribes another block. The image processing apparatus described.

The character block enlarging means for enlarging the object by enlarging the character block consisting of the object vectorized by the character processing means until it circumscribes another block. Image processing device.

The image processing apparatus according to claim 3, wherein, when the object vectorized by the character processing means is located in a cell constituting a table block among the blocks divided by the dividing means, the character block enlarging means enlarges the object in accordance with the enlargement by the table block enlarging means.

The image processing apparatus according to claim 1, wherein the character block enlarging means enlarges the character object included in the enlarged character block by determining the wrapping position of each character line within a range in which the character object does not protrude from the character block.

An image processing method in an image processing apparatus for processing image data obtained by scanning a document including a plurality of objects having different attributes, comprising:
a division step of dividing the image data into blocks, each made up of objects having a different attribute;
a character processing step of vectorizing an object whose attribute is determined to be a character in the division step; and
a character enlargement step of, when the object vectorized in the character processing step is located in a cell constituting a table block among the blocks divided in the division step, enlarging the character block made up of the object so that it is inscribed in the cell, and enlarging the object included in the character block in correspondence with the enlarged character block.

A table processing step of vectorizing an object whose attribute is determined to be a table by the dividing step;
The table block enlarging step of enlarging the object by enlarging the table block made of the object vectorized by the table processing step until it circumscribes the other block. The image processing method as described.

The image processing method according to claim 7, further comprising a character block enlargement step of enlarging the object by enlarging a character block made up of the object vectorized in the character processing step until it circumscribes another block.

The image processing method according to claim 8, wherein, in the character block enlargement step, when the object vectorized in the character processing step is located in a cell constituting a table block among the blocks divided in the division step, the object is enlarged in accordance with the enlargement in the table block enlargement step.

The image processing method according to claim 6, wherein, in the character block enlargement step, the character object included in the enlarged character block is enlarged by determining the wrapping position of each character line within a range in which the character object does not protrude from the character block.

A storage medium storing a control program for realizing the image processing method according to claim 6 by a computer.

A control program for realizing the image processing method according to claim 6 by a computer.