JP2007043662A - Image forming apparatus and image processor

Info

Publication number: JP2007043662A
Application number: JP2006153102A
Authority: JP
Inventors: Yuzuru Suzuki; 譲鈴木; Hiroyuki Kono; 裕之河野
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-07-01
Filing date: 2006-06-01
Publication date: 2007-02-15
Anticipated expiration: 2026-06-01
Also published as: JP4811133B2

Abstract

PROBLEM TO BE SOLVED: To produce a structured document from an original image with little user burden and high precision while maintaining the high productivity of a multifunction peripheral. SOLUTION: An image of an original read by the multifunction peripheral is separated into image regions by an image region separation circuit 28; position/shape information 102 and an image region class 104 are determined for each image region, and spatial frequency and other image feature information are determined for the image in each region. A structuring section 40 refers to schema information of structured documents registered in a schema DB 42 and collates the position/shape, image region class, and image features of each element with those of each region of the original image, thereby determining whether the original conforms to the schema information. If it conforms, the original image is converted into a structured document in accordance with the schema information. If it does not, schema information that treats the respective image regions as provisional elements is created and presented to a user, and after the user corrects it, the result is registered in the schema DB 42 as new schema information. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image forming apparatus or an image processing apparatus, and more particularly to a technique for generating structured document data from the image of a scanned original.

Patent Documents 1 to 3 disclose techniques for generating, from an original image read by a scanner, a structured document written in a structured description language such as SGML (Standard Generalized Markup Language), HTML (Hypertext Markup Language), or XML (eXtensible Markup Language).

The apparatus of Patent Document 1 separately extracts character regions and image regions from an optically read document image, determines the layout of the regions, performs character recognition on the character regions, and creates a structured document from the layout and the character recognition results. The apparatus carries out this whole series of processing automatically.

The apparatus of Patent Document 2 separates the image of a read document into character regions, table regions, and image regions, and displays a rectangle for each region. The user edits the layout by moving these rectangles, and hypertext is created according to the edited layout.

The apparatus of Patent Document 3 accepts from the user, for a read document, the designation of regions and the attributes of each region (whether the region is a character string or a chart, the type of dictionary to use for character recognition, and so on). When it performs character recognition on a region, for example, it uses a dictionary selected according to that region's attributes. It then marks up both the character regions and the chart regions on the basis of the attributes to generate a structured document.

Patent Document 1: Japanese Patent Laid-Open No. 11-066196; Patent Document 2: Japanese Patent Laid-Open No. 11-312231; Patent Document 3: Japanese Patent Laid-Open No. 10-162098

Although the techniques of Patent Documents 1 and 2 reflect the result of automatic region separation of a document image in the structured document, automatic region separation can distinguish only relatively coarse classes such as character regions, image regions, and table regions. For example, a title and the body text are both classified simply as character regions, so a structured document that separates a title element from a body element cannot be produced automatically, even when one is wanted.

With the technique of Patent Document 3, fine region classification may be possible because the user designates the regions, but requiring the user to designate regions for every single document places too great a burden on the user.

The present invention makes it possible to generate a structured document from an optically read original image with little work for the user and with high accuracy, while maintaining the high productivity of a digital multifunction peripheral.

The present invention is an image forming apparatus comprising: a reading unit that reads an original; an image area separation unit that performs image area separation on the image of the original read by the reading unit; an image processing unit that applies, to each image area separated by the image area separation unit, image processing according to the image type of that area; and an output unit that outputs the image of the original processed by the image processing unit. The apparatus further comprises: a feature calculation unit that obtains image features for each image area separated by the image area separation unit; a structure information registration unit in which the position and image features of each element of a structured document are registered as structure information of that structured document; and a structuring unit that associates each image area with an element of the structured document on the basis of the position of each image area obtained by the image area separation unit, the image features of each image area obtained by the feature calculation unit, and the information registered in the structure information registration unit, and that generates a structured document corresponding to the original image on the basis of the result of this association.

In a preferred aspect of the present invention, the image area separation unit obtains image features of each part of the image and performs image area separation of the original image using those features, and the feature calculation unit, when obtaining the image features of each image area, reuses the results of the calculations that the image area separation unit performed to obtain the image features for image area separation.

In another preferred aspect, the structuring unit comprises: a layout presentation unit that presents to the user the layout of the image areas obtained by the processing of the image area separation unit; and a registration processing unit that accepts from the user the designation of an element of a new structured document for each image area in the layout, associates the position and image features of each image area with the element the user designated for that area, and registers the result in the structure information registration unit as structure information of the new structured document.

The best mode for carrying out the present invention (hereinafter, the "embodiment") is described below with reference to the drawings.

FIG. 1 shows an example of the system configuration of the present embodiment. As shown in FIG. 1, the system comprises a digital multifunction peripheral 1, a client machine 2, and a document DB (database) server 3, interconnected via a network 4 such as a LAN (local area network).

The digital multifunction peripheral 1 is a multifunction device that combines the functions of a network printer, a network scanner, a copier, a facsimile machine, and the like. In the present embodiment, the digital multifunction peripheral 1 is given a function for converting an original image read by its scanner function into structured document data in a structured description language such as XML. The aim is to automate routine tasks such as generating structured document data from a paper original with this conversion function and registering the data in the document DB server 3.

The function of converting an original image into a structured document makes use of the image area separation function of the digital multifunction peripheral 1. Image forming apparatuses such as copiers and multifunction peripherals now commonly provide image area separation, which separates the character areas and picture areas in a read image, in order to improve print quality. For high-speed print processing, much of this function is implemented in hardware circuits such as an ASIC (Application Specific Integrated Circuit) or a DSP (Digital Signal Processor). The present embodiment uses such an image area separation circuit, extended as necessary, to make conversion into a structured document more efficient.

However, although image area separation identifies the division of an image into character areas, picture areas, and so on, it cannot by itself provide structure information such as which element of which type of structured document each area corresponds to. The present embodiment therefore takes the following approach: the user corrects and edits the provisional automatic conversion result produced by the digital multifunction peripheral 1 to define the structure information of the structured document, and this structure information is fed back to the digital multifunction peripheral 1 so that the accuracy of automatic conversion improves over time. In the example of FIG. 1, the provisional conversion result of the digital multifunction peripheral 1 is passed to a structured document editor 2a installed on the client machine 2, and the user edits the structure information in the editor 2a.

FIG. 2 shows the main part of the control mechanism of the digital multifunction peripheral 1. In FIG. 2, a ROM (read-only memory) 12 stores digital information such as the control program that governs the operation of the digital multifunction peripheral 1. A CPU (central processing unit) 10 controls each part of the digital multifunction peripheral 1 by executing the control program in the ROM 12.

A RAM (random access memory) 14 is the main storage of the digital multifunction peripheral 1 and also serves as work memory while the control program runs. The RAM 14 can also be used, for example, as a page buffer that holds one page of image data to be supplied to the print engine 24.

The mass storage device 16 is an auxiliary storage device for holding various data, and is a nonvolatile storage device such as a hard disk or an EEPROM (Electrically Erasable Programmable Read-Only Memory).

The operation panel 18 is user interface means for displaying the user interface of the image forming apparatus and for accepting various instructions from the user. The operation panel 18 includes, for example, mechanical operation buttons such as a start button and a liquid crystal touch panel for a GUI (graphical user interface). The liquid crystal touch panel displays the GUI screen generated by the control program running on the CPU 10, detects where the user touches the display, and passes that position to the control program. The control program interprets the user's input from the touch position.

The communication interface 20 is a device that controls data communication with other devices on the network 4. Print instructions and the like from a remote host enter the image forming apparatus through the communication interface 20.

The scan engine 22 is a device that provides the scanner function of optically reading an original and generating electronic image data. Originals set in an automatic document feeder (ADF, not shown) are fed to the scan engine one sheet at a time by the ADF and read optically.

The print engine 24 is a device that provides the printer function of forming (printing) on paper an image from the image data supplied under the control of the CPU 10.

The facsimile module 26 is a module that transmits and receives facsimile data.

The image area separation circuit 28 performs image area separation on the original image read by the scan engine 22. As is well known, image area separation uses various image features, such as edge strength and spatial frequency, to distinguish image areas such as characters and pictures. A conventional image area separation circuit outputs only the result of this separation, whereas the image area separation circuit 28 of the present embodiment also outputs image feature data for each image area (details are given later). In addition, whereas conventional image area separation circuits have been used solely for image output such as printing, the image area separation circuit 28 of the present embodiment is also used for preprocessing when generating a structured document from an original image.

The image processing circuit 30 applies to the original image the image processing appropriate to the intended use, such as printing or facsimile transmission. For printing, for example, it applies character-oriented processing (such as edge enhancement) to the character areas and picture-oriented processing (such as gradation correction) to the picture areas separated by the image area separation circuit 28.

The mechanism by which the digital multifunction peripheral 1 converts an original image into a structured document is described below with reference to FIG. 3.

Of the functional modules shown in FIG. 3, the image area separation circuit 28 is a hardware circuit such as an ASIC or a DSP, while the structuring unit 40 and the character recognition unit 44 are realized in software by a program executed on the CPU 10. The schema DB 42 is built on, for example, the mass storage device 16. This is only one example: part or all of the structuring unit 40 and the character recognition unit 44 could be realized as hardware circuits, and part of the image area separation processing could be realized in software. In this way, the present embodiment performs the preprocessing for structured document generation using the image area separation circuit that multifunction peripherals have conventionally had as an internal processing mechanism.

In the process of converting an original image into a structured document, the original image read by the scan engine 22 is first supplied to the image area separation circuit 28. The image area separation circuit 28 performs image area separation on the original image and calculates image feature data for each separated image area. The separation processing obtains one or more image features of each part of the original image, such as edge strength and spatial frequency (for example, the frequency components of a DCT (discrete cosine transform)), evaluates these features together to separate the image into one or more image areas, and determines the type of each area (character, photograph, graphics, and so on). In conventional image forming apparatuses, all or most of this feature calculation and the separation based on it has been implemented in hardware circuits, and the image area separation circuit 28 of the present embodiment likewise includes hardware circuits for the same feature calculation and image area separation.

The image area separation circuit 28 of the present embodiment further has a function of obtaining one or more image features for each separated image area. Examples of these per-area features include spatial frequency information of the image in the area, a histogram of the pixel values in the area, the on/off ratio of pixels in a binary image obtained by binarizing the image in the area (for example, the proportion of on pixels), and a histogram of run lengths in that binary image. The spatial frequency information is, for example, an index value such as the mean frequency or peak frequency that characterizes the spatial frequency distribution of the area, or a combination of such values. The pixel value histogram, as shown in FIG. 4, tallies for each pixel value how many pixels in the area have that value. FIG. 4 schematically shows the pixel value histogram of a character image area: because the contrast between character and non-character portions is very high in such an area, peaks appear at both the high and the low pixel values. For a picture image area, the histogram does not show such a pronounced peak pattern, so the histogram can be used to distinguish character areas from picture areas. Moreover, even among character image areas, the heights and positions of the histogram peaks change with the number of characters, character size, font, character spacing, line spacing, and so on, so the histogram also provides a clue for identifying individual character image areas. The same holds for picture (photograph) image areas: the shape of the histogram varies with the content of the picture, so it can also be used to identify individual picture image areas. The histogram itself may be used as the image feature, or index values that characterize it may be used instead, for example the pair of position (pixel value) and height (frequency) of each peak. For a color image, a combination of the histograms (or their index values) for each primary color, such as R, G, and B, can be used as the image feature.
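The pixel value histogram, its peak pairs, and the on-pixel ratio described above can be sketched as follows. This is only a minimal illustration in Python, not the hardware circuit of the embodiment: the function names, the 8-bit value range, the binarization threshold of 128, and the convention that dark pixels count as "on" are all assumptions made for the example.

```python
def pixel_histogram(pixels, levels=256):
    """Count, for each pixel value 0..levels-1, how many pixels have it."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    return hist


def histogram_peaks(hist):
    """Return (pixel value, frequency) pairs at strict local maxima."""
    peaks = []
    for value, count in enumerate(hist):
        left = hist[value - 1] if value > 0 else 0
        right = hist[value + 1] if value + 1 < len(hist) else 0
        if count > left and count > right:
            peaks.append((value, count))
    return peaks


def on_ratio(pixels, threshold=128):
    """Fraction of pixels that are 'on' after binarization.

    Assumption: a pixel darker than the threshold counts as 'on'.
    """
    on = sum(1 for value in pixels if value < threshold)
    return on / len(pixels)


# Toy character area: 20 dark stroke pixels on 80 light background pixels.
region = [10] * 20 + [240] * 80
peaks = histogram_peaks(pixel_histogram(region))
ratio = on_ratio(region)
```

For this toy area the histogram has one peak at the dark value and one at the light value, which is the two-peak pattern the text attributes to character image areas.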

The run-length histogram of the binarized image area is obtained in the same way, by counting for each run-length value how often that value appears in the binarized result for the area. The run lengths may be measured over on pixels, for example. As with the pixel value histogram, either the run-length histogram itself or index values that characterize it may be used as the image feature.
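As a rough sketch of the run-length histogram, the following Python counts horizontal runs of on pixels in a binarized area. The row-wise scan direction and the representation of the binary image as lists of 0/1 values are assumptions for illustration; the text does not fix either.

```python
from collections import Counter


def run_length_histogram(rows):
    """Count how often each run length of on pixels (1s) occurs, row by row."""
    hist = Counter()
    for row in rows:
        run = 0
        for bit in row:
            if bit:
                run += 1
            elif run:
                hist[run] += 1  # a run just ended at an off pixel
                run = 0
        if run:
            hist[run] += 1      # a run that reaches the end of the row
    return hist


binary_area = [
    [1, 1, 0, 1],
    [0, 1, 1, 1],
]
runs = run_length_histogram(binary_area)
```

Here the first row contributes runs of length 2 and 1 and the second row a run of length 3, so each of the three run lengths is counted once.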

The calculation of these per-area image features can reuse the image feature data of each part of the original image that was obtained as basic data for image area separation, as well as the intermediate results produced while obtaining that basic data. For example, some of the image features on which image area separation is based use a binarized version of the image, so the binarized image obtained for separation can be reused to calculate the features of each separated area. Likewise, some image area separation methods obtain the spatial frequency of each part of the original image; with such a method, the spatial frequency information already obtained can be used when characterizing the spatial frequency distribution of each area. Because the per-area feature calculation of the present embodiment can thus build on the results of the existing image area separation circuit, the increase in circuit scale for image area separation and feature calculation as a whole can be kept small.

The calculations of the per-area image features exemplified above can be implemented as hardware circuits using wired logic, a DSP, or a combination of the two. The image area separation circuit 28 of the present embodiment can therefore be realized by adding circuits for calculating these features to a conventional image area separation circuit.

Several image features that the image area separation circuit 28 obtains for each image area have been described above, but they are only examples. Appropriate features should be selected according to the purpose and application; not all of the examples need be used, and features other than those described may of course be used. Although the features exemplified above can all be calculated by hardware circuits, some per-area features may instead be obtained by software processing. Examples of features obtained by software processing include the number of characters and the number of lines in an image area.

In this way, the image area separation circuit 28 outputs image area attributes 100 and an image area image 110 for each image area, as shown for example in FIG. 3. The image area attributes 100 comprise position/shape information 102 indicating the position and shape of the area, an image area type 104 indicating the type of the area, and image feature information 106 indicating each image feature of the area (spatial frequency, on/off ratio of the binarized image, and so on). If image areas are taken to be rectangular, the position/shape information 102 can simply be the coordinates of two diagonally opposite vertices. The position and shape of an area can also be expressed in any of the other ways used in conventional image area separation. The image area image 110 is the image data of the area. The image area attributes 100 and the image area image 110 are passed to the structuring unit 40. Alternatively, instead of passing both, the image area separation circuit 28 may pass only the image area attributes 100 to the structuring unit 40, and the structuring unit 40 may use the position/shape information 102 in those attributes to extract the image area image 110 from the original image read by the scan engine 22.
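To make the shape of this output concrete, the sketch below models the image area attributes 100 as a small Python record and shows the alternative mentioned at the end of the paragraph, in which the image area image is cut out of the page using only the two diagonal vertices. The class name, field names, and the page representation as nested lists are hypothetical; the patent specifies only which pieces of information the attributes contain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ImageAreaAttribute:
    """Counterpart of image area attributes 100 (field names are illustrative)."""
    top_left: Tuple[int, int]      # (x, y) of one vertex (position/shape 102)
    bottom_right: Tuple[int, int]  # (x, y) of the diagonally opposite vertex
    area_type: str                 # image area type 104, e.g. "text" or "photo"
    features: Dict[str, float] = field(default_factory=dict)  # feature info 106


def crop_area(page: List[List[int]], attr: ImageAreaAttribute) -> List[List[int]]:
    """Cut the image area image out of the page using the two vertices."""
    (x0, y0), (x1, y1) = attr.top_left, attr.bottom_right
    return [row[x0:x1 + 1] for row in page[y0:y1 + 1]]


page = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
attr = ImageAreaAttribute((1, 0), (2, 1), "text", {"on_ratio": 0.2})
area_image = crop_area(page, attr)
```

Passing only the attributes and cropping on demand trades a little extra work in the structuring step for less data moved between the two modules.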

The structuring unit 40 uses the image area attributes 100 and the image area images 110 to create structured document data corresponding to the original image. In doing so, it refers as necessary to the schema information registered in the schema DB 42.

As shown in FIG. 5, the schema DB 42 holds, for each document type, schema information comprising a document type name 200, structure data 202, and element data 204. Document types are defined by the user as appropriate. For example, the user can assign types such as weekly report, memo, request form, technical commentary, or paper to the various documents to be registered and managed in the document DB server 3. The document type name 200 is the identifying name that the user has given to the document type.

The structure data 202 describes the document structure of structured documents of the document type. As is well known, a structured document in a language such as SGML or XML is defined as a tree of document elements (hereinafter simply "elements"), as shown in FIG. 6. The structured document illustrated in FIG. 6 has a tree structure in which element A has children B and F, and element B has children C, D, and E. The structure data 202 is data that represents the tree formed by the elements of structured documents of the document type.
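The example tree of FIG. 6 can be written down directly as a mapping from each element to its children. The Python below is just one possible encoding chosen for illustration; the names `STRUCTURE`, `preorder`, and `parent_of` are not from the patent, which leaves the concrete representation of the structure data open.

```python
# Child lists for the example tree of FIG. 6 (leaf elements have no entry).
STRUCTURE = {
    "A": ["B", "F"],
    "B": ["C", "D", "E"],
}


def preorder(structure, root):
    """List the elements in document order (parent before its children)."""
    order = [root]
    for child in structure.get(root, []):
        order.extend(preorder(structure, child))
    return order


def parent_of(structure, name):
    """Return the parent of an element, or None for the root."""
    for parent, children in structure.items():
        if name in children:
            return parent
    return None


document_order = preorder(STRUCTURE, "A")
```

Walking the tree in this order visits A, B, C, D, E, F, which is the order in which the corresponding tags would open in a serialized XML document.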

The element data 204 holds the individual information for each element of the tree and, as shown in FIG. 7, comprises an element name 210, a position/shape 212, an image area type 214, image features 216, and style attributes 222. The element name 210 is the identifying name of the element. The position/shape 212 indicates the position and shape of the region (image area) that the element occupies in the original image, and the image area type 214 indicates the type of that image area (character, photograph, and so on). The image features 216 are information on one or more image features of the image in the element's image area, and contain data for each predetermined feature item, such as a DCT frequency 218 (or spatial frequency), an on/off ratio 220 of the binarized image, and pixel value histogram information. The style attributes 222 are style information for the characters or pictures in the element's image area; the size, font, and spacing of the characters in the area are examples. The style attributes 222 can be used as auxiliary information when analyzing the interior of each image area of the original image (for example, using character size and spacing to segment individual characters during character recognition), or can be written out as attributes of the element when the structured document is created.

Various image features have been exemplified above, but not all of them need to be registered in the element data 204 of a single element. For example, some image features are suitable indices for distinguishing text image areas from one another, while others are suitable for distinguishing image (photograph) areas from one another (image areas of different types can be distinguished by the image area type information). The element data 204 may therefore register only the image features selected according to the image area type of the element.

Because documents of the same kind often differ slightly in their details, the image features of each image area very rarely match exactly between such documents. So that the structuring unit 40 recognizes them as structured documents of the same document type, the value of each item of the position/shape 212 and the image features 216 is preferably set not as a single "point" but as a "range" with an appropriate width.
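The idea of storing each item as a range rather than a point can be sketched as follows. The class names, item names (`x`, `y`, `on_off_ratio`), and numeric bounds are all assumptions chosen for illustration; the specification does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """Closed interval [low, high] accepted for one item."""
    low: float
    high: float

    def contains(self, value):
        return self.low <= value <= self.high


@dataclass
class ElementSpec:
    """Hypothetical stand-in for one entry of the element data 204."""
    name: str
    area_type: str
    ranges: dict  # item name -> Range


def area_matches(area_type, values, spec):
    """True if the area type agrees and every item lies in its range."""
    if area_type != spec.area_type:
        return False
    return all(
        item in values and rng.contains(values[item])
        for item, rng in spec.ranges.items()
    )


# Illustrative values only.
title = ElementSpec("title", "text", {
    "x": Range(80, 120),
    "y": Range(40, 70),
    "on_off_ratio": Range(0.05, 0.20),
})
```

An image area whose measured values all fall inside the registered ranges is treated as matching the element, even when no two documents of the type agree exactly.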

The registration data structure of the schema DB 42 described above is purely conceptual; the actual registered data may take any representation format (for example, a DTD (document type definition) or an XML Schema) that can express the structure described above.

Given the image area attributes 100 and the image area image 110 of each image area in the document image, the structuring unit 40 refers to the schema information of each document type in the schema DB 42 and searches for a document type that fits the document image. That is, it takes one document type and determines whether the position/shape 212, image area type 214, and image features 216 of each element in that type's schema information match the position/shape information 102, image area type 104, and image feature information 106 of each image area of the document image; if they match, the document image is determined to belong to that document type. Here, an image area is determined to match an element when, for example, its position/shape information 102 agrees with the element's position/shape 212 (or lies within a preset tolerance of it), its image area type 104 agrees with the element's image area type 214, and the value of each item of its image feature information 106 agrees with (or lies within the tolerance of) the corresponding item of the element's image features 216. When all image areas in the document image and all elements of the schema information correspond one-to-one with none left over, and each pair matches, the document image is determined to be of the document type of that schema information. This determination also identifies the element corresponding to each image area. Alternatively, an ideal value may be defined for each item of the position/shape 212 and image features 216 of each element of a document type. The values of the position/shape and image feature items of the image areas of the document image are then regarded as coordinates representing the document image, the distance between those coordinates and the coordinates given by the set of ideal values is computed as an index of similarity between the document image and the document type, and the document type with the best index (the smallest, in the case of a distance) is determined to be the type of the document image. In this case, however, if even the best similarity index is worse than a preset threshold (a similarity below it or, for a distance, a value above it), it is determined that no document type corresponds to the document image. If there is no corresponding document type, the document is processed as a new document type (details are described later).
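The distance-based alternative described above can be sketched as follows, assuming the item values are already normalized to comparable scales and that Euclidean distance is used (the specification only says "distance"). The function names, item names, and numbers are hypothetical.

```python
import math


def distance(observed, ideal):
    """Euclidean distance over the items defined by the ideal values."""
    return math.sqrt(sum((observed[k] - ideal[k]) ** 2 for k in ideal))


def classify(observed, ideals, max_distance):
    """Return the closest document type, or None if even the closest
    type is farther away than max_distance (treated as a new type)."""
    best_name, best_dist = None, math.inf
    for name, ideal in ideals.items():
        d = distance(observed, ideal)
        if d < best_dist:
            best_name, best_dist = name, d
    if best_dist > max_distance:
        return None
    return best_name


# Illustrative ideal values for two registered document types.
ideals = {
    "report": {"title_y": 0.1, "body_y": 0.5},
    "invoice": {"title_y": 0.3, "body_y": 0.8},
}
```

Returning `None` corresponds to the case in which no registered document type applies and the document is handled as a new type.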

Once the document type to which the document image belongs is identified, the structuring unit 40 converts the document image into a structured document using the structure data 202 and element data 204 of that document type. For each image area of the document image, if the area is a text image area, the structuring unit 40 passes the image of that area to the character recognition unit 44, which performs known character recognition processing, and obtains the text data of the sentences in the area. Passing the style attribute 222 to the character recognition unit 44 at this point can be expected to improve recognition accuracy. The description of the element corresponding to the image area is then formed by enclosing the text data in tags bearing the element name 210 of that element. For an image (photograph) area, the description is formed, for example, by creating a file of the image of that area and writing a reference to the image file in the tag bearing the element name 210 of the corresponding element. The structured document is generated by arranging these element descriptions according to the tree structure between elements indicated by the structure data 202 and, where necessary, adding descriptions of elements that do not correspond to any image area.
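A minimal sketch of this conversion, using Python's standard `xml.etree.ElementTree` module, is shown below. It assumes a flat structure (every element is a direct child of the root), that character recognition has already produced the text, and that an image reference is written as a `src` attribute; the attribute name, dictionary keys, and file name are assumptions, not part of the specification.

```python
import xml.etree.ElementTree as ET


def build_document(root_name, areas):
    """Build an XML tree with one child element per matched image area."""
    root = ET.Element(root_name)
    for area in areas:
        child = ET.SubElement(root, area["element"])
        if area["type"] == "text":
            # Text area: enclose the recognized text in the element's tags.
            child.text = area["text"]
        else:
            # Photograph area: record a reference to the saved image file.
            child.set("src", area["file"])
    return root


doc = build_document("report", [
    {"element": "title", "type": "text", "text": "Survey"},
    {"element": "figure", "type": "photo", "file": "figure1.png"},
])
xml_text = ET.tostring(doc, encoding="unicode")
```

A fuller version would nest the element descriptions according to the structure data 202 instead of placing them all under the root.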

Next, the processing procedure of the digital multifunction peripheral 1 is described with reference to FIG. 8. The procedure of FIG. 8 starts when, for example, the user selects structured document creation from the operation menu displayed on the operation panel 18 of the digital multifunction peripheral 1, sets the document on the platen or the ADF, and instructs execution.

In this procedure, the scan engine 22 first reads the document (S1) and passes the read document image to the image area separation circuit 28. The image area separation circuit 28 applies known image area separation processing to the document image, calculates the image features described above for each resulting image area, and passes the results (that is, the image area attributes 100) to the structuring unit 40. The structuring unit 40 collates the image area attributes 100 of each image area with the information of each document type registered in the schema DB 42 (S3) to search for the document type to which the document image belongs. If such a document type is found (the determination in S4 is affirmative (Y)), the structuring unit 40 converts the document image into structured document data as described above, according to the structure data 202 and element data 204 of the best-fitting document type (S5), and registers the resulting structured document data in the document DB server 3 (S6).

If no document type to which the document image belongs is found in the schema DB 42 (the determination in S4 is negative (N)), the structuring unit 40 creates a provisional structured document from the image area attributes 100 and image area image 110 of each image area (S7), provides it to the structured document editor 2a of the client machine 2 (S8), and receives the user's corrections and edits (S9).

The processing from step S7 onward is performed when the document read by the digital multifunction peripheral 1 is of a type that has never been input before. For example, suppose the document image 300 shown in FIG. 9 is of a new type. The image area separation circuit 28 produces an image area separation result 310 that divides the document image 300 into image areas R1 to R5, and steps S3 and S4 find that the document image 300 does not correspond to any existing document type. In step S7, the structuring unit 40 then assigns a provisional element name to each image area (here, for convenience, R1, R2, ..., R5 after the names of the image areas) and builds a provisional document structure such as that of FIG. 10, in which, for example, the elements R1 to R5 are all sibling children of the root element. As shown in FIG. 11, the structuring unit 40 also creates the position/shape, image area type, and image features of each image area R1 to R5 (already obtained by the image area separation circuit 28) as the attribute information of the corresponding elements R1 to R5. The structuring unit 40 then creates a provisional structured document from this document structure, the attribute information of each element, and the image of each image area, in the same way as when creating a structured document using schema information from the schema DB 42. In step S8, the structuring unit 40 provides the provisional structured document and its schema (data describing the document structure of FIG. 10 and the attribute information of each element of FIG. 11) to the structured document editor 2a of the client machine 2. This may be done, for example, by registering the address of the client machine 2 (such as the user's e-mail address) in the digital multifunction peripheral 1 in advance and sending the data to that address, or by saving the data in the confidential box of a user designated in advance, from which the user downloads the data by accessing the box from the client machine 2. If the user instructed structured document creation after being authenticated by the digital multifunction peripheral 1 (by password entry or the like), the structured document and schema information can also be sent to the e-mail address that the user registered in advance with the digital multifunction peripheral 1 (or with the user management server to which it delegates user authentication), or saved in that user's confidential box.
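The construction of the provisional schema in step S7 can be sketched as follows. The dictionary layout and the default root name are assumptions made for illustration; only the naming scheme R1, R2, ... and the flat structure under the root come from the description.

```python
def provisional_schema(areas, root_name="root"):
    """Assign provisional names R1, R2, ... to the separated image areas
    and place them all as direct children of a single root element."""
    elements = {}
    for index, area in enumerate(areas, start=1):
        # Copy the attributes already computed by image area separation.
        elements[f"R{index}"] = dict(area)
    return {
        "structure": {root_name: list(elements)},
        "elements": elements,
    }


# Illustrative separation result with two image areas.
schema = provisional_schema([
    {"type": "text", "x": 100, "y": 50},
    {"type": "photo", "x": 100, "y": 300},
])
```

The user later replaces the provisional names with real element names in the structured document editor 2a, while the copied attributes are kept as they are.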

On the client machine 2 that has received the provisional structured document and its schema information in this way, when the user starts the structured document editor 2a and designates the provisional structured document (and schema) data for editing, the editor displays an image 320 of the provisional structured document on the display of the client machine 2, as shown in FIG. 12. To avoid clutter, FIG. 12 shows only the outlines of the image areas corresponding to elements R1 to R5 as the image 320, but in practice the content of each image area (text, images, and so on) is displayed as in the document image 300 of FIG. 9. An image indicating the extent of each image area (for example, its outline) may be superimposed on that display. For the provisional structured document displayed in this way, the structured document editor 2a accepts input of a document type name and of an element name for each element (image area) from the user. Element names may be entered through a user interface in which, for example, selecting an element of the displayed structured document with a mouse click or the like opens a dialog box for entering the element name. Similarly, the user may be allowed to enter style attributes for the selected element. Style attributes such as character size and font may be chosen from options prepared in advance, for example in a pull-down menu. This editing yields a structured document 330 in which the document type name, the element names, and so on are finalized. In this example the user has only specified element names, so the tree structure is unchanged: the tree of provisional element names shown in FIG. 10 becomes, as shown in FIG. 13, the same tree with only the element names changed from provisional to actual ones. The element name of the root element is set, for example, to the entered document type name.

Registering the entered document type name, element names, style attributes, and other information in the schema of the structured document (for example, FIG. 11) completes the editing of that schema. The other schema information, namely the position/shape, image area type, image features, and so on, is kept as provided by the digital multifunction peripheral 1.

Besides simply entering element names as described above, editing at the structure level is also possible, such as inserting an element that groups existing elements. This can be realized, for example, by a user interface that accepts, on the screen displaying the provisional structured document, the designation of a range enclosing the elements to be grouped and then accepts an element name for that range. In this way, as illustrated in FIG. 14, an element "representative items" that groups the elements "title", "author name", and "summary" can be inserted as a child of the root element "report". Such edits to the tree structure are also reflected in (the structure information part of) the schema information.
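The grouping edit of FIG. 14 can be sketched on an XML tree with Python's standard `xml.etree.ElementTree` module. The helper name `wrap_children` and the English tag names are assumptions; the user interface that selects the range is not modeled.

```python
import xml.etree.ElementTree as ET


def wrap_children(parent, names, group_name):
    """Move the children of parent whose tags are in names under a new
    group element, inserted where the first of them used to be."""
    targets = [child for child in list(parent) if child.tag in names]
    if not targets:
        return None
    index = list(parent).index(targets[0])
    group = ET.Element(group_name)
    for child in targets:
        parent.remove(child)
        group.append(child)
    parent.insert(index, group)
    return group


# Flat tree after the element names have been entered.
report = ET.Element("report")
for tag in ["title", "author_name", "summary", "body"]:
    ET.SubElement(report, tag)

wrap_children(report, {"title", "author_name", "summary"}, "representative_items")
```

After the call, "report" has the children "representative_items" and "body", and the three grouped elements keep their original order inside the new element.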

The user sends the schema information of the structured document edited in this way to the digital multifunction peripheral 1 and registers it in the schema DB 42.

Once the schema information of a document type has been registered in the schema DB 42 of the digital multifunction peripheral 1 in this way, when a document of the same type is subsequently input, each image area of its document image matches the position/shape and image features of the elements indicated by that schema information, so the document is converted into a structured document using the schema information. For example, as shown in FIG. 15, when a report titled "Survey of ○○○○" is input to the digital multifunction peripheral 1 and conversion into a structured document is instructed, it is processed according to the schema information of "report": the text and images of each image area of the document "Survey of ○○○○" are associated with the elements of "report", and a structured document is created on that basis.

Thus, according to this embodiment, once schema information created by the user is registered in the schema DB 42 of the digital multifunction peripheral 1, structured document data is automatically generated, according to that schema information, from the image of any document of the same type subsequently input to the digital multifunction peripheral 1. Because this automatic processing determines which element each image area corresponds to by comparing the image features of each image area of the document image with the image features of each element of the structured document (included in the schema information), it can make the determination more accurately than a method that compares only the positions and shapes of image areas and elements. In addition, since the image features used here are obtained by hardware processing in the image area separation circuit 28, high-speed, real-time processing is achievable.

Also, in this embodiment, when a document of a new type is input, the digital multifunction peripheral 1 generates a provisional structured document and its schema information using the image area separation circuit 28 and the structuring unit 40 and provides them to the user, so the user can complete the schema information with relatively simple editing.

Next, a modification of the above embodiment is described with reference to FIG. 16. In FIG. 16, steps identical to those shown in FIG. 8 are given the same reference signs, and their description is omitted.

In the above embodiment, the schema information registered in the schema DB 42 holds the position/shape and image feature values of each image area of the document from which the schema information was derived, and documents of the same type are analyzed against these values. However, even among documents of the same type, the position/shape and image feature values of each element vary to some degree, so requiring too strict a match between the values of the image areas of a document and the values of the elements of the schema information makes document type determination fail. Conversely, making the tolerance for a match too wide increases misclassification. The appropriate tolerance is likely to differ by document type and by image area or element. This modification therefore describes processing that uses the reading results of many documents of the same type to set appropriate tolerances for the values of the position/shape 212 and image feature 216 items in the schema information.

In this processing, a user who knows the document type of the document being input for structured document creation enters that type into the digital multifunction peripheral 1. For example, when structured document creation is selected from the menu, a list of the names of the document types registered in the schema DB 42 may be shown on the display of the operation panel 18 so that the user can select the type of the document to be input. A user who knows the document type selects it and then instructs the start of processing; a user who does not simply instructs the start of processing without selecting one.

When the start of processing is instructed, the digital multifunction peripheral 1 reads the document (S1) and performs image area separation and image feature calculation (S2). If no document type has been selected, processing proceeds to step S3 and the subsequent steps of FIG. 8. If a document type has been selected, the structuring unit 40 retrieves the schema information of the selected type from the schema DB 42, collates the position/shape 212, image area type 214, and image features 216 of each element in that schema information with the position/shape information, image area type, and image feature information of each image area of the document image (S12), and determines whether the elements and image areas correspond one-to-one and match (S13). Because the purpose of this collation is to detect a mistaken document type selection by the user, it uses a more lenient criterion than the determination in steps S3 and S4 of FIG. 8 (for example, a wider tolerance than in S3 and S4). If they are determined not to match, the structuring unit 40 performs error handling, such as displaying on the operation panel 18 a message suggesting that the document type selection may be wrong (S14). If they are determined to match, the structuring unit 40 converts the document image into a structured document according to the schema information of the selected type (S15) and registers the result in the document DB server 3 (S16). The structuring unit 40 then reflects the values of the items of position/shape information 102 and image feature information 106 of each image area of the document image in the corresponding items of position/shape 212 and image features 216 of that schema information in the schema DB 42 (S17). For example, if the position/shape of an image area of the document image falls outside the tolerance of the position/shape of the corresponding element registered in the schema information, the tolerance is revised to include the value for that image area. If the position/shape and other items in the schema information are defined by statistical data such as a mean and variance, the statistical data is updated using the information of each image area of the document image just processed. When the schema items are expressed as statistical data, whether an image area corresponds to an element may be determined by computing, from the element's statistical data, the statistical likelihood that the position/shape and other values of the image area belong to that element, and deciding on the basis of that likelihood.
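The two ways of reflecting a newly processed document in the schema information (S17) can be sketched as follows. The function and class names are hypothetical, and the incremental mean/variance update shown is Welford's method, chosen here as one common way to maintain such statistics; the specification does not name a particular update rule.

```python
def widen_range(low, high, value):
    """Extend a tolerance range just enough to include a new value."""
    return min(low, value), max(high, value)


class RunningStats:
    """Mean and population variance updated one observation at a time
    (Welford's method, an assumed choice)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        return self._m2 / self.count if self.count else 0.0


# Illustrative use: the y position of one element over three documents.
stats = RunningStats()
for y in [2.0, 4.0, 6.0]:
    stats.update(y)
```

With the range form, a value outside the registered tolerance simply stretches it; with the statistical form, each processed document refines the mean and variance used to judge later documents.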

Preferred embodiments and modifications of the present invention have been described above, but they are only examples, and various modifications are possible within the scope of the invention. For example, in the examples above the schema information of the provisional structured document created by the digital multifunction peripheral 1 is corrected with the structured document editor 2a of the client machine 2, but the digital multifunction peripheral 1 itself may instead be given the schema information editing function described above.

Also, in the examples above, the digital multifunction peripheral 1 performs image area separation on the document image it has read and converts it into a structured document, but the functions for image area separation and conversion into a structured document may be executed by a device other than the one that read the document image. The same applies to the modifications below: they are described taking as an example the case where they are realized in the digital multifunction peripheral 1, but their functions may be executed by a device other than the one that read the document image.

In that case, the other device may be realized, for example, by executing on a general-purpose computer a program describing the functions or processing of the units described above. As hardware, the computer has, for example, a circuit configuration in which a CPU (central processing unit), memory (primary storage), various I/O (input/output) interfaces, and the like are connected via a bus. An HDD (hard disk drive) and disk drives for reading portable non-volatile recording media of various standards, such as CDs, DVDs, and flash memory, are connected to the bus, for example via an I/O interface. Such drives function as external storage devices for the memory. A program describing the processing of the embodiment is stored in a fixed storage device such as the HDD, via a recording medium such as a CD or DVD or via a network, and installed on the computer. The processing of the embodiment is realized when the program stored in the fixed storage device is read into the memory and executed by the CPU.

The processing procedure of another modification is described with reference to FIG. 17. In the procedure of FIG. 17, the image area separation circuit 28 of the digital multifunction peripheral 1 separates the document image to be structured into image areas, and the structuring unit 40 uses the separation result to create a provisional structured document and schema information (for example, with a flat structure like that illustrated in FIG. 10) (S21). The separation accuracy of this image area separation result (called the image area separation accuracy) is also calculated (S22). S22 is shown after step S21 for convenience, but the image area separation accuracy may be calculated in parallel with the image area separation processing. The image area separation accuracy can be calculated for each individual image area obtained by the image area separation processing.

That is, in image area separation, image features are obtained for image units such as pixels or blocks of multiple pixels, and for each image unit it is determined, from image features such as its frequency characteristics, which image area type the unit belongs to, such as text, continuous-tone (photographic) image, or error-diffusion image. For this determination, a range of image feature values (or of a set of multiple image feature values; hereinafter collectively called the "image feature value" for simplicity) is defined, for example, for each image area type, and if the image feature value of an image unit falls within the range corresponding to an image area type, the image unit is determined to be of that type. With automatic determination based on range membership, however, an image unit is assigned an image area type whenever its image feature value falls within the corresponding range, regardless of whether the value lies near the boundary with the range of another image area type or lies at a position that cannot be confused with another type's range and where the assigned type is highly likely. In reality, a unit near the boundary is less likely than the other to actually be of the determined image area type. Thus, even when image units are determined to be of the same image area type, the determination is more certain for the latter than for the former. This certainty is the image area separation accuracy mentioned above. For example, an evaluation formula or evaluation rule can be defined that uses the image feature value of an image unit and the range of image feature values corresponding to an image area type to obtain the certainty that the unit belongs to that type, that is, the image area separation accuracy. In a simple example, a rule can be defined such that if the image feature value is in the range from a to b, the image unit belongs to image area type A with an image area separation accuracy of 0.1, and if it is in the range from b to c, it belongs to image area type A with an image area separation accuracy of 0.9.
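The simple range rule in the example above can be sketched in Python as follows. This is only an illustrative sketch: the function name `classify_unit`, the rule table `RULES`, and the numeric bounds standing in for a, b, and c are assumptions chosen for the example, not values given in the specification.

```python
from typing import Optional, Tuple

# Hypothetical rule table for a single scalar image feature value.
# Each entry: (lower bound, upper bound, image area type, separation accuracy).
# The bounds 0.0, 0.2 and 1.0 stand in for a, b and c and are placeholders only.
RULES = [
    (0.0, 0.2, "A", 0.1),  # a <= value < b: type A, low certainty
    (0.2, 1.0, "A", 0.9),  # b <= value < c: type A, high certainty
]


def classify_unit(feature_value: float) -> Optional[Tuple[str, float]]:
    """Return (image area type, separation accuracy) for one image unit.

    Returns None when the value falls outside every registered range.
    """
    for low, high, area_type, accuracy in RULES:
        if low <= feature_value < high:
            return area_type, accuracy
    return None
```

Both ranges map to the same image area type here; only the accuracy attached to the determination differs, which is the point of the rule.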

The above is only one example. Whatever method is used to determine image area types, the certainty that an image unit belongs to each image area type can be quantified according to how close or far the unit's image feature value is from the typical image feature value of each image area type. The function for obtaining the image area separation accuracy can be provided in the image area separation circuit 28.
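One way to quantify certainty from closeness to a typical value is sketched below. The specification does not give a formula, so the closeness measure `1 / (1 + distance / scale)`, the normalization to a sum of 1, and the names `type_certainties`, `typical_values`, and `scale` are all assumptions made for illustration.

```python
from typing import Dict


def type_certainties(feature_value: float,
                     typical_values: Dict[str, float],
                     scale: float = 1.0) -> Dict[str, float]:
    """Map each image area type to a certainty in (0, 1] that sums to 1.

    The closer the feature value is to a type's typical value, the larger
    that type's share. The closeness formula is an assumed example.
    """
    closeness = {
        area_type: 1.0 / (1.0 + abs(feature_value - typical) / scale)
        for area_type, typical in typical_values.items()
    }
    total = sum(closeness.values())
    return {area_type: c / total for area_type, c in closeness.items()}
```

A value that coincides with one type's typical value gives that type the largest certainty, while a value midway between two typical values gives both the same certainty, reflecting the ambiguity near a boundary.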

By combining the image area separation accuracies of the individual image units obtained with such evaluation formulas or evaluation rules, the image area separation accuracy of the entire separation result for the document image can be obtained. Various operations are conceivable for combining the per-unit accuracies. One approach takes the lowest of the per-unit accuracies as the accuracy of the entire separation result. Alternatively, the overall accuracy may be computed from the per-unit accuracies using a predetermined formula (for example, a formula for the mean or for the mean square). Which formula is appropriate depends on the kinds of image features used and on the details of the image area separation process, but in any case the image area separation accuracy of the entire separation result can be defined from the per-unit accuracies.
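The combining operations named above (minimum, mean, and mean square) can be sketched as follows. The function name `overall_accuracy` and the `method` keyword are assumptions for the example; the text leaves the choice of formula open.

```python
from typing import Sequence


def overall_accuracy(unit_accuracies: Sequence[float], method: str = "min") -> float:
    """Combine per-unit separation accuracies into one overall value."""
    if not unit_accuracies:
        raise ValueError("at least one image unit is required")
    if method == "min":
        # The weakest unit decides the accuracy of the whole result.
        return min(unit_accuracies)
    if method == "mean":
        return sum(unit_accuracies) / len(unit_accuracies)
    if method == "mean_square":
        return sum(a * a for a in unit_accuracies) / len(unit_accuracies)
    raise ValueError(f"unknown method: {method}")
```

The minimum is the most conservative choice, since a single uncertain unit lowers the overall accuracy, whereas the averaging variants tolerate a few uncertain units.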

Once the image area separation accuracy of the entire separation result has been obtained in this way, it is determined whether that accuracy is equal to or greater than a predetermined threshold (S23). If it is, the image area separation accuracy is sufficiently high, and the digital multi-function peripheral 1 registers the provisional structured document and schema information obtained in step S21 in the document DB server 3 and the schema DB 42 as the formal structured document and schema information (S24).

On the other hand, if the image area separation accuracy is below the threshold, the provisional structured document and schema information are provided to the client (S25), the user's edits to the structured document and schema information are received (S26), and the structured document and schema information reflecting those edits are registered in the document DB server 3 and the schema DB 42 (S27). The editing in step S26 is broadly the same as the editing in step S9 of the procedure of FIG. 8. In step S26, however, the image area separation accuracy is low, so there is a high possibility that the image areas were separated incorrectly. The editing in step S26 may therefore include the user changing the position, shape, size, and so on of the image areas themselves shown in the provisional schema information. Such changes to image areas are reflected in the schema information and in the structured document that conforms to it. In this way, when the image area separation accuracy is below the threshold, the separation result itself is unreliable, and so user intervention is requested.
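The branch at step S23 can be sketched as below. This is a minimal illustration under stated assumptions: `register_or_review` and the callback `edit_by_user` are hypothetical names, the provisional result is represented as a plain dictionary, and actual registration in the document DB server 3 and the schema DB 42 is outside the sketch.

```python
from typing import Callable, Dict, Tuple


def register_or_review(overall_accuracy: float,
                       threshold: float,
                       provisional: Dict,
                       edit_by_user: Callable[[Dict], Dict]) -> Tuple[Dict, str]:
    """Decide whether the provisional result is registered as is (S24)
    or first handed to the user for editing (S25 to S27)."""
    if overall_accuracy >= threshold:        # S23: accuracy is high enough
        return provisional, "automatic"      # S24: register without user action
    edited = edit_by_user(provisional)       # S25, S26: user corrects the result
    return edited, "user-edited"             # S27: register the edited result
```

The comparison uses "equal to or greater than", matching the wording of step S23, so an accuracy exactly at the threshold is registered automatically.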

The threshold used in step S23 may be set in advance to an appropriate value determined by experiment or the like and registered in the digital multi-function peripheral.

Thus, in the procedure of FIG. 17, when the image area separation accuracy is equal to or greater than the threshold, the structured document and schema information are registered automatically, without troubling the user. A structured document and schema information registered automatically in this way have drawbacks: their logical structure is limited (for example, to a flat structure), and meanings cannot be assigned to elements as illustrated in FIG. 12. Some degree of meaning could be assigned through language analysis or other advanced analysis, but highly accurate assignment cannot be expected from such automatic processing. Even so, if saving the user's labor matters more, the procedure of FIG. 17 is worthwhile.

Next, a variation of the procedure of FIG. 17 is described with reference to FIG. 18. For the processing of FIG. 18, it is assumed that reference values are registered in association with the schema information of each document type in the schema DB 42. These reference values consist of the image area separation result expected if an image of a document of that type were subjected to image area separation (in other words, the image area type determined for each image unit) and the image area separation accuracy of each of those image units. Such reference values can be obtained, for example, by actually performing image area separation on an image of a document of that type. The image area separation result and accuracy obtained when a document of a new document type is read into the system may also be used as the reference values for that type. Alternatively, the average image area separation result and accuracy of multiple documents of the type may be used as the reference values.

In the procedure of FIG. 18, the digital multi-function peripheral 1 performs image area separation on the document image (S31) and obtains the separation result and the image area separation accuracy of each image unit (S32). It then searches for document types that match the document image by comparing the image area type determined for each image unit and the accuracy of each image unit against the reference values of the separation result and accuracy associated with the schema information of each document type in the schema DB 42 (S33). In step S33, for example, the distance between the document image's separation result and accuracy and the reference values of a document type is computed. If the distance is at or below a predetermined threshold, the document type is judged to match the document image (that is, the document image can be said to belong to that document type with at least a predetermined degree of certainty) and is extracted. The document image is then converted into a structured document according to the schema information of the best-matching document type among those found, for example the one with the smallest distance (S35). The conversion in step S35 may be the same as step S5 of the procedure of FIG. 8. The resulting structured document is registered in the document DB server 3.
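The search of step S33 and the selection of the closest document type can be sketched as follows. The specification does not define the distance, so the metric here (a penalty of 1 for a differing image area type, otherwise the absolute accuracy difference, averaged over image units) is an assumption, as are the names `separation_distance` and `find_document_type` and the representation of a separation result as a list of (type, accuracy) pairs.

```python
from typing import Dict, List, Optional, Tuple

Unit = Tuple[str, float]  # (image area type, separation accuracy) of one image unit


def separation_distance(units: List[Unit], reference: List[Unit]) -> float:
    """Assumed distance between a separation result and reference values."""
    if not units or len(units) != len(reference):
        raise ValueError("results must be non-empty and of equal length")
    total = 0.0
    for (area_type, acc), (ref_type, ref_acc) in zip(units, reference):
        # A differing type counts as the maximum penalty of 1.
        total += 1.0 if area_type != ref_type else abs(acc - ref_acc)
    return total / len(units)


def find_document_type(units: List[Unit],
                       references: Dict[str, List[Unit]],
                       max_distance: float) -> Optional[str]:
    """Return the closest document type within max_distance, or None."""
    distances = {name: separation_distance(units, ref)
                 for name, ref in references.items()}
    matching = {name: d for name, d in distances.items() if d <= max_distance}
    if not matching:
        return None  # no matching type: fall back to another procedure
    return min(matching, key=matching.get)
```

Returning `None` corresponds to the case where no document type matches, which the text handles by falling back to the procedure of FIG. 17 or FIG. 8.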

If it is determined in step S34 that no document type matches the document image, processing may proceed, for example, to step S22 of the procedure of FIG. 17. In that case, the digital multi-function peripheral 1 obtains the image area separation accuracy of the entire document image. If it is equal to or greater than the predetermined threshold, the document image is judged to be of a new document type (that is, one not previously registered in the schema DB 42), and the provisional structured document and schema information generated from the separation result are registered in the document DB server 3 and the schema DB 42 (S24). If the accuracy for the entire document image is below the threshold, the provisional structured document and schema information are provided to the user for confirmation or editing, and the resulting structured document and schema information are registered in the document DB server 3 and the schema DB 42 (S25 to S27).

When step S34 finds no document type that matches the document image, the processing from step S3 onward of the procedure of FIG. 8 may be performed instead of the processing described above.

Next, a further modification is described with reference to FIG. 19. For the processing of FIG. 19, a representative image of a document of each document type is registered in the system in association with the schema information of that document type in the schema DB 42. The representative image may be, for example, one specific document image of that document type, or an average of the images of multiple documents of that type. The representative image of each document type may be stored anywhere that the digital multi-function peripheral 1 can access. For example, it may be stored in the schema DB 42 or in the document DB server 3.

In the procedure of FIG. 19, when the digital multi-function peripheral 1 acquires a document image to be structured, it searches the representative images of the document types registered in the system for those whose similarity to the document image is equal to or greater than a predetermined value (S41). The similarity between a representative image and the document image can be computed by any existing method for measuring the similarity between images. For example, the mean over all pixels of the squared differences between corresponding pixel values of the two images may be taken as the distance between them, and the similarity computed with an evaluation formula that yields a higher similarity for a smaller distance. Such similarity can be computed either by a hardware circuit or by software, and the system need only be equipped with hardware or software for that purpose.
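The distance and similarity described for step S41 can be sketched as below. Images are represented as equally sized lists of rows of pixel values to keep the sketch dependency-free. The mapping from distance to similarity, `1 / (1 + distance)`, is an assumed example of a formula that grows as the distance shrinks; the text does not fix a particular one.

```python
from typing import List

Image = List[List[float]]  # rows of pixel values


def mean_squared_difference(image_a: Image, image_b: Image) -> float:
    """Mean over all pixels of the squared difference of corresponding pixels."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of rows")
    total = 0.0
    count = 0
    for row_a, row_b in zip(image_a, image_b):
        if len(row_a) != len(row_b):
            raise ValueError("rows must have the same length")
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


def similarity(image_a: Image, image_b: Image) -> float:
    """Assumed similarity in (0, 1]: 1 for identical images, lower as distance grows."""
    return 1.0 / (1.0 + mean_squared_difference(image_a, image_b))
```

A practical system would first bring both images to the same size and alignment; the sketch simply rejects images whose dimensions differ.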

It is then determined whether any document type has a similarity equal to or greater than the predetermined value (S42). If such document types are found, the document image is converted into a structured document according to the schema information of the one with the highest similarity (S43). This conversion may be the same as step S5 of the procedure of FIG. 8. The resulting structured document is registered in the document DB server 3 (S44).

Alternatively, when step S42 finds multiple document types with a similarity equal to or greater than the predetermined value, the processing from S2 onward of FIG. 8 may be performed. In that case, the document type that matches the document image is searched for only among the document types with a similarity equal to or greater than the predetermined value, rather than among all document types registered in the schema DB 42.

If step S42 determines that no document type has a similarity equal to or greater than the predetermined value, the processing of FIG. 17 may be performed, for example. Alternatively, the processing of FIG. 8 or the processing of FIG. 18 may be performed.

Thus, in the procedure of FIG. 19, before image area separation and the structuring based on it are performed, the document types to which the document image is likely to belong are narrowed down by image-level similarity, and the structuring processing based on image area separation is applied only to the narrowed-down candidates. This shortens the time required for the structuring processing based on image area separation.
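The narrowing-down idea can be sketched as a two-stage search. All names here are hypothetical: `shortlist_then_match` receives precomputed image-level similarities per document type and a callback `detailed_match` that stands in for the more expensive matching based on image area separation.

```python
from typing import Callable, Dict, List, Optional


def shortlist_then_match(similarities: Dict[str, float],
                         min_similarity: float,
                         detailed_match: Callable[[List[str]], Optional[str]]
                         ) -> Optional[str]:
    """Run the expensive matching only on document types that pass the
    cheap image-level similarity filter."""
    shortlist = sorted(
        (name for name, s in similarities.items() if s >= min_similarity),
        key=lambda name: similarities[name],
        reverse=True,  # most similar candidates first
    )
    if not shortlist:
        return None  # nothing similar enough: use a fallback procedure
    return detailed_match(shortlist)
```

The saving comes from `detailed_match` seeing only the shortlist instead of every registered document type.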

In the examples described with reference to FIGS. 17 to 19, thresholds are used for the various determinations. The digital multi-function peripheral 1 may be provided with a user interface for setting and changing these thresholds. It may also be provided with a user interface for setting whether each of the processes described with reference to FIGS. 17 to 19 is executed.

FIG. 1 is a diagram showing an example system configuration of the embodiment.
FIG. 2 is a diagram showing the hardware configuration of the control mechanism of the digital multi-function peripheral of the embodiment.
FIG. 3 is a functional block diagram showing details of the function of converting a document image into a structured document in the digital multi-function peripheral of the embodiment.
FIG. 4 is a diagram showing an example of a histogram of pixel values.
FIG. 5 is a diagram showing the schematic data content of the schema information registered in the schema DB.
FIG. 6 is a diagram showing an example of the tree structure represented by structure data.
FIG. 7 is a diagram showing an example of the structure of element data.
FIG. 8 is a flowchart showing an example of the procedure of the structured document creation process executed by the digital multi-function peripheral.
FIG. 9 is a diagram for explaining the provisional structuring process when no document type corresponds to the document image.
FIG. 10 is a diagram showing an example of the tree structure of a provisional structuring result.
FIG. 11 is a diagram showing an example of the element data of a provisional structuring result.
FIG. 12 is a diagram for explaining editing operations in the structured document editor.
FIG. 13 is a diagram showing an example of the tree structure of an edited structured document.
FIG. 14 is a diagram showing another example of the tree structure of an edited structured document.
FIG. 15 is a diagram for explaining the processing of the structuring unit when a document of a document type whose schema information is registered is read.
FIG. 16 is a flowchart showing the processing procedure of a modification.
FIG. 17 is a flowchart showing the processing procedure of yet another example.
FIG. 18 is a flowchart showing the processing procedure of yet another example.
FIG. 19 is a flowchart showing the processing procedure of yet another example.

Explanation of symbols

1 digital multi-function peripheral, 2 client machine, 2a structured document editor, 3 document DB server, 4 network, 28 image area separation circuit, 40 structuring unit, 42 schema DB, 44 character recognition unit, 100 image area attributes, 102 position/shape information, 104 image area type, 106 image feature information, 110 image area image.

Claims

1. An image forming apparatus comprising:
a reading unit that reads a document;
an image area separation unit that performs image area separation on the image of the document read by the reading unit;
an image processing unit that performs, on each image area separated by the image area separation unit, image processing according to the image type of that image area; and
an output unit that outputs the image of the document processed by the image processing unit,
the image forming apparatus further comprising:
a feature calculation unit that obtains an image feature for each image area separated by the image area separation unit;
a structure information registration unit in which the position and image feature of each element of a structured document are registered as structure information of the structured document; and
a structuring unit that associates each image area with an element of the structured document based on the position of each image area obtained by the image area separation unit, the image feature of each image area obtained by the feature calculation unit, and the information registered in the structure information registration unit, and that generates a structured document corresponding to the document image based on the result of the association.

2. The image forming apparatus according to claim 1, wherein
the image area separation unit obtains image features of each part of the image and performs image area separation on the image of the document using the obtained image features, and
the feature calculation unit, when obtaining the image feature for each image area, uses the results of the computation that the image area separation unit performed to obtain the image features for image area separation.

3. The image forming apparatus according to claim 2, wherein the image forming apparatus is a digital multi-function peripheral, and the image area separation unit and the feature calculation unit are configured as an internal processing mechanism of the digital multi-function peripheral.

4. The image forming apparatus according to claim 1, wherein the image feature of each image area obtained by the feature calculation unit is a distribution of spatial frequency components of the image in the image area.

5. The image forming apparatus according to claim 1, wherein the image feature of each image area obtained by the feature calculation unit is a ratio of black and white pixels in a binarization result of an image of the image area.

6. The image forming apparatus according to claim 1, wherein the image feature of each image area obtained by the feature calculation unit is an appearance frequency distribution of each pixel value in the image area.

7. The image forming apparatus according to claim 1, wherein the image feature of each image area obtained by the feature calculation unit is an appearance frequency distribution of each run-length value in the binarization result of the image of the image area.

8. The image forming apparatus according to claim 1, wherein an image feature corresponding to the image area type of each element of the structured document is registered in the structure information registration unit.

9. The image forming apparatus according to claim 1, wherein the structuring unit comprises:
a layout presentation unit that presents to a user the layout of the image areas obtained by the processing of the image area separation unit; and
a registration processing unit that receives from the user a designation of the element of a new structured document corresponding to each image area in the layout, associates the position and image feature of each image area with the element designated by the user for that image area, and registers them in the structure information registration unit as structure information of the new structured document.

10. The image forming apparatus according to claim 9, wherein
the layout presentation unit provides the layout to a predetermined user terminal, and
the registration processing unit registers, in the structure information registration unit as information of a new structured document, the position and image feature information of each element of the structured document input from the user terminal for the provided layout.

11. The image forming apparatus according to claim 9, wherein, when the position and image feature of each image area separated by the image area separation unit do not match the position and image feature of each element of the structured document registered in the structure information registration unit, the structuring unit causes the layout presentation unit and the registration processing unit to perform processing for registering a new structured document.

12. An image processing apparatus comprising:
an image area separation unit that performs image area separation on a document image obtained by reading a document and outputs each separated image area in association with the position and image feature of that image area; and
a structuring unit that associates each image area with an element of a structured document based on the position and image feature associated with each image area separated by the image area separation unit, and generates a structured document corresponding to the document image based on the result of the association.

13. The image forming apparatus according to claim 1, further comprising:
an evaluation unit that evaluates the accuracy of the image area separation by the image area separation unit; and
a registration unit that, when the accuracy evaluated by the evaluation unit is higher than a predetermined threshold, creates structure information that has each image area resulting from the image area separation by the image area separation unit and registers it in the structure information registration unit.

14. The image processing apparatus according to claim 12, further comprising:
an evaluation unit that evaluates the accuracy of the image area separation by the image area separation unit; and
a registration processing unit that, when the accuracy evaluated by the evaluation unit is higher than a predetermined threshold, creates structure information of a structured document that has each image area resulting from the image area separation by the image area separation unit and registers it in a structure information registration unit,
wherein the structuring unit refers to the structure information registered in the structure information registration unit and generates a structured document corresponding to the document image.

15. The image forming apparatus according to claim 1, further comprising
an evaluation unit that evaluates the accuracy of the image area separation by the image area separation unit, wherein
structure information that includes, as at least one of the image features, the image area separation accuracy obtained by the evaluation unit is registered in the structure information registration unit, and
the structuring unit searches the structure information registration unit using as a key the image area separation accuracy obtained by the evaluation unit for the document image, thereby obtains one or more pieces of structure information corresponding to the document image, and generates a structured document corresponding to the document image based on the obtained structure information.

16. The image processing apparatus according to claim 12, further comprising
an evaluation unit that evaluates the accuracy of the image area separation by the image area separation unit, wherein
structure information that includes, as at least one of the image features, the image area separation accuracy obtained by the evaluation unit is registered in the structure information registration unit, and
the structuring unit searches the structure information registration unit using as a key the image area separation accuracy obtained by the evaluation unit for the document image, thereby obtains one or more pieces of structure information corresponding to the document image, and generates a structured document corresponding to the document image based on the one or more pieces of structure information.

17. The image forming apparatus according to claim 1, further comprising:
a document image storage unit that stores, in association with each piece of structure information registered in the structure information registration unit, an image of a document corresponding to that structure information; and
a search unit that searches the document image storage unit for an image whose similarity to the document image read by the reading unit is equal to or higher than a predetermined threshold,
wherein the structuring unit generates a structured document corresponding to the document image based on the structure information corresponding to the image found by the search unit.

18. The image processing apparatus according to claim 12, further comprising:
a document image storage unit that stores a representative image representing each document type; and
a search unit that searches the document image storage unit for a representative image whose similarity to the document image is equal to or higher than a predetermined threshold,
wherein the structuring unit generates the structured document by associating each image area of the document image output from the image area separation unit with each element of the structured document of the document type corresponding to the representative image found by the search unit.