JP2006301695A

JP2006301695A - Document processing device and program

Info

Publication number: JP2006301695A
Application number: JP2005118135A
Authority: JP
Inventors: Kei Tanaka; 圭田中; Toshiya Koyama; 俊哉小山; Shoichi Tateno; 昌一舘野; Masayoshi Sakakibara; 正義榊原; Teruka Saito; 照花斎藤; Kotaro Nakamura; 浩太郎中村; Takashi Nagao; 隆長尾; Shinu Ho; 新宇彭
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2005-04-15
Filing date: 2005-04-15
Publication date: 2006-11-02

Abstract

PROBLEM TO BE SOLVED: To gather descriptions in ranges inferred to be mutually relevant, based on postscript marks handwritten on a document, and to present their contents intelligibly to a user.

SOLUTION: A document processing device comprises an extracting means for analyzing document image data obtained by computerizing a paper document and extracting the postscript marks handwritten on the original document expressed by the document image data; an identifying means for identifying the image area corresponding to each extracted postscript mark; a categorizing means for, when a plurality of postscript marks are extracted, analyzing the shape of each postscript mark and categorizing the postscript marks by same or similar shape; and an output means for outputting, for each group of postscript marks categorized as having the same or similar shape, the images located in the identified image areas corresponding to those marks, arranged along a predetermined layout.

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates to a technique for extracting, based on additional marks such as characters, symbols, and figures added by hand to a document, the content written in the regions to which those additional marks are attached.

Readers sometimes add various comments to a paper document. Because items added to a paper document, such as these comments, may contain information that is important to the reader, there has been a need to extract only such added items from the paper document. Various techniques for doing so have been proposed, one example being the technique disclosed in Patent Document 1. Patent Document 1 discloses a technique that classifies the attributes of added items based on, for example, the context in which each item was added, and outputs the added items as a list for each attribute.
Patent Document 1: Japanese Patent Laid-Open No. 11-219245

In addition to comments of the kind described above, a reader of a paper document may add a specific character, symbol, figure, or the like to a chapter or paragraph that the reader judges to contain important information, in order to indicate that fact. Hereinafter, characters, symbols, figures, and the like added by hand to a paper document for this purpose are referred to as “additional marks”. Because chapters and paragraphs bearing such additional marks contain information that is important to the reader, it would be convenient to extract and provide only the content of the regions to which additional marks are attached. Moreover, when writing additional marks on a paper document, a reader generally gives the same additional mark to passages that are related to one another. It would therefore be convenient to collect the content of regions presumed to be related to one another, such as regions given the same additional mark, and present it to the user. The technique disclosed in Patent Document 1, however, merely lists and outputs added items according to their attributes, and so cannot achieve this.

An object of the present invention is to provide a technique that makes it possible to collect, based on additional marks such as characters, symbols, and figures added by hand to a document, the content of regions presumed to be related to one another, and to present that content to the user in an easily understandable manner.

To solve the above problem, the present invention provides a document processing apparatus comprising: input means to which document image data obtained by digitizing a paper document is input; first specifying means for analyzing the document image data input to the input means and specifying additional marks added by hand to the original of the document represented by the document image data; second specifying means for specifying, for each additional mark specified by the first specifying means, the image region of the document image data with which that additional mark is associated; classification means for, when a plurality of additional marks are specified by the first specifying means, analyzing the shape of each additional mark and classifying the additional marks into groups having the same or similar shape; and output means for arranging, according to a predetermined layout, and outputting, for each of the additional marks classified by the classification means as having the same or similar shape, the image located in the image region specified by the second specifying means as being associated with that additional mark.

With such a document processing apparatus, the images located in the image regions associated with additional marks of the same or similar shape are arranged in a predetermined layout and output. If, for example, a layout that arranges a plurality of images in list form is adopted as the predetermined layout, the images located in the image regions associated with the same or similar additional marks, within the image represented by the document image data, are output arranged in list form. As stated above, images located in image regions associated with the same or similar additional marks are presumed to represent mutually related matters. The document processing apparatus according to the present invention therefore makes it possible to collect, from among the matters written in the paper document, those presumed to be related to one another, and to present only those matters to the user in an easily understandable manner, for example by outputting them in list form.

In a more preferred aspect, the first specifying means specifies the additional marks by comparing the document image data input to the input means with original image data representing the original of the paper document represented by that document image data. In this aspect, the additional marks added to the original are reliably specified.

In another preferred aspect, the first specifying means specifies the additional marks based on the difference between the colors of the image represented by the document image data input to the input means and the colors of the original of the paper document represented by that document image data. In this aspect, an additional mark is reliably specified when its color differs from the colors used in the original.
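The color-based aspect can be illustrated with a minimal Python sketch. It is not taken from the patent: the pixel representation, the distance metric, the `tolerance` value, and the names `color_distance` and `find_mark_pixels` are all assumptions chosen for illustration. The idea is simply that a scanned pixel whose color is far from every color used in the original is treated as part of an additional mark.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), 0-255 per channel
Image = List[List[Pixel]]      # rows of pixels

def color_distance(a: Pixel, b: Pixel) -> int:
    """Sum of absolute per-channel differences (an assumed, simple metric)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def find_mark_pixels(scan: Image, original_colors: List[Pixel],
                     tolerance: int = 60) -> Set[Tuple[int, int]]:
    """Return (row, col) of pixels whose color is far from every original color."""
    marked = set()
    for r, row in enumerate(scan):
        for c, px in enumerate(row):
            if all(color_distance(px, oc) > tolerance for oc in original_colors):
                marked.add((r, c))
    return marked

# Hypothetical 3x3 scan: black text on white paper plus a red handwritten stroke.
WHITE, BLACK, RED = (255, 255, 255), (0, 0, 0), (220, 30, 30)
scan = [
    [WHITE, BLACK, WHITE],
    [RED,   BLACK, WHITE],
    [RED,   WHITE, WHITE],
]
mark_pixels = find_mark_pixels(scan, original_colors=[WHITE, BLACK])
```

In this toy input only the two red pixels in the first column are reported. A real scan would likely need a tolerance tuned to scanner noise, which the patent text does not specify.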

In another preferred aspect, the classification means analyzes the shape of each additional mark to calculate a feature amount, classifies additional marks whose calculated feature amounts are identical as additional marks having the same shape, and classifies additional marks whose calculated feature amounts differ by no more than a predetermined range as additional marks similar to one another. In this aspect, additional marks having the same or similar shape are reliably classified regardless of the color of each mark or of its arrangement, such as whether it is rotated.

To solve the above problem, the present invention also provides a program that causes a computer device to execute: a first step of analyzing document image data obtained by digitizing a paper document and specifying additional marks added by hand to the original of the document represented by the document image data; a second step of specifying, for each additional mark specified in the first step, the image region of the document image data with which that additional mark is associated; a third step of, when a plurality of additional marks are specified in the first step, analyzing the shape of each additional mark and classifying the additional marks into groups having the same or similar shape; and a fourth step of arranging, according to a predetermined layout, and outputting, for each of the additional marks classified in the third step as having the same or similar shape, the image located in the image region specified in the second step as being associated with that additional mark.

By installing such a program on a general-purpose computer device and operating the computer device according to the program, the computer device is given the same functions as the document processing apparatus according to the present invention. In another aspect of the present invention, the program may of course be provided written on a computer-readable recording medium.

The present invention makes it possible to collect, based on additional marks handwritten on a document, the content of regions presumed to be related to one another and to present it to the user in an easily understandable manner.

The best mode for carrying out the present invention is described below with reference to the drawings.
(A: Configuration)
FIG. 1 is a block diagram showing a configuration example of a document processing system 10 that includes a document processing apparatus 110 according to one embodiment of the present invention. The image reading device 120 in FIG. 1 is a scanner device equipped with an automatic paper feeding mechanism such as an ADF (Auto Document Feeder). It optically reads a paper document set in the ADF one page at a time and delivers image data representing the read image (hereinafter, “document image data”) to the document processing apparatus 110 via a communication line 130 such as a LAN (Local Area Network). Although this embodiment describes the case where the communication line 130 is a LAN, it may of course be a WAN (Wide Area Network), the Internet, or the like. Likewise, although this embodiment describes the document processing apparatus 110 and the image reading device 120 as separate pieces of hardware, the two may of course be built as a single piece of hardware. In that case, the communication line 130 is an internal bus that connects the document processing apparatus 110 and the image reading device 120 within that hardware.

In the document processing system 10 shown in FIG. 1, a paper document on which a reader has handwritten additional marks is set in the automatic paper feeding mechanism of the image reading device 120, and document image data representing the image of that paper document is sent to the document processing apparatus 110. The document processing apparatus 110, for its part, stores in advance image data (hereinafter, “original image data”) representing the image of the original of the paper document (that is, the paper document without any handwritten additions). The document processing apparatus 110 compares the document image data received from the image reading device 120 with the original image data stored in advance, specifies the additional marks added by the reader, collects the content of the chapters and paragraphs bearing the same or similar additional marks, and outputs it according to a predetermined layout. The following description focuses on the configuration and operation of the document processing apparatus 110.

FIG. 2 shows an example of the hardware configuration of the document processing apparatus 110.
As shown in FIG. 2, the document processing apparatus 110 includes a control unit 200, a communication interface (hereinafter, “IF”) unit 210, a display unit 220, an operation unit 230, a storage unit 240, and a bus 250 that mediates data exchange among these components.

The control unit 200 is, for example, a CPU (Central Processing Unit), and centrally controls each unit of the document processing apparatus 110 by executing the various software stored in the storage unit 240 described later. The communication IF unit 210 is connected to the image reading device 120 via the communication line 130; it receives the document image data sent from the image reading device 120 over the communication line 130 and delivers it to the control unit 200. In other words, the communication IF unit 210 functions as the input means to which the document image data sent from the image reading device 120 is input.

The display unit 220 is, for example, a liquid crystal display and its drive circuit, and displays an image corresponding to the data delivered from the control unit 200. The operation unit 230 is, for example, a keyboard with a plurality of operators (not shown). By delivering data corresponding to how those operators are operated (hereinafter, “operation content data”) to the control unit 200, it conveys the user's operations to the control unit 200.

As shown in FIG. 2, the storage unit 240 includes a volatile storage unit 240a and a nonvolatile storage unit 240b. The volatile storage unit 240a is, for example, a RAM (Random Access Memory), and is used as a work area by the control unit 200 while it operates according to the various software described later. The nonvolatile storage unit 240b is, for example, a hard disk. It stores the data and software that cause the control unit 200 to realize the functions specific to the document processing apparatus 110 according to this embodiment.

One example of the data stored in the nonvolatile storage unit 240b is the original image data described above. Examples of the software stored in the nonvolatile storage unit 240b are OS software, which causes the control unit 200 to realize an operating system (hereinafter, “OS”), and editing software. The editing software causes the control unit 200 to extract, region by region (chapter, paragraph, and so on), the passages that bear the same or similar additional marks in the document represented by the document image data input from the image reading device 120, and to display them on the display unit 220 according to a predetermined layout. The functions given to the control unit 200 by executing this software are described below.

When the power supply (not shown) of the document processing apparatus 110 is turned on, the control unit 200 first reads the OS software from the nonvolatile storage unit 240b and executes it. While operating according to the OS software and thereby realizing the OS, the control unit 200 has the function of controlling each unit of the document processing apparatus 110 and the function of reading other software from the nonvolatile storage unit 240b and executing it in response to user instructions. For example, when instructed to execute the editing software, the control unit 200 reads the editing software from the nonvolatile storage unit 240b and executes it. The control unit 200 operating according to the editing software is given the four functions described below.

The first is a first specifying function that analyzes the document image data received from the image reading device 120 via the communication IF unit 210 and specifies the additional marks added by hand to the original of the document represented by that document image data. Specifically, the control unit 200 compares the document image data with the original image data, identifies each image representing a difference between the two as an additional image representing an item added to the original, and generates image data representing each additional image (hereinafter, “additional image data”). Because the additional images represented by the additional image data generated in this way include the images of the additional marks, generating the additional image data amounts to specifying the additional marks.
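The comparison described above can be sketched in Python as follows. This is an illustrative sketch under simplifying assumptions, not the patent's implementation: both pages are taken to be binary bitmaps of equal size that are already perfectly aligned, and each 4-connected blob of leftover ink is treated as one additional image. The names `difference` and `connected_components` are hypothetical.

```python
from collections import deque
from typing import List, Set, Tuple

Bitmap = List[List[int]]  # 1 = ink, 0 = background

def difference(scan: Bitmap, original: Bitmap) -> Bitmap:
    """Keep ink that is present in the scan but absent from the original."""
    return [[1 if s and not o else 0 for s, o in zip(srow, orow)]
            for srow, orow in zip(scan, original)]

def connected_components(bitmap: Bitmap) -> List[Set[Tuple[int, int]]]:
    """Split a bitmap into 4-connected blobs, in row-major order of discovery."""
    rows, cols = len(bitmap), len(bitmap[0])
    seen: Set[Tuple[int, int]] = set()
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] and (r, c) not in seen:
                blob = set()
                queue = deque([(r, c)])
                seen.add((r, c))
                while queue:  # breadth-first flood fill from the seed pixel
                    y, x = queue.popleft()
                    blob.add((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and bitmap[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                blobs.append(blob)
    return blobs

# Hypothetical 4x5 page: the scan adds a stroke at the left and a dot at the bottom right.
original = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
scan = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1],
]
additional_images = connected_components(difference(scan, original))
```

Here the difference yields two blobs, one per handwritten addition. Scanned pages are rarely pixel-aligned with their originals, so a practical system would probably need registration and noise filtering before differencing; the source does not describe those steps.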

The second is a second specifying function that specifies, for each additional mark specified by the first specifying function, the image region of the document image data with which that additional mark is associated. In more detail, for the additional image represented by each piece of additional image data generated by the first specifying function, the control unit 200 specifies its position within the image represented by the document image data. It also performs layout analysis on the image represented by the original image data and specifies the position of each image region. It then specifies an image region that has a predetermined positional relationship with the additional image (for example, an image region whose distance from the additional image is no more than a predetermined threshold) as the image region associated with the additional mark represented by that additional image.
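The distance-threshold example given above might look like the following Python sketch. Several details are assumptions rather than statements from the patent: marks and regions are represented by axis-aligned bounding boxes, distance is the shortest gap between boxes, and when more than one region is within the threshold the nearest one is chosen. The coordinates are invented; only the labels M01 to M03 and B01, B02, B04 come from the operation example in this document.

```python
from typing import Dict, NamedTuple, Optional

class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int

def box_distance(a: Box, b: Box) -> float:
    """Shortest gap between two axis-aligned boxes (0 if they touch or overlap)."""
    dx = max(a.left - b.right, b.left - a.right, 0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0)
    return (dx * dx + dy * dy) ** 0.5

def associate(mark: Box, regions: Dict[str, Box], threshold: float) -> Optional[str]:
    """Return the name of the nearest region within the threshold, else None."""
    if not regions:
        return None
    name, region = min(regions.items(), key=lambda kv: box_distance(mark, kv[1]))
    return name if box_distance(mark, region) <= threshold else None

# Hypothetical layout-analysis result: paragraph blocks with marks in the left margin.
regions = {
    "B01": Box(50, 10, 200, 40),
    "B02": Box(50, 60, 200, 90),
    "B04": Box(50, 160, 200, 190),
}
marks = {
    "M01": Box(20, 15, 40, 35),
    "M02": Box(20, 65, 40, 85),
    "M03": Box(20, 165, 40, 185),
}
links = {name: associate(box, regions, threshold=30) for name, box in marks.items()}
```

With these made-up coordinates each mark sits 10 units from its paragraph, so all three are linked. A mark farther than the threshold from every region is left unassociated.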

The third is a classification function that, when a plurality of additional marks are specified by the first specifying function, analyzes the shape of each additional mark and classifies the marks into groups having the same or similar shape. In more detail, the control unit 200 analyzes the additional image represented by each piece of additional image data and calculates a feature amount for the shape of the additional mark it represents. It classifies additional marks whose feature amounts are identical as the same additional mark, and classifies additional marks whose feature amounts differ, but by no more than a predetermined range, as additional marks similar to one another.
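The patent does not say which feature amount is used, so the following Python sketch substitutes one plausible choice: the first two Hu moment invariants, which are unaffected by translation, scale, and rotation. The greedy grouping against each group's first member, the `tolerance` value, and the pixel shapes given to M01, M02, and M03 are likewise assumptions for illustration; identical and similar marks are simply placed in the same group.

```python
from typing import Dict, List, Set, Tuple

Pixels = Set[Tuple[int, int]]  # (row, col) of ink pixels belonging to one mark

def shape_feature(pixels: Pixels) -> Tuple[float, float]:
    """First two Hu moment invariants of a binary shape."""
    n = len(pixels)
    cy = sum(r for r, _ in pixels) / n
    cx = sum(c for _, c in pixels) / n
    mu20 = sum((c - cx) ** 2 for _, c in pixels)
    mu02 = sum((r - cy) ** 2 for r, _ in pixels)
    mu11 = sum((c - cx) * (r - cy) for r, c in pixels)
    # Normalized central moments of order 2: divide by (pixel count) squared.
    eta20, eta02, eta11 = mu20 / n ** 2, mu02 / n ** 2, mu11 / n ** 2
    return (eta20 + eta02, (eta20 - eta02) ** 2 + 4 * eta11 ** 2)

def classify(marks: Dict[str, Pixels], tolerance: float) -> List[List[str]]:
    """Group marks whose features differ from a group's first member by <= tolerance."""
    groups: List[Tuple[Tuple[float, float], List[str]]] = []
    for name, pixels in marks.items():
        feature = shape_feature(pixels)
        for representative, members in groups:
            if max(abs(a - b) for a, b in zip(feature, representative)) <= tolerance:
                members.append(name)
                break
        else:
            groups.append((feature, [name]))
    return [members for _, members in groups]

# Invented shapes: M01 is a horizontal bar, M03 the same bar rotated and moved,
# and M02 a 3x3 block.
marks = {
    "M01": {(0, c) for c in range(5)},
    "M02": {(r, c) for r in range(3) for c in range(3)},
    "M03": {(10 + r, 7) for r in range(5)},
}
groups = classify(marks, tolerance=0.05)
```

Under these assumptions the rotated bar lands in the same group as the original bar, while the block forms its own group. Handwritten marks vary far more than these toy shapes, so the feature and tolerance would need to be chosen with real data.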

The fourth is an output function. For each of the additional marks classified by the classification function as having the same or similar shape, it takes the image located in the image region specified by the second specifying function as being associated with that mark, arranges these images according to a predetermined layout, generates image data representing the resulting image (hereinafter, “edited image data”), and outputs that image data to the display unit 220. Although this embodiment describes the case where the edited image data generated in this way is output to the display unit 220 so that the corresponding image is displayed, the edited image data may of course be transferred to an image forming apparatus such as a printer so that the image it represents is formed on a recording material such as printing paper or an OHP sheet.
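As a rough illustration of generating edited image data, the Python sketch below crops the regions belonging to one group of marks and stacks them vertically, which is one way to realize a list-form layout. The bitmap representation, the half-open box convention, the left alignment, the blank separator rows, and the names `crop` and `stack_as_list` are assumptions made for this sketch; the patent only requires some predetermined layout. The sketch also assumes at least one non-empty region per group.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels

def crop(page: Bitmap, left: int, top: int, right: int, bottom: int) -> Bitmap:
    """Cut out the half-open rectangle rows [top, bottom), columns [left, right)."""
    return [row[left:right] for row in page[top:bottom]]

def stack_as_list(images: List[Bitmap], gap: int = 1) -> Bitmap:
    """Place images one below another, left-aligned, separated by blank rows."""
    width = max(len(img[0]) for img in images)
    out: Bitmap = []
    for i, img in enumerate(images):
        if i > 0:
            out.extend([[0] * width for _ in range(gap)])
        # Pad narrower images on the right so every output row has the same width.
        out.extend([row + [0] * (width - len(row)) for row in img])
    return out

# Hypothetical page in which rows 0 and 4 are the regions tied to one mark group.
page = [
    [1, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 0],
]
group_regions = [(0, 0, 3, 1), (0, 4, 2, 5)]  # (left, top, right, bottom)
edited = stack_as_list([crop(page, *box) for box in group_regions])
```

The result contains the two cropped regions with one blank row between them, which could then be sent to a display or printer as the text describes.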

As described above, the hardware configuration of the document processing apparatus 110 according to this embodiment is the same as that of a general-purpose computer device, and the functions specific to the document processing apparatus according to the present invention are realized by having the control unit 200 execute the various software stored in the nonvolatile storage unit 240b. Although this embodiment thus describes the case where those functions are realized by software modules, the document processing apparatus according to the present invention may of course be built by combining hardware modules that each perform one of these functions.

(B: Operation)
Next, among the operations performed by the document processing apparatus 110, those that most clearly show its features are described with reference to the drawings. In the operation example below, the control unit 200 of the document processing apparatus 110 is operating according to the editing software and is waiting for document image data to be sent from the image reading device 120.

When the user sets a paper document in the ADF of the image reading device 120 and performs a predetermined operation (for example, pressing a start button on the operation unit of the image reading device 120), the image reading device 120 reads the image of that paper document and sends the corresponding document image data to the document processing apparatus 110 via the communication line 130. In this operation example, the paper document shown in FIG. 3(a) is set in the ADF of the image reading device 120, and document image data representing its image is sent from the image reading device 120 to the document processing apparatus 110. The nonvolatile storage unit 240b of the document processing apparatus 110 is assumed to store in advance exactly one piece of original image data, representing the original (see FIG. 3(b)) of the paper document shown in FIG. 3(a). As a comparison of FIG. 3(a) and FIG. 3(b) makes clear, in the paper document shown in FIG. 3(a) the first paragraph (FIG. 3(a): “B01”) bears additional mark M01, the third paragraph (FIG. 3(a): “B02”) bears additional mark M02, and the fourth paragraph (FIG. 3(a): “B04”) bears additional mark M03. As FIG. 3(a) also shows, additional marks M01 and M03 have the same shape.

FIG. 4 is a flowchart showing the flow of the editing process that the control unit 200 performs according to the editing software. As shown in FIG. 4, when the control unit 200 receives, through the communication IF unit 210, the document image data sent from the image reading device 120 via the communication line 130 (step SA100), it compares that document image data with the original image data stored in the nonvolatile storage unit 240b and specifies the additional marks added to the original represented by the original image data (step SA110). As a comparison of FIG. 3(a) and FIG. 3(b) makes clear, in this operation example the first specifying function generates three pieces of additional image data, representing additional marks M01, M02, and M03 respectively, and each additional mark is specified by its additional image data.
This embodiment describes the case where exactly one piece of original image data, representing the original of the paper document represented by the document image data, is stored in advance in the nonvolatile storage unit 240b of the document processing apparatus 110. Of course, original image data for the originals of several kinds of paper document (that is, a plurality of pieces of original image data) may be stored in the nonvolatile storage unit 240b instead. In that case, before step SA110 is executed, the user may be asked to designate the original image data that represents the original of the paper document represented by the document image data.

Next, the control unit 200 specifies, for each additional mark identified in step SA110, its position in the image represented by the document image data (step SA120). It also performs layout analysis on the image represented by the original image data and specifies, for each block of the original such as a chapter or paragraph, the position of the image area representing that block (step SA130). The control unit 200 then identifies any image area whose distance from an additional image is equal to or less than a predetermined threshold as the image area associated with the additional mark represented by that additional image (step SA140). As shown in FIG. 3(a), the additional mark M01 is located near the image area B01, which represents the first paragraph of the document being processed, so the processing of step SA130 identifies the additional mark M01 and the image area B01 as being associated with each other. Similarly, the processing of step SA130 identifies the additional mark M02 as associated with the image area B02, and the additional mark M03 as associated with the image area B04.
In this embodiment, layout analysis is performed on the image represented by the original image data to specify, for each block of the original such as a chapter or paragraph, the position of the image area representing that block. Alternatively, the data representing the additional images may be removed from the document image data delivered from the image reading device 120, and layout analysis may be performed on the resulting image data to specify, for each block of the original such as a chapter or paragraph, the position of the image area representing that block.
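The distance-based association of steps SA120 to SA140 can be illustrated with a minimal sketch. The bounding-box representation, the gap-based distance measure, the choice of the single nearest area, and all names (`Box`, `box_distance`, `associate`) are assumptions made for illustration only; the description states only that an image area within a predetermined threshold distance of an additional image is associated with it.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in page pixel coordinates (hypothetical model)."""
    name: str
    left: float
    top: float
    right: float
    bottom: float


def box_distance(a: Box, b: Box) -> float:
    """Shortest gap between two boxes (0.0 when they touch or overlap)."""
    dx = max(a.left - b.right, b.left - a.right, 0.0)
    dy = max(a.top - b.bottom, b.top - a.bottom, 0.0)
    return hypot(dx, dy)


def associate(marks, regions, threshold):
    """Map each mark name to its nearest region, if that region is within the threshold."""
    result = {}
    for mark in marks:
        nearest = min(regions, key=lambda r: box_distance(mark, r), default=None)
        if nearest is not None and box_distance(mark, nearest) <= threshold:
            result[mark.name] = nearest.name
    return result


# Illustrative coordinates only; they are not taken from FIG. 3(a).
marks = [Box("M01", 10, 100, 30, 120), Box("M09", 10, 500, 30, 520)]
regions = [Box("B01", 40, 90, 400, 160), Box("B02", 40, 200, 400, 270)]
pairs = associate(marks, regions, threshold=20)
```

In this sketch a mark with no area within the threshold is simply left unassociated; the description does not say how such a case is handled.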

Next, the control unit 200 determines whether a plurality of additional marks were identified in step SA110 (step SA150). If the determination result is "Yes", it executes steps SA160 to SA170 described below. As described above, three additional marks were identified in step SA110 in this operation example, so the determination result in step SA150 is "Yes" and steps SA160 to SA170 are executed.

In step SA160, which is executed when the determination result in step SA150 is "Yes", the control unit 200 calculates, for each additional mark, a feature amount representing its shape. It classifies additional marks whose feature amounts are identical as the same additional mark, and additional marks whose feature amounts differ but diverge only within a predetermined range as mutually similar additional marks. As a result of step SA160, the three additional marks identified in step SA110 are classified in this operation example into two sets: the group consisting of the additional marks M01 and M03, and the additional mark M02 (see FIG. 3(a)).
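The classification of step SA160 can be sketched as follows. The description does not specify what the feature amount is or how divergence is measured, so the numeric feature vectors, the Euclidean distance, the greedy grouping against each group's first member, and the names `classify_marks` and `tolerance` are all assumptions for illustration.

```python
from math import dist


def classify_marks(features, tolerance):
    """Group marks whose feature vectors lie within `tolerance` of a group's first member.

    Identical feature vectors have distance 0 and therefore always share a group;
    vectors that differ by no more than `tolerance` are treated as similar.
    """
    groups = []  # each group is a list of mark names; the first name is the representative
    for name, vector in features.items():
        for group in groups:
            if dist(vector, features[group[0]]) <= tolerance:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical two-dimensional shape features for the three marks of the operation example.
features = {"M01": (0.90, 0.10), "M02": (0.10, 0.80), "M03": (0.85, 0.15)}
groups = classify_marks(features, tolerance=0.2)
```

With these made-up values, M01 and M03 fall into one group and M02 into another, mirroring the two sets described above.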

In step SA170, which is executed after step SA160, the control unit 200 takes the additional marks classified in step SA160 as having the same or a similar shape, arranges the images located in the image areas identified in step SA130 as associated with those additional marks according to the list-format layout shown in FIG. 5(a), generates image data representing the resulting image, and outputs it to the display unit 220.
The above is the editing process executed by the document processing apparatus 110 according to the present embodiment.
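The list-format output of step SA170 can be pictured with a small sketch that gathers, for each group of same or similar marks, the content of the associated image areas. The dictionary-based data model and the name `build_list_layout` are hypothetical, and the strings stand in for the cropped images that the apparatus would actually arrange.

```python
def build_list_layout(groups, mark_to_region, region_content):
    """Return one entry per mark group, listing the content of each associated region."""
    layout = []
    for group in groups:
        items = [
            region_content[mark_to_region[mark]]
            for mark in group
            if mark in mark_to_region  # marks without an associated region are skipped
        ]
        layout.append({"marks": list(group), "items": items})
    return layout


# Associations taken from the operation example; the content strings are placeholders.
groups = [["M01", "M03"], ["M02"]]
mark_to_region = {"M01": "B01", "M02": "B02", "M03": "B04"}
region_content = {"B01": "image of B01", "B02": "image of B02", "B04": "image of B04"}
layout = build_list_layout(groups, mark_to_region, region_content)
```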

As described above, the document processing apparatus 110 according to the present embodiment displays the image shown in FIG. 5(b) on its display unit 220. In the display example of FIG. 5(b), the description contents of the areas associated with the additional marks in the paper document being processed are displayed side by side for each group of additional marks having the same or a similar shape. As noted earlier, the description contents of areas marked with additional marks of the same or a similar shape are likely to be related to one another. The document processing apparatus 110 therefore makes it possible, based on additional marks such as characters, symbols, and figures handwritten on a paper document, to gather the description contents of areas presumed to be mutually related and present them to the user in an easily understood form.

(C: Modifications)
One embodiment of the present invention has been described above; the embodiment may of course be modified as described below.
(C-1: Modification 1)
In the embodiment described above, the document processing apparatus 110 extracts the additional image, which represents the items added to the original represented by the original image data, by comparing the document image data delivered from the image reading device 120 with the original image data. Alternatively, the document processing apparatus may analyze the document image data delivered from the image reading device and extract, as the additional image, the portions of the image represented by that data other than those in a color predetermined as the color used when printing the original on paper. In this aspect, for example, items written in red on a paper document whose original was printed in monochrome can be extracted by the document processing apparatus without any comparison with the original data representing the original.

The document processing apparatus may also analyze the document image data delivered from the image reading device and extract, as the additional image, the parts of the image that represent handwritten characters or handwritten figures. In this aspect, for example, items added by hand to a paper document printed in type can be extracted without any comparison with the original data representing the original. Furthermore, when the original is printed on Anoto paper and the additions are made with an Anoto pen, the added items and their positions can be specified from the data output by the Anoto pen.
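The color-based extraction of Modification 1 can be sketched as a per-pixel test against the predetermined print colors. The RGB tuple representation, the per-channel tolerance, and the names `extract_additional_pixels` and `print_colors` are assumptions for illustration; the description says only that portions not in the predetermined print color are treated as the additional image.

```python
def extract_additional_pixels(image, print_colors, tolerance=40):
    """Return a boolean mask that is True where a pixel matches none of the print colors.

    `image` is a list of rows of (R, G, B) tuples; `print_colors` lists the colors
    expected from printing the original (including the paper background).
    """
    def matches(pixel, color):
        # A pixel matches when every channel is within the tolerance of the print color.
        return all(abs(p - c) <= tolerance for p, c in zip(pixel, color))

    return [
        [not any(matches(pixel, color) for color in print_colors) for pixel in row]
        for row in image
    ]


# One row: white paper, black print, and a red handwritten stroke (made-up values).
image = [[(255, 255, 255), (0, 0, 0), (200, 30, 30)]]
mask = extract_additional_pixels(image, print_colors=[(0, 0, 0), (255, 255, 255)])
```

A real scan would need a tolerance tuned to the scanner and paper; the value 40 here is arbitrary.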

(C-2: Modification 2)
In the embodiment described above, the paper document to be processed consists of one page. A paper document consisting of multiple pages may of course also be processed by the document processing apparatus according to the present invention. In that case, steps SA110 to SA140 described above are executed for each piece of image data representing the image of a page. In addition, in the embodiment described above, the area associated with each additional mark represents an item written in the original of the paper document. However, when an image area containing an image of content handwritten together with an additional mark (hereinafter, a handwritten area) has a predetermined positional relationship with that additional mark, for example being located near it, the handwritten area may be identified as associated with that additional mark. For example, when the document processing apparatus according to the present invention extracts the added items from a three-page paper document such as that shown in FIG. 6(a) and outputs the extraction result (for example, by displaying it), the image shown in FIG. 6(b) is output.

More specifically, in the paper document shown in FIG. 6(a), the additional mark M11 is handwritten on the first paragraph B11 of the first page, and the additional mark M12 is handwritten on the handwritten area C11 in the margin of that page. The additional mark M21 is handwritten on the handwritten area C21 in the margin of the second page, and the additional mark M31 is handwritten on the handwritten area C31 in the margin of the third page. As is clear from FIG. 6(a), the additional marks M11 and M31 both represent the uppercase letter "A", and the additional marks M12 and M21 both represent the uppercase letter "B".

When steps SA110 to SA140 described above are applied to the image data for each page of this paper document, the additional marks M11 and M31 in the paper document of FIG. 6(a) are classified as additional marks having the same or a similar shape. The images representing the description contents of the area associated with the additional mark M11 (that is, the first paragraph B11 of the first page) and of the area associated with the additional mark M31 (that is, the handwritten area C31 in the margin of the third page) are displayed side by side in association with the image of that additional mark. Likewise, the additional marks M12 and M21 are classified as additional marks having the same or a similar shape, and the images representing the description contents of the area associated with the additional mark M12 (that is, the handwritten area C11 in the margin of the first page) and of the area associated with the additional mark M21 (that is, the handwritten area C21 in the margin of the second page) are displayed side by side in association with the image representing that additional mark. As a result, the image shown in FIG. 6(b) is displayed.

(C-3: Modification 3)
In the embodiment described above, the image located in the image area near an additional mark is identified as the image to which that mark was added (the annotated image). Alternatively, the image located in the image area near a first additional mark may be identified as the annotated image for a second additional mark located near that first additional mark. In this way, for example, as shown in FIG. 7, a second additional mark M42 located near a first additional mark M41, such as a leader line, can be associated with the image located in the image area B41 near the first additional mark M41.
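The indirect association of Modification 3 can be sketched by treating the first additional mark (for example, a leader line) as a connector: a second mark near the connector is linked to every image area that is also near the connector. The box tuples, the gap-based distance, the shared threshold, and the names `gap` and `associate_via_connector` are assumptions for illustration.

```python
from math import hypot


def gap(a, b):
    """Shortest distance between two (left, top, right, bottom) boxes; 0 if they overlap."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return hypot(dx, dy)


def associate_via_connector(connectors, labels, regions, threshold):
    """Link each label mark to the regions reached through a nearby connector mark."""
    links = {}
    for connector_box in connectors.values():
        near_labels = [n for n, box in labels.items() if gap(connector_box, box) <= threshold]
        near_regions = [n for n, box in regions.items() if gap(connector_box, box) <= threshold]
        for label in near_labels:
            for region in near_regions:
                links.setdefault(label, []).append(region)
    return links


# Made-up coordinates: leader line M41 runs from region B41 to the label mark M42.
connectors = {"M41": (100, 50, 200, 60)}
labels = {"M42": (205, 45, 230, 65)}
regions = {"B41": (20, 40, 95, 120), "B42": (20, 300, 95, 380)}
links = associate_via_connector(connectors, labels, regions, threshold=10)
```

How the apparatus decides that a mark is a leader line rather than a label is not described in the text, so this sketch takes that distinction as given input.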

(C-4: Modification 4)
In the embodiment described above, the software that causes the control unit 200 to realize the functions specific to the document processing apparatus according to the present invention is stored in advance in the nonvolatile storage unit 240b. Alternatively, the software may be recorded on a computer-readable recording medium such as a CD-ROM (Compact Disk-Read Only Memory) or DVD (Digital Versatile Disk) and installed on a general computer device from that medium. This makes it possible to give a general computer device the same functions as the document processing apparatus according to the present invention.

FIG. 1 is a block diagram showing a configuration example of a document processing system 10 including a document processing apparatus 110 according to one embodiment of the present invention.
FIG. 2 is a block diagram showing an example of the hardware configuration of the document processing apparatus 110.
FIG. 3 is a diagram showing an example of a paper document represented by document image data input to the document processing apparatus 110, together with its original.
FIG. 4 is a flowchart showing the flow of the editing process that the control unit 200 performs in accordance with the editing software.
FIG. 5 is a diagram showing an example of an editing result displayed on the display unit 220 of the document processing apparatus 110.
FIG. 6 is a diagram for explaining the editing process according to Modification 2.
FIG. 7 is a diagram for explaining the editing process according to Modification 3.

Explanation of symbols

10: document processing system; 110: document processing apparatus; 120: image reading device; 130: communication line; 200: control unit; 210: communication IF unit; 220: display unit; 230: operation unit; 240: storage unit; 240a: volatile storage unit; 240b: nonvolatile storage unit; 250: bus.

Claims

A document processing apparatus comprising:
input means for inputting document image data obtained by digitizing a paper document;
first specifying means for analyzing the document image data input to the input means and specifying additional marks handwritten on the original of the document represented by the document image data;
second specifying means for specifying, for each additional mark specified by the first specifying means, an image area of the document image data associated with that additional mark;
classification means for, when a plurality of additional marks are specified by the first specifying means, analyzing the shape of each additional mark and classifying the additional marks into groups having the same or a similar shape; and
output means for arranging and outputting, according to a predetermined layout, for each group of additional marks classified by the classification means as having the same or a similar shape, the images located in the image areas specified by the second specifying means as associated with those additional marks.

2. The document processing apparatus according to claim 1, wherein the first specifying means specifies the additional mark by comparing the document image data input to the input means with original image data representing the original of the paper document represented by the document image data.

The document processing apparatus described above, wherein the first specifying means specifies the additional mark based on a difference between the color of the image represented by the document image data input to the input means and the color of the original of the paper document represented by the document image data.

The document processing apparatus according to claim 1, wherein the classification means analyzes the shape of each additional mark to calculate its feature amount, classifies additional marks whose calculated feature amounts are identical as additional marks having the same shape, and classifies additional marks whose calculated feature amounts diverge within a predetermined range as mutually similar additional marks.

A program that causes a computer device to execute:
a first step of analyzing document image data obtained by digitizing a paper document and specifying additional marks handwritten on the original of the document represented by the document image data;
a second step of specifying, for each additional mark specified in the first step, an image area of the document image data associated with that additional mark;
a third step of, when a plurality of additional marks are specified in the first step, analyzing the shape of each additional mark and classifying the additional marks into groups having the same or a similar shape; and
a fourth step of arranging and outputting, according to a predetermined layout, for each group of additional marks classified in the third step as having the same or a similar shape, the images located in the image areas specified in the second step as associated with those additional marks.