JP2004362073A

JP2004362073A - Document processor and method and control program for document processor

Info

Publication number: JP2004362073A
Application number: JP2003157189A
Authority: JP
Inventors: Takanari Ueda; 隆也上田
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-06-02
Filing date: 2003-06-02
Publication date: 2004-12-24

Abstract

PROBLEM TO BE SOLVED: To provide a document processor and method that enable a user to efficiently understand the contents of a document while keeping the volume of the summary document at an appropriate amount, and to provide a control program for the document processor.
SOLUTION: The document processor comprises a component element extracting part 102 for extracting at least text elements and image elements from documents; a text summarizing process part 105 for summarizing the text elements; and a summary document composing part 107 for composing summary documents from the summarized texts generated by the text summarizing process part 105 and the image elements selected by an image selecting part 106.
COPYRIGHT: (C)2005, JPO&NCIPI

Description

[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a document processing apparatus and a document processing method for summarizing a document, and a control program for the document processing apparatus.
[0002]
[Prior art]
With the widespread use of the Internet and PCs (personal computers), in recent years, an enormous amount of documents have been created and distributed. Since the amount of generated documents exceeds the processing capacity of the user, it is becoming difficult for the user to obtain necessary information from a large number of documents.
[0003]
As one solution to such a problem, a technique for summarizing text as described in Non-Patent Document 1 has been developed. With this technique, the text in the document can be summarized and presented, so that the user can efficiently grasp the contents of the text.
[0004]
On the other hand, as another solution, Patent Document 1 discloses a method in which the pages of a document are reduced and printed so that they can be viewed at a glance, allowing the user to easily judge whether the document is needed.
[0005]
Patent Document 2 discloses a method in which a document is classified as either a structured document or a document image. In the case of a structured document, a text summary is created and presented using the logical structure information of the text; in the case of a document image, the individual components are reduced in size and presented. Here, the document image refers to a document in which the document data, including the text portion, is represented as an image, such as one obtained by scanning a paper document.
[0006]
[Non-patent document 1]
Manabu Okumura, Hidetsugu Nanba, "Research Trends in Automatic Text Summarization", Natural Language Processing, Special Issue on "Language Processing for Text Summarization", Vol. 6, No. 6, July 1999
[Patent Document 1]
JP-A-7-146875
[Patent Document 2]
JP-A-10-320412
[0007]
[Problems to be solved by the invention]
In general, most documents include not only text elements but also image elements such as charts, photographs, and images. When a summary document is created for such a document, the text summarization method results in the image element remaining even though the text is summarized.
[0008]
When a large number of image elements exist in a document, the amount of the entire summary document increases when the document is output on paper or the like, so that it is difficult for a user to easily determine the contents of the document.
[0009]
Also, with the method of printing each page in a reduced size, as described in Patent Document 1 and Patent Document 2 above, the amount of the summary document can be reduced, but because the characters of the text in the document become small, it is difficult for the user to understand the contents of the text.
[0010]
The present invention has been made to solve the above-described problems of the related art, and an object thereof is to provide a document processing apparatus, a document processing method, and a control program for the document processing apparatus that allow a user to efficiently grasp the contents of a document while keeping the amount of the summary document to an appropriate level.
[0011]
[Means for Solving the Problems]
In order to achieve the above object, a document processing apparatus according to the present invention comprises: component extracting means for extracting at least a text element and an image element from a document; summary text generating means for generating a summary text from the text element; summary image generating means for generating a summary image from the image element; and summary document generating means for generating a summary document from the summary text and the summary image.
[0012]
Further, in order to achieve the above object, the document processing method of the present invention comprises: a component extraction step of extracting at least a text element and an image element from a document; a summary text generation step of generating a summary text from the text element; a summary image generation step of generating a summary image from the image element; and a summary document generation step of generating a summary document from the summary text and the summary image.
[0013]
Further, in order to achieve the above object, a control program for a document processing apparatus according to the present invention is characterized by comprising a program code for causing a computer to execute each step of the document processing method.
[0014]
BEST MODE FOR CARRYING OUT THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
[0015]
(First Embodiment)
First, a first embodiment of the present invention will be described with reference to FIGS. 1 to 7.
[0016]
FIG. 1 is a block diagram showing a basic configuration of a document processing apparatus according to the present embodiment. In FIG. 1, reference numeral 101 denotes a document holding unit which holds a document to be processed. A component extraction unit 102 extracts a text element and an image element from the document held in the document holding unit 101. In the present embodiment, it is assumed that the document is a digitized document having a structure, and that components such as texts, charts, and images can be easily obtained. Note that elements other than text are treated as image elements here.
[0017]
Reference numeral 103 denotes a text element holding unit that holds the text elements extracted by the component extraction unit 102. Reference numeral 104 denotes an image element holding unit that holds the image elements extracted by the component extraction unit 102. Reference numeral 105 denotes a text summarization processing unit that summarizes the text elements held in the text element holding unit 103. Reference numeral 106 denotes an image selection unit that selects some of the image elements held in the image element holding unit 104. Reference numeral 107 denotes a summary document synthesizing unit that synthesizes a summary document from the summary text generated by the text summarization processing unit 105 and the image elements selected by the image selection unit 106. Reference numeral 108 denotes a summary document holding unit that holds the summary document synthesized by the summary document synthesizing unit 107. Reference numeral 109 denotes a summary document output unit that outputs the summary document held in the summary document holding unit 108.
[0018]
FIG. 2 is a block diagram showing a specific configuration of the document processing apparatus according to the present embodiment. In FIG. 2, reference numeral 201 denotes a CPU (central processing unit), which operates according to a program for implementing a procedure described later. Reference numeral 202 denotes a memory that implements the document holding unit 101, the text element holding unit 103, the image element holding unit 104, and the summary document holding unit 108, and provides a storage area necessary for the operation of the program. Reference numeral 203 denotes a control memory which holds a program for implementing a procedure described later. An output device 204 implements the summary document output unit 109. Specifically, it is a display device or a printer device. A bus 205 connects the components.
[0019]
Next, the operation of the document processing apparatus according to the present embodiment will be described with reference to the flowchart in FIG. 3.
[0020]
First, in step S301, the component extracting unit 102 extracts a document component from the document held in the document holding unit 101. The document component is a text element or an image element. In the present embodiment, as described above, the document is a digitized document having a structure, and these elements can be easily extracted. The text element is held in the text element holding unit 103, and the image element is held in the image element holding unit 104.
[0021]
FIG. 6 shows an example of the contents of the text element holding unit 103. An ID (identifier) is assigned to each constituent unit of the text, and the unit holds the text character string and the ID of the image element it refers to. Such reference relationships are described in the digitized document and can be easily extracted. The constituent unit of the text is a sentence here, but another constituent unit may be adopted.
[0022]
FIG. 7 shows an example of the contents of the image element holding unit 104. An ID is assigned to each image element, and the unit holds a pointer to the image data and the ID of the text element that refers to the image element.
[0023]
Referring back to FIG. 3, in step S302, it is checked whether or not there is a text element in the document components extracted in step S301. If there is a text element, the process proceeds to step S303. If there is no text element, the process proceeds to step S308.
[0024]
In step S303, the text summarization processing unit 105 performs a summarization process on the text elements held in the text element holding unit 103. Here, a summary is generated by selecting important sentences from the text. Specifically, this can be done with a generally known text summarization method such as those described in Non-Patent Document 1, and a summary text of any desired amount can be generated. The amount of summary text to be generated is given by a numerical value (the number of characters, or the ratio to the number of characters in the original document). This numerical value may be set in advance, may be given by the user at the time of processing, or may be set automatically. For each sentence selected as a result of the summarization, the text element holding unit 103 adds a mark in the summary column, as shown in FIG. 6.
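As an illustration of step S303, the following is a minimal sketch of important-sentence selection under a character budget. The scoring rule (average word frequency), the function name `summarize`, and the `ratio` parameter are assumptions made for this example; the specification leaves the summarization method open and only requires that the amount be given as a character count or a ratio.

```python
import re
from collections import Counter


def summarize(sentences, ratio=0.5):
    """Return the indices of the sentences chosen for the summary.

    Scoring by average word frequency is an assumption of this sketch,
    not a method prescribed by the specification.
    """
    words_per_sentence = [re.findall(r"\w+", s.lower()) for s in sentences]
    freq = Counter(w for words in words_per_sentence for w in words)

    def score(i):
        words = words_per_sentence[i]
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Amount of summary text, given as a ratio of the original character count.
    budget = ratio * sum(len(s) for s in sentences)

    chosen, used = [], 0
    # Visit sentences from highest to lowest score; ties keep document order.
    for i in sorted(range(len(sentences)), key=lambda i: (-score(i), i)):
        if used + len(sentences[i]) <= budget:
            chosen.append(i)
            used += len(sentences[i])
    return sorted(chosen)  # restore document order
```

The returned indices play the role of the marks in the summary column of FIG. 6.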
[0025]
Next, in step S304, it is checked whether or not there is an image element in the document components extracted in step S301. If there is an image element, the process proceeds to step S305, and if there is no image element, the process proceeds to step S306.
[0026]
In step S305, the image selecting unit 106 selects, from among the image elements held in the image element holding unit 104, those related to the summary text generated in step S303. As shown in FIG. 7, the image element holding unit 104 marks the image element selected in step S305 in the summary column.
[0027]
The selection of the image element in step S305 is specifically performed by selecting an image element referred to from the summary text. This reference relationship is held in the image element holding unit 104. In the examples of FIGS. 6 and 7, the sentence of ID = tx1 is included in the summary text, but the sentences of ID = tx2 and tx3 are not included. In this case, the image element im1 referenced from the sentence with ID = tx1 is selected, and the image element im2 referenced from the sentence with ID = tx2 is not selected.
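The tables of FIGS. 6 and 7 and the reference-based selection of step S305 could be modeled roughly as follows. This is a sketch only: the class names, field names, and the use of a file name as a stand-in for the image data pointer are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class TextElement:
    id: str
    text: str
    refs: list = field(default_factory=list)   # IDs of referenced image elements
    in_summary: bool = False                    # mark in the summary column (FIG. 6)


@dataclass
class ImageElement:
    id: str
    data: str                                   # stand-in for a pointer to image data
    referenced_by: list = field(default_factory=list)  # IDs of referring text elements
    in_summary: bool = False                    # mark in the summary column (FIG. 7)


def select_referenced_images(texts, images):
    """Mark and return the images referred to from sentences kept in the summary."""
    summary_ids = {t.id for t in texts if t.in_summary}
    for img in images:
        img.in_summary = any(tid in summary_ids for tid in img.referenced_by)
    return [img.id for img in images if img.in_summary]


# Example mirroring the text: tx1 is in the summary, tx2 and tx3 are not.
texts = [
    TextElement("tx1", "See Figure 1.", refs=["im1"], in_summary=True),
    TextElement("tx2", "See Figure 2.", refs=["im2"]),
    TextElement("tx3", "No figure here."),
]
images = [
    ImageElement("im1", "fig1.png", referenced_by=["tx1"]),
    ImageElement("im2", "fig2.png", referenced_by=["tx2"]),
]
selected = select_referenced_images(texts, images)  # ["im1"]
```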
[0028]
Next, in step S306, the summary document synthesizing unit 107 generates a summary document from the summary text held in the text element holding unit 103 and the selected image elements held in the image element holding unit 104. When the text element holding unit 103 holds no summary text, the summary document is generated from the image elements only; when the image element holding unit 104 holds no image elements, it is generated from the summary text only. The generated summary document is held in the summary document holding unit 108.
[0029]
Next, in step S307, the summary document held in the summary document holding unit 108 is output to the summary document output unit 109, and then this processing operation ends.
[0030]
On the other hand, if the text element does not exist in the text element holding unit 103 in step S302, the process proceeds to step S308, and it is determined whether or not the image element is held in the image element holding unit 104. If the image element is held, the process proceeds to step S309. If the image element is not held, the process ends.
[0031]
In step S309, a part of the image elements held in the image element holding unit 104 is selected, and the process proceeds to step S306. There is no particular limitation on the selection method. The amount of the image element selected here (the number of elements or the ratio to the total number of elements) may be given by a numerical value. This numerical value may be set in advance, or may be given by the user at the time of processing. Among the image elements held in the image element holding unit 104, a mark is given to the one selected in step S309.
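The branching of steps S302 to S309 described above can be sketched as a single function. The function name, the three callables passed in, and the dictionary used to represent the summary document are all assumptions made for illustration; extraction (S301) and output (S307) are left to the caller.

```python
def process_document(texts, images, summarize, select_related, select_some):
    """Sketch of the FIG. 3 flow for already-extracted elements (after S301).

    summarize, select_related and select_some are hypothetical callables
    standing in for units 105 and 106.
    """
    if texts:                                    # S302: any text elements?
        summary = summarize(texts)               # S303: summarize the text
        if images:                               # S304: any image elements?
            chosen = select_related(summary, images)   # S305
        else:
            chosen = []
    elif images:                                 # S308: images but no text
        summary = []
        chosen = select_some(images)             # S309: pick a subset
    else:
        return None                              # nothing to summarize
    # S306: compose the summary document; S307 would output it.
    return {"text": summary, "images": chosen}
```

Passing the selection strategies as arguments keeps the flow independent of how related images are chosen, which is the part the later embodiments vary.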
[0032]
Next, a more detailed description will be given based on a specific example with reference to FIGS. 4 and 5.
[0033]
In the example shown in FIG. 4, an image element 402 is referenced from a character string 401 in a text element, and an image element 405 is referenced from a character string 404 in a text element. Here, it is assumed that the generated summary text includes the character string 401 and the character string 407 but does not include the character string 404. In this case, the image element 402 referenced from the character string 401 is selected, and the image element 405 referenced from the character string 404 is not selected. For this document, the output shown in FIG. 5 is obtained as the summary document. The character strings 401 and 407 included in the summary text and the image element 402 referenced from them are output, while the character string 404 not included in the summary text and the image element 405 referenced from it are not output. Thus, only the summary text and the necessary image elements are output to the summary document. In this example, the caption of the image element is also output, but it need not be.
[0034]
(Second embodiment)
In the first embodiment, when selecting an image element related to the summary text in step S305 in FIG. 3, the image element referred to from the summary text is selected. However, an image element included in the structural unit that contains the summary text may be selected instead. Here, the structural unit refers to a chapter, section, subsection, paragraph, or the like, and which structural unit to use is set in advance. For example, image elements included in a paragraph containing summary text are selected, and the other image elements are not.
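A minimal sketch of this structural-unit variant follows. It assumes that each summary sentence and each image element has already been tagged with the ID of the unit (for example, a paragraph) that contains it; the function name and the ID scheme are hypothetical.

```python
def select_images_in_same_unit(summary_sentence_units, image_units):
    """Select images that share a structural unit with some summary sentence.

    summary_sentence_units: unit ID of each sentence kept in the summary.
    image_units: mapping from image ID to the ID of the unit containing it.
    """
    units = set(summary_sentence_units)
    return [img for img, unit in image_units.items() if unit in units]


# Summary sentences come from paragraphs p1 and p3.
selected = select_images_in_same_unit(
    ["p1", "p3"],
    {"im1": "p1", "im2": "p2", "im3": "p3"},
)  # ["im1", "im3"]
```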
[0035]
(Third embodiment)
In the first embodiment, when selecting an image element related to the summary text in step S305 in FIG. 3, the image element referred to from the summary text is selected. However, when captions are attached to the image elements, an image element having a caption similar to the summary text may be selected instead. To determine the similarity between the summary text and a caption, for example, a generally known vector space model (described, for example, in Salton and McGill, "An Introduction to Modern Information Retrieval", 1983) can be used: both the summary text and the caption are expressed as vectors, and the similarity is determined from the distance between the vectors. Other methods may also be used.
[0036]
When determining the similarity, only part of the summary text may be used instead of all of it. For example, the summary text located near the image element may be used.
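The caption-based selection could look roughly like the sketch below. It assumes raw term-frequency vectors and cosine similarity as the vector comparison, and a hypothetical `threshold` of 0.3; the specification only says that the similarity is determined from the distance between vectors, so these are illustrative choices.

```python
import math
import re
from collections import Counter


def to_vector(text):
    """Bag-of-words term-frequency vector (an assumed, very simple weighting)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_by_caption(summary_text, captions, threshold=0.3):
    """Return IDs of images whose caption is similar enough to the summary text."""
    summary_vec = to_vector(summary_text)
    return [
        img
        for img, caption in captions.items()
        if cosine_similarity(summary_vec, to_vector(caption)) >= threshold
    ]
```

To use only the summary text near an image, as suggested above, the caller would pass that nearby portion as `summary_text` for each image instead of the whole summary.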
[0037]
(Fourth embodiment)
In the first embodiment, when selecting an image element related to the summary text in step S305 in FIG. 3, the image element referred to from the summary text is selected. However, image element designating means may be further provided so that the user explicitly specifies the image elements to be output to the summary document.
[0038]
(Fifth embodiment)
In the first embodiment, some of the image elements are selected and output in the summary document. However, instead of selecting image elements, all the image elements may be uniformly reduced and output in the summary document. In this case, image reduction processing means is provided in place of the image selection means, and processing for reducing all image elements is performed instead of the image element selection processing. Here, the reduction ratio may be given by a numerical value. This numerical value may be set in advance, or may be given by the user at the time of processing.
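As a small illustration of uniform reduction, the sketch below computes the reduced dimensions of an image element from a reduction ratio. The function name and the rule of rounding to whole pixels with a minimum of one pixel are assumptions; actual resampling of the image data is outside the scope of this example.

```python
def reduced_size(width, height, ratio):
    """Return the (width, height) of an image after uniform reduction.

    ratio is the reduction ratio, assumed here to satisfy 0 < ratio <= 1.
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the range (0, 1]")
    # Keep at least one pixel in each dimension.
    return max(1, round(width * ratio)), max(1, round(height * ratio))


# Apply the same ratio to every image element in the document.
sizes = [reduced_size(w, h, 0.25) for w, h in [(800, 600), (400, 400)]]
```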
[0039]
Further, the selection of the image element and the reduction of the image element may be used together.
[0040]
Further, image element output method setting means may be provided so that the user specifies whether the image elements are selected and output, uniformly reduced and output, or both.
[0041]
(Sixth embodiment)
Further, in the first embodiment, a digitized document having a structure is processed, but the present invention can also be implemented for a document image obtained by scanning.
[0042]
Further, by providing area separating means for extracting a text area and an image area from the document image, and character recognizing means for extracting a text character string from the text area thus extracted, image elements and text elements can be obtained, so the invention can be carried out in the same manner as in the first embodiment.
[0043]
(Seventh embodiment)
Further, in the first embodiment, the case where each unit is configured on the same electronic computer has been described. However, the present invention is not limited to this, and may be realized on a plurality of electronic computers.
[0044]
(Other embodiments)
Note that the present invention may be applied to a system including a plurality of devices or to an apparatus including a single device.
[0045]
It goes without saying that the object of the present invention can also be achieved by supplying a storage medium storing software program code that realizes the functions of the above-described embodiments to a system or apparatus, and having a computer (or CPU or MPU) of the system or apparatus read and execute the program code stored in the storage medium.
[0046]
In this case, the program code itself read from the storage medium realizes the functions of the above-described embodiment, and the storage medium storing the program code constitutes the present invention.
[0047]
As a storage medium for supplying the program code, for example, a floppy (registered trademark) disk, a hard disk, an optical disk, a CD-ROM, a CD-R, a DVD-ROM, a magnetic tape, a nonvolatile memory card, a ROM, or the like can be used.
[0048]
It also goes without saying that the invention covers not only the case where the functions of the above-described embodiments are realized by the computer executing the read program code, but also the case where an OS (operating system) or the like running on the computer performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above-described embodiments.
[0049]
Further, it goes without saying that the invention also covers the case where, after the program code read from the storage medium is written into a memory provided in a function expansion board inserted into the computer or in a function expansion unit connected to the computer, a CPU or the like provided in the function expansion board or function expansion unit performs part or all of the actual processing based on the instructions of the program code, and that processing realizes the functions of the above-described embodiments.
[0050]
Although various examples and embodiments of the present invention have been described above, it goes without saying that, as those skilled in the art will appreciate, the spirit and scope of the present invention are not limited to the specific descriptions and drawings in this specification, but extend to the various modifications and changes all set forth in the claims of the present application.
[0051]
[Effects of the invention]
As described above, according to the present invention, for a document in which text and images are mixed, the summary document is generated by summarizing the text and selecting or reducing the images. The user can therefore efficiently grasp the contents of the document while the amount of the summary document is kept to an appropriate level.
[Brief description of the drawings]
FIG. 1 is a block diagram illustrating a basic configuration of a document processing apparatus according to a first embodiment of the present invention.
FIG. 2 is a block diagram illustrating a specific configuration of the document processing apparatus according to the first embodiment of the present invention.
FIG. 3 is a flowchart illustrating a flow of a processing operation of the document processing apparatus according to the first embodiment of the present invention.
FIG. 4 is a diagram illustrating a specific example of the document processing apparatus according to the first embodiment of the present invention.
FIG. 5 is a diagram illustrating a specific example of the document processing apparatus according to the first embodiment of the present invention.
FIG. 6 is a diagram showing an example of the contents of a text element holding unit in the document processing device according to the first embodiment of the present invention.
FIG. 7 is a diagram showing an example of the content of an image element holding unit in the document processing device according to the first embodiment of the present invention.
[Explanation of symbols]
101 Document holding unit
102 Component extraction unit
103 Text element holding unit
104 Image element holding unit
105 Text summarization processing unit
106 Image selection unit
107 Summary document synthesizing unit
108 Summary document holding unit
109 Summary document output unit
201 CPU
202 Memory
203 Control memory
204 Output device
205 Bus

Claims

1. A document processing apparatus comprising:
component extracting means for extracting at least a text element and an image element from a document;
summary text generating means for generating a summary text from the text element;
summary image generating means for generating a summary image from the image element; and
summary document generating means for generating a summary document from the summary text and the summary image.

2. The document processing apparatus according to claim 1, wherein the summary image generating means generates the summary image by selecting an image element referred to from the summary text.

3. The document processing apparatus according to claim 1, wherein the summary image generating means generates the summary image by selecting an image element belonging to the same structural unit as the summary text.

4. The document processing apparatus according to claim 1, further comprising text similarity determining means for determining similarity between texts, wherein the summary image generating means generates the summary image by selecting, from among the image elements, an image element to which a description similar to the summary text is attached.

5. The document processing apparatus according to claim 1, wherein the document is a digitized document, and the component extracting means extracts the text element and the image element by analyzing structural information of the digitized document.

6. The document processing apparatus according to any one of claims 1 to 4, wherein the document is a document image, and the component extracting means includes area separating means for extracting a text area and an image area from the document image, and character recognizing means for extracting a text character string from the text area.

7. The document processing apparatus according to claim 1, wherein the summary image generating means includes image element reducing means for generating a reduced image of the image element and image element selecting means for selecting a part of the image elements, the apparatus further comprising mode selecting means for alternatively selecting a first mode in which both of these means are used and a second mode using at least one of these two means.

8. A document processing method comprising:
a component extraction step of extracting at least a text element and an image element from a document;
a summary text generation step of generating a summary text from the text element;
a summary image generation step of generating a summary image from the image element; and
a summary document generation step of generating a summary document from the summary text and the summary image.

9. The document processing method according to claim 8, wherein the summary image generation step generates the summary image by selecting an image element referred to from the summary text.

10. The document processing method according to claim 8, wherein the summary image generation step generates the summary image by selecting an image element belonging to the same structural unit as the summary text.

11. The document processing method according to claim 8, further comprising a text similarity determination step of determining similarity between texts, wherein the summary image generation step generates the summary image by selecting, from among the image elements, an image element to which a description similar to the summary text is attached.

12. The document processing method according to claim 8, wherein the document is a digitized document, and the component extraction step extracts the text element and the image element by analyzing structural information of the digitized document.

13. The document processing method according to any one of claims 8 to 12, wherein the document is a document image, and the component extraction step includes an area separation step of extracting a text area and an image area from the document image, and a character recognition step of extracting a text character string from the text area.

14. The document processing method according to claim 8, wherein the summary image generation step includes an image element reduction step of generating a reduced image of the image element and an image element selection step of selecting a part of the image elements, the method further comprising a mode selection step of alternatively selecting a first mode in which both steps are used and a second mode using at least one of these two steps.

15. A control program for a document processing apparatus, comprising program code for causing a computer to execute each step of the document processing method according to any one of claims 8 to 14.