JP2001167114A

JP2001167114A - Document processor, document processing method and recording medium

Info

Publication number: JP2001167114A
Application number: JP34992199A
Authority: JP
Inventors: Kazuyuki Marukawa; 和幸丸川
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1999-12-09
Filing date: 1999-12-09
Publication date: 2001-06-22

Abstract

PROBLEM TO BE SOLVED: To properly present copyright information concerning an electronic document. SOLUTION: A document processing system is provided with a summary preparing means capable of preparing a summary sentence concerning the electronic document and an output control means which performs control for presenting and outputting the summary sentence prepared by the summary preparing means and also performs control for presenting and outputting copyright information shown by a tag when the tag showing the copyright information is added to the electronic document to be the source of the summary. When the tag showing the copyright information is added to the electronic document to be the source for preparing the summary sentence, the summary preparing means adds the tag showing the copyright information to the prepared summary sentence.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a document processing apparatus that performs various kinds of processing on electronic documents, and to a document processing method and a recording medium applicable to that apparatus.

[0002]

[Description of the Related Art] Conventionally, the WWW (World Wide Web) has been provided on the Internet as an application service that offers hypertext-type information in a window format.

[0003] The WWW is a system that performs document processing such as creating, publishing, and sharing documents, and it has shown what a new style of document can be. From the standpoint of practical use, however, there is demand for advanced document processing that goes beyond the WWW, such as classifying and summarizing documents based on their content. Such advanced processing requires that the content of documents be processed mechanically.

[0004] Mechanical processing of document content nevertheless remains difficult, for the following reasons. First, HTML (Hyper Text Markup Language), the language used to describe hypertext, specifies how a document is presented but says almost nothing about its content. Second, the hypertext network formed between documents is not necessarily easy for readers to use in understanding document content. Third, authors generally write without the reader's convenience in mind, and the reader's convenience is never reconciled with the author's.

[0005] Thus, although the WWW is a system that has shown a new form of document, it does not process documents mechanically and therefore cannot perform advanced document processing. In other words, advanced document processing requires mechanical processing of documents.

[0006] Accordingly, with mechanical processing of documents as the goal, systems that support such processing are being developed on the basis of natural language research. As document processing based on natural language research, mechanical document processing has been proposed that uses tags attached to a document, on the premise that the author or another party attaches attribute information about the document's internal structure, so-called tags.

[0007]

[Problems to Be Solved by the Invention] With the recent spread of computers and the progress of networking, there is demand for more capable document processing that creates, labels, and modifies text documents through text processing, indexing that depends on document content, and the like. For example, summarization and classification of documents according to the user's needs are desired.

[0008] However, as document processing functions become more capable and diverse, editing from the original electronic document data becomes easier, and various kinds of derivative electronic document data, of which summaries are a typical example, come to be generated. Moreover, with the development of communication systems, electronic document data, including derivative data, can be transmitted easily and quickly. Under these circumstances there is a risk that the copyright held by, for example, the author of the original electronic document data will be infringed in some way. It is therefore required that information about copyright can be presented, particularly for derivative electronic document data.

[0009]

[Means for Solving the Problems] The present invention has been proposed in view of these circumstances, and its object is to provide a document processing apparatus that presents information about the copyright of an electronic document to the user.

[0010] To this end, the document processing apparatus of the present invention comprises: summary creation means capable of creating a summary of an electronic document; and output control means that performs control to present and output the summary created by the summary creation means and, when a tag indicating copyright information has been added to the electronic document from which the summary is derived, also performs control so that the copyright information indicated by that tag is presented and output. In addition, when a tag indicating copyright information has been added to the electronic document from which the summary is created, the summary creation means adds a tag indicating copyright information to the created summary.
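The relationship between the summary creation means and the output control means can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the apparatus itself: the `Document` class, the `copyright` field that stands in for the copyright tag, and the "keep the leading sentences" summarizer are all hypothetical placeholders introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Document:
    """An electronic document; `copyright` models the copyright tag (None if absent)."""
    sentences: List[str]
    copyright: Optional[str] = None


def make_summary(source: Document, max_sentences: int = 1) -> Document:
    """Summary creation (placeholder): keep the leading sentences.

    If the source carries a copyright tag, the same tag is added to the summary.
    """
    return Document(sentences=source.sentences[:max_sentences],
                    copyright=source.copyright)


def present(summary: Document) -> str:
    """Output control: the text to present, plus copyright information when tagged."""
    lines = list(summary.sentences)
    if summary.copyright is not None:
        lines.append(f"Copyright: {summary.copyright}")
    return "\n".join(lines)


source = Document(["First sentence.", "Second sentence."], copyright="Example Author")
print(present(make_summary(source)))
```

Because the tag travels with the summary object, a summary that is stored or transmitted onward still carries the copyright information of its source.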

[0011] The document processing method of the present invention performs: a summary creation procedure that creates a summary of an electronic document; and an output control procedure that presents and outputs the summary created in the summary creation procedure and, when a tag indicating copyright information has been added to the electronic document from which the summary is derived, also presents and outputs the copyright information indicated by that tag. In the summary creation procedure, when a tag indicating copyright information has been added to the electronic document from which the summary is created, a tag indicating copyright information is added to the created summary. The recording medium of the present invention has recorded on it an operation control program for executing the operations of the above document processing method.

[0012]

[Embodiments of the Invention] Embodiments of the present invention are described below in the following order.
1. Configuration of the document processing system
2. Configuration of the document processing apparatus
3. Document data structure
4. Manual classification processing for document data
4-1 Processing procedure
4-2 Index creation
4-3 Document browsing / classification creation / classification operation
4-4 Classification model creation / registration
5. Automatic classification processing for document data
5-1 Processing procedure
5-2 Automatic classification
6. Summary creation / display processing
7. Functional block configuration of the document processing apparatus

[0013] 1. Configuration of the document processing system
FIG. 1 shows a configuration example of a document processing system that includes the document processing apparatus of the embodiment. The system mainly comprises a document processing apparatus 1, an authoring apparatus 2, a server 3, a document provider 4, and so on.

[0014] FIG. 1 shows the functions of each unit. The document processing apparatus 1, the authoring apparatus 2, the server 3, and the document provider 4 all have reception/transmission functions and can communicate information with one another, as indicated by the solid and broken lines in the figure.

[0015] The communication line 6 shown by a solid line represents a wired communication line (for example, a public switched line, a dedicated line, or the Internet) or a wireless one (for example, satellite communication or a wireless telephone line). The broken lines represent transfer of information by a portable recording medium 32, which may be any of various recording media: disc-shaped media such as optical discs, magneto-optical discs, and magnetic disks; memory cards incorporating, for example, flash memory; or tape media. The units shown can thus exchange electronic documents, tagged electronic documents, identifiers, and various other control data with one another via the communication line 6 or the recording medium 32. In this system the authoring apparatus 2 attaches tags to an electronic document to generate a tagged electronic document. The original, untagged electronic document is called "plain text," and the tagged electronic document is called a "tag file."

[0016] The document provider 4 is the part that supplies the original text data of documents to be provided, that is, plain text: ordinary document data to which the tags described later have not been added. The document provider 4 has a plain text storage function and can send stored plain text to the server 3 or the authoring apparatus 2 via the communication line 6 or the recording medium 32. It also has a document creation function and can create plain text. The document provider 4 need not have a document creation function, however. It only has to be able to supply plain text, and it may supply plain text received from a document author or the like outside the system via the communication line 6 or the recording medium 32.

[0017] The authoring apparatus 2 is the part that generates a tag file by performing authoring processing on plain text supplied from the document provider 4 or the server 3 via the communication line 6 or the recording medium 32. The generated tag file is sent to the server 3 via the communication line 6 or the recording medium 32 and stored in a database in the server 3.

[0018] In addition to the authoring function for performing the authoring processing, the authoring apparatus 2 has an authoring control function that realizes efficient authoring operation by, for example, receiving or requesting the plain text to be authored, issuing search requests to the database of the server 3, and controlling transmission of generated tag files to the server 3. It also has a billing function, by which an authoring fee is charged to the document provider 4 in connection with the authoring operation.

[0019] Although not illustrated, the authoring apparatus 2 may be provided with a document creation function so that it can generate plain text itself, without receiving plain text from the document provider 4, and perform authoring processing on that plain text to generate a tag file.

[0020] In the authoring apparatus 2, an operation control program is provided to realize the authoring, billing, reception/transmission, and authoring control functions. This program may be held inside the apparatus in advance, downloaded from outside the system over the communication line 6, or supplied on the recording medium 32. Supplying the program from outside the system in this way makes it possible, for example, to use a general-purpose personal computer as the authoring apparatus.

[0021] The server 3 has a database, which stores plain text sent from the document provider 4 and tag files sent from the authoring apparatus 2. Under the management of the server 3, the document data (tag files or plain text) stored in the database is provided to the document processing apparatus 1 on the general user side via a recording medium 32 such as a floppy disk or optical disc, or via the communication line 6. The server 3 also has a search function for the database.

[0022] On the general user side, by using the document processing apparatus 1, which has document processing functions, the user can perform various kinds of processing, described later, on document data provided from the server 3 and obtain diverse and sophisticated document information.

[0023] The system configuration of FIG. 1 is only one model for purposes of explanation; actual system configurations can vary widely. For example, there may be many document providers 4, authoring apparatuses 2, servers 3, and so on, or the authoring apparatus 2 may be built on the server 3 side.

[0024] 2. Configuration of the document processing apparatus
The document processing apparatus 1, which is the side that receives document data in the above document processing system, is described below.

[0025] As shown in FIG. 2, the document processing apparatus 1 comprises a main body 10 having a control unit 11 and an interface 12; an input unit 20 that receives input from the user and sends it to the main body 10; a communication unit 21 that transmits and receives signals to and from the outside; a display unit 30 that displays output from the main body 10; a recording/reproducing unit 31 that records information on and reproduces it from the recording medium 32; an audio output unit 33; and an HDD (hard disk drive) 34.

[0026] The main body 10 has the control unit 11 and the interface 12 and constitutes the principal part of the document processing apparatus 1. The control unit 11 has a CPU 13 that executes processing in the document processing apparatus 1, a RAM 14 that is volatile memory, and a ROM 15 that is nonvolatile memory. The CPU 13 performs control for executing programs in accordance with, for example, procedures recorded in the ROM 15, temporarily storing data in the RAM 14 when necessary. The operations of the control unit 11 include, for example, classification processing and summary creation processing on supplied document data, generation of a read-aloud file for speech output, and the document analysis these processes require. The programs and application software needed for these operations are stored in the ROM 15, the HDD 34, or the recording medium 32. The document processing program used by the control unit 11 may be stored in the ROM 15 in advance, as described above, or loaded from the recording medium 32 or the HDD 34. It may also be downloaded, via the communication unit 21 (communication line 6), from a network such as the Internet, where it is provided by an external server or the like.

[0027] The interface 12 is connected to the control unit 11, the input unit 20, the communication unit 21, the display unit 30, the recording/reproducing unit 31, the audio output unit 33, and the HDD 34. Under the control of the control unit 11, the interface 12 handles input of data from the input unit 20, input and output of data to and from the communication unit 21, output of data to the display unit 30, input and output of data to and from the recording/reproducing unit 31, output of data to the audio output unit 33, and input and output of data to and from the HDD 34. Specifically, it adjusts the timing of data input and output between the control unit 11 and each of these units, converts data formats, and so on.

[0028] The input unit 20 is the part that receives the user's input to the document processing apparatus 1. It consists of, for example, a keyboard and a mouse. Using the input unit 20, the user can enter characters such as keywords with the keyboard or select elements of the electronic document displayed on the display unit 30 with the mouse. In what follows, the document data (tag files and the like) handled by the document processing apparatus 1 is sometimes referred to simply as a "document." An "element" is a constituent of a document and includes, for example, the document itself, sentences, and words.

[0029] The communication unit 21 is the part that receives signals sent to the document processing apparatus 1 from outside, for example via the communication line 6, and sends signals onto the communication line 6. The communication unit 21 receives, for example, one or more items of document data (tag files) sent from the server 3 and passes the received data to the main body 10. It can of course also send data to external devices via the communication line 6.

[0030] The display unit 30 is the part that displays characters and image information as output of the document processing apparatus 1. It consists of, for example, a cathode ray tube (CRT) or a liquid crystal display (LCD), and displays, for example, one or more windows in which characters, figures, and the like are shown.

[0031] The recording/reproducing unit 31 records data on and reproduces data from a recording medium 32 such as a floppy (registered trademark) disk or an optical disc. Although a floppy disk (magnetic disk) and an optical disc are given here as examples of the recording medium 32, any portable medium, such as a magneto-optical disc, memory card, or magnetic tape, can of course serve as the recording medium 32, as noted above. The recording/reproducing unit 31 need only be a recording/reproducing device suited to the medium (a disk drive, card drive, or the like).

[0032] If a document processing program for processing documents is recorded on the recording medium 32, the recording/reproducing unit 31 can read the program from the recording medium 32 and supply it to the control unit 11. If document data is recorded on the recording medium 32, the recording/reproducing unit 31 can read it and supply it to the control unit 11. For the document processing apparatus 1, this is a way of inputting document data other than reception by the communication unit 21. The control unit 11 can also have the recording/reproducing unit 31 record document data processed by the document processing apparatus 1 on the recording medium 32.

[0033] The audio output unit 33 is the part that outputs a document, as output of the document processing apparatus 1, in the form of read-aloud speech. When the control unit 11 supplies an audio signal that it has generated by speech synthesis based on document information (the read-aloud file described later), the audio output unit 33 performs output processing on that signal, and thus functions, together with the display unit 30, as output means of the document processing apparatus 1.

[0034] The HDD 34 provides a large-capacity recording area in the document processing apparatus 1 and records and reproduces information under the control of the control unit 11. The HDD 34 is used to store application programs for the various kinds of processing executed by the control unit 11, for example a program for speech synthesis, and can also be used, for example, to store document data taken into the document processing apparatus 1.

[0035] 3. Document data structure
Next, the structure of document data in this example is described. In this example, document processing is performed by referring to tags, which are attribute information attached to the document. The tags used in this example include syntactic tags, which indicate the structure of the document, and semantic and pragmatic tags, which enable mechanical understanding of document content across multiple languages.

[0036] Syntactic tags include those that describe the internal structure of a document. As shown in FIG. 3, the internal structure given by tagging consists of elements such as the document, sentences, and vocabulary elements, linked to one another by normal links and reference links. In the figure, a white circle "○" denotes an element, and the lowest white circles are vocabulary elements corresponding to words at the lowest level of the document. Solid lines are normal links, which show the connections between elements such as documents, sentences, and vocabulary elements. Broken lines are reference links, which show dependency relations established by referring and being referred to. From top to bottom, the internal structure of a document consists of the document, subdivisions, paragraphs, sentences, subsentential segments, ..., and vocabulary elements. Of these, subdivisions and paragraphs are optional.
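One way to picture this internal structure is as a tree whose parent-child edges are the normal links, with optional extra pointers for the reference links. The sketch below is an illustration only: the `Element` class, its field names, and the two sample sentences are assumptions introduced here, not a data structure defined by the specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """A node in the document's internal structure."""
    kind: str                                  # e.g. "document", "sentence", "word"
    text: str = ""                             # surface text of a vocabulary element
    children: List["Element"] = field(default_factory=list)  # normal links
    refers_to: Optional["Element"] = None      # reference link (referrer -> referent)

    def add(self, child: "Element") -> "Element":
        self.children.append(child)
        return child

    def words(self) -> List[str]:
        """Collect vocabulary elements in document order by following normal links."""
        if not self.children:
            return [self.text] if self.text else []
        out: List[str] = []
        for child in self.children:
            out.extend(child.words())
        return out


doc = Element("document")
first = doc.add(Element("sentence"))
subject = first.add(Element("word", "Time"))
first.add(Element("word", "flies"))
second = doc.add(Element("sentence"))
pronoun = second.add(Element("word", "It"))
second.add(Element("word", "passes"))
pronoun.refers_to = subject  # reference link across sentences

print(doc.words())
print(pronoun.refers_to.text)
```

Subdivisions and paragraphs are omitted here, which matches their optional status: a sentence can hang directly from the document node.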

[0037] Semantic and pragmatic tagging, on the other hand, includes tags that describe information such as meaning, for example the sense of a polysemous word. Tagging in this example is in the form of XML (Extensible Markup Language), which is similar to HTML (Hyper Text Markup Language).

[0038] An example of tagging is shown next, but tagging of documents is not limited to this method. Examples of English and Japanese documents are given below, but description of internal structure by tagging can be applied to other languages in the same way.

[0039] For example, the sentence "Time flies like an arrow." can be tagged as follows. Items enclosed in < > are tags attached to the document.

[0040] <sentence><noun-phrase word-sense="time0">time</noun-phrase><verb-phrase><verb word-sense="fly1">flies</verb><adverb-phrase><adverb word-sense="like0">like</adverb><noun-phrase>an <noun word-sense="arrow0">arrow</noun></noun-phrase></adverb-phrase></verb-phrase>.</sentence>

[0041] Here <sentence>, <noun>, <noun-phrase>, <verb>, <verb-phrase>, <adverb>, and <adverb-phrase> denote, respectively, a sentence, a noun, a noun phrase, a verb, a verb phrase, an adjective/adverb (including prepositional or postpositional phrases), and an adjective/adverb phrase. That is, they represent the syntactic structure of the sentence.

[0042] These tags are placed immediately before the start and immediately after the end of an element. The tag placed immediately after the end of an element indicates the end of the element with the symbol "/". An element is a syntactic constituent, that is, a phrase, a clause, or a sentence. The word sense "time0" refers to the 0th of the multiple senses of the word "time". Specifically, the word "time" has at least noun, adjective, and verb senses, and here it is indicated that "time" is a noun (the 0th sense). Similarly, the word "orange" has at least the senses of a plant name, a color, and a fruit, and these too can be distinguished by word sense.
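Since the tagging takes the form of XML, a tagged sentence of this kind can be read with an ordinary XML parser. The following sketch uses Python's standard `xml.etree.ElementTree`. The English tag names (`noun-phrase`, `verb`, and so on), the `word-sense` attribute name, and the spaces inserted between words are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# "Time flies like an arrow." with syntactic tags and word-sense attributes.
TAGGED = (
    '<sentence>'
    '<noun-phrase word-sense="time0">time</noun-phrase> '
    '<verb-phrase><verb word-sense="fly1">flies</verb> '
    '<adverb-phrase><adverb word-sense="like0">like</adverb> '
    '<noun-phrase>an <noun word-sense="arrow0">arrow</noun></noun-phrase>'
    '</adverb-phrase></verb-phrase>.'
    '</sentence>'
)

root = ET.fromstring(TAGGED)

# Plain text: concatenate all character data in document order.
plain = "".join(root.itertext())

# Word senses: every element that carries a word-sense attribute.
senses = [(el.text, el.get("word-sense"))
          for el in root.iter() if el.get("word-sense")]

print(plain)
print(senses)
```

Stripping the tags recovers the plain text, while the attributes expose which sense of each word the author intended, for example the noun sense of "time".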

[0043] In this example, the syntactic structure of document data can be displayed in a window 101 on the display unit 30, as shown in FIG. 4. In this window 101, vocabulary elements are displayed in the right half 103 and the internal structure of the sentence in the left half 102.

【００４４】例えば図示するようにこのウィンドウ１０
１には、タグ付けにより内部構造が記述された文章「Ａ
氏のＢ会が終わったＣ市で、一部の大衆紙と一般紙がそ
の写真報道を自主規制する方針を紙面で明らかにし
た。」の一部が表示されている。この文書のタグ付けの
例は次のようになる。For example, as illustrated, this window 101 displays part of the sentence "In C City, where Mr. A's B meeting has ended, some popular newspapers and general newspapers announced in print a policy of voluntarily restricting such photo coverage," whose internal structure is described by tagging. An example of tagging this document is as follows.

【００４５】＜文書＞＜文＞＜副詞句関係＝“場所”
＞＜名詞句＞＜副詞句場所＝“Ｃ市”＞＜副詞句関係＝“主語”＞＜名詞句識別子＝“Ｂ
会”＞＜副詞句関係＝“所有”＞＜人名識別子＝
“Ａ氏”＞Ａ氏＜／人名＞の＜／副詞句＞＜組織名識別
子＝“Ｂ会”＞Ｂ会＜／組織名＞＜／名詞句＞が＜／副
詞句＞終わった＜／副詞句＞＜地名識別子＝“Ｃ市”＞Ｃ市
＜／地名＞＜／名詞句＞で、＜／副詞句＞＜副詞句関
係＝“主語”＞＜名詞句識別子＝“press” 統語＝“並列”＞＜名詞句＞＜副詞句＞一部の＜／副詞
句＞大衆紙＜／名詞句＞と＜名詞＞一般紙＜／名詞＞＜
／名詞句＞が＜／副詞句＞＜副詞句関係＝“目的語”＞＜副詞句関係＝“内
容” 主語＝“press”＞＜副詞句関係＝“目的語”＞＜名詞句＞＜副詞句＞＜
名詞共参照＝“Ｂ会” ＞そ＜／名詞＞の＜／副詞句＞写真報道＜／名詞句＞を
＜／副詞句＞自主規制する＜／副詞句＞方針を＜／副詞句＞＜副詞句関係＝“位置”＞紙面で＜／副詞句＞明らかにした。＜／文＞＜／文書＞<document><sentence><adverb phrase relation="place"><noun phrase><adverb phrase place="C City"><adverb phrase relation="subject"><noun phrase identifier="B meeting"><adverb phrase relation="possession"><person name identifier="Mr. A">Mr. A</person name>'s</adverb phrase><organization name identifier="B meeting">B meeting</organization name></noun phrase></adverb phrase>has ended</adverb phrase><place name identifier="C City">C City</place name></noun phrase>in,</adverb phrase><adverb phrase relation="subject"><noun phrase identifier="press" syntax="parallel"><noun phrase><adverb phrase>some</adverb phrase>popular newspapers</noun phrase>and<noun>general newspapers</noun></noun phrase></adverb phrase><adverb phrase relation="object"><adverb phrase relation="content" subject="press"><adverb phrase relation="object"><noun phrase><adverb phrase><noun coreference="B meeting">its</noun></adverb phrase>photo coverage</noun phrase></adverb phrase>voluntarily restrict</adverb phrase>a policy of</adverb phrase><adverb phrase relation="position">in print</adverb phrase>announced.</sentence></document>

【００４６】このようにタグ付されることで、各一対の
タグ＜＞〜＜／＞によって文書の構造が表現され
る。例えば＜文書＞〜＜／文書＞で１つの文書の範囲が
示され、同様に＜文＞〜＜／文＞で１つの文の範囲が示
される。また例えば、＜名詞句識別子＝“Ｂ会”＞〜
＜／名詞句＞により、「Ａ氏のＢ会」という部分が「Ｂ
会」を識別子とする名詞句として表現される。即ち上記
タグ付により、図４の左半面１０２に示した文の内部構
造が表現される。With this tagging, each pair of tags <> ... </> expresses the structure of the document. For example, <document> ... </document> marks the extent of one document, and likewise <sentence> ... </sentence> marks the extent of one sentence. As another example, <noun phrase identifier="B meeting"> ... </noun phrase> expresses the part "Mr. A's B meeting" as a noun phrase whose identifier is "B meeting". In this way, the tagging expresses the internal structure of the sentence shown in the left half 102 of FIG. 4.

【００４７】さらに、この文書においては、「一部の大
衆紙と一般紙」は、統語＝“並列”というタグにより並
列であることが表されている。並列の定義は、係り受け
関係を共有するということである。特に何も指定がない
場合、たとえば、＜名詞句関係＝ｘ＞＜名詞＞Ａ＜／名
詞＞＜名詞＞Ｂ＜／名詞＞＜／名詞句＞は、ＡがＢに依
存関係のあることを表す。関係＝ｘは関係属性を表す。Furthermore, in this document, "some popular newspapers and general newspapers" is marked as a parallel construction by the tag syntax="parallel". Parallel means that the members share a dependency relation. When nothing is specified, for example, <noun phrase relation=x><noun>A</noun><noun>B</noun></noun phrase> indicates that A depends on B. relation=x denotes a relation attribute.

【００４８】関係属性は、統語、意味、修辞についての
相互関俵を記述する。主語、目的語、間接目的語のよう
な文法機能、動作主、被動作者、受益者などのような主
題役割、および理由、結果などのような修辞関係はこの
関係属性により記述される。本例では、主語、目的語、
間接目的語のような比較的容易な文法機能について関係
属性を記述する。The relation attribute describes mutual relations concerning syntax, semantics, and rhetoric. Grammatical functions such as subject, object, and indirect object; thematic roles such as agent, patient, and beneficiary; and rhetorical relations such as reason and result are described by this relation attribute. In this example, relation attributes are described for relatively simple grammatical functions such as subject, object, and indirect object.

【００４９】また、この文書においては、“Ａ氏”、
“Ｂ会”、“Ｃ市”のような固有名詞について、地名、
人名、組織名等のタグにより属性が記述されている。こ
れら地名、人名、組織名等のタグが付与されることで、
その語が固有名詞であることが表現される。In this document, the attributes of proper nouns such as "Mr. A", "B meeting", and "C City" are described by tags such as place name, person name, and organization name. Attaching these place name, person name, and organization name tags expresses that the word is a proper noun.
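The tagged structure described above can be pictured as a tree of elements. The following is a minimal Python sketch, not the patent's data format: the `Element` class, its field names, and the `proper_nouns` helper are hypothetical, and the English tag names are glosses of the Japanese tags. It models only the fragment "Mr. A's B meeting".

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    """A tagged span: a phrase, a clause, a sentence, or the whole document."""
    tag: str                                      # e.g. "noun phrase"
    attrs: dict = field(default_factory=dict)     # e.g. {"identifier": "B meeting"}
    children: list = field(default_factory=list)  # nested Element objects or plain strings

    def text(self) -> str:
        """Concatenate the plain text under this element, dropping the tags."""
        return "".join(c if isinstance(c, str) else c.text() for c in self.children)

    def walk(self):
        """Yield this element and every nested element, depth first."""
        yield self
        for child in self.children:
            if isinstance(child, Element):
                yield from child.walk()

# Tags that mark a word as a proper noun (glossed from the description above).
PROPER_NOUN_TAGS = {"person name", "organization name", "place name"}

def proper_nouns(root):
    """Collect the identifiers of all proper-noun elements under root."""
    return [e.attrs["identifier"] for e in root.walk() if e.tag in PROPER_NOUN_TAGS]

# The noun phrase "Mr. A's B meeting" from the tagging example.
b_meeting = Element("noun phrase", {"identifier": "B meeting"}, [
    Element("adverb phrase", {"relation": "possession"}, [
        Element("person name", {"identifier": "Mr. A"}, ["Mr. A"]),
        "'s ",
    ]),
    Element("organization name", {"identifier": "B meeting"}, ["B meeting"]),
])

print(b_meeting.text())
print(proper_nouns(b_meeting))
```

Keeping plain strings and nested elements in the same `children` list preserves the original word order, so the untagged sentence can always be recovered from the tree.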

【００５０】４．文書データに対する手動分類処理４−１処理手順本例の文書処理装置１では、例えば通信部２１（又は記
録／再生部３１）により外部から文書データが取り込ま
れると、その文書データを内容に応じて分類する処理を
行う。なお、以下の説明では、外部からの文書データは
通信部２１を介して取り込まれるとして述べていくが、
その説明は、外部からフロッピーディスク等の可搬性メ
ディアの形態で供給され、記録／再生部３１から文書デ
ータが取り込まれる場合も同様となるものである。
4. Manual classification processing of document data
4-1 Processing procedure
In the document processing apparatus 1 of this example, when document data is taken in from outside by, for example, the communication unit 21 (or the recording/reproducing unit 31), processing is performed to classify the document data according to its content. The following description assumes that external document data is taken in via the communication unit 21, but it applies equally when document data is supplied from outside on a portable medium such as a floppy disk and taken in from the recording/reproducing unit 31.

【００５１】分類処理としては、文書データ内容に応じ
てユーザーが手動で分類する手動分類処理と、文書処理
装置１が自動的に分類する自動分類処理がある。これら
の分類処理は、後述する分類モデルに基づいて行われる
わけであるが、文書処理装置１においては、初期状態で
は分類モデルは存在しない。そのため初期状態にある時
点では、手動分類処理として、分類モデルの作成を含む
分類処理が必要になる。そして、分類モデルが生成され
た後においては、入力された文書データに対して自動分
類処理が可能となるものである。まずここでは、最初に
実行することが必要とされる手動分類処理について説明
する。即ちこの手動分類処理とは、初期状態にある文書
処理装置１が外部から送られた文書データを受信した際
に、ユーザーの操作に基づいて、制御部１１が分類モデ
ルの作成及び文書データの分類を行う動作となる。The classification processing comprises manual classification processing, in which the user classifies documents manually according to the content of the document data, and automatic classification processing, in which the document processing apparatus 1 classifies them automatically. Both are performed on the basis of a classification model described later, but in the document processing apparatus 1 no classification model exists in the initial state. In the initial state, therefore, classification processing that includes creation of a classification model is required as the manual classification processing. Once the classification model has been generated, automatic classification of input document data becomes possible. The manual classification processing, which must be performed first, is described here. That is, the manual classification processing is the operation in which, when the document processing apparatus 1 in its initial state receives document data sent from outside, the control unit 11 creates a classification model and classifies the document data on the basis of user operations.

【００５２】まず手動分類処理としての全体の処理手順
を図５に示す。なお、各処理ステップの詳細な処理につ
いては後述する。FIG. 5 shows the overall processing procedure as the manual classification processing. The detailed processing of each processing step will be described later.

【００５３】図５のステップＦ１１は、文書処理装置１
の受信部２１による文書受信処理を示している。このス
テップＦ１１では、受信部２１は、たとえば通信回線を
介して送信された１又は複数の文書を受信する。受信部
２１は、受信した文書を文書処理装置の本体１０に送
る。制御部１１は供給された１又は複数の文書データを
ＲＡＭ１４又はＨＤＤ３４に格納する。Step F11 in FIG. 5 represents the document receiving process performed by the receiving unit 21 of the document processing apparatus 1. In step F11, the receiving unit 21 receives one or more documents transmitted via, for example, a communication line. The receiving unit 21 sends the received documents to the main body 10 of the document processing apparatus. The control unit 11 stores the supplied one or more pieces of document data in the RAM 14 or the HDD 34.

【００５４】ステップＦ１２では、文書処理装置１の制
御部１１は、受信部２１から送られた複数の文書の特徴
を抽出し、それぞれの文書の特徴情報すなわちインデッ
クスを作成する。制御部１１は、作成したインデックス
を、たとえばＲＡＭ１４又はＨＤＤ３４に記憶させる。
後述するがインデックスは、その文書に特徴的な、固有
名詞、固有名詞以外の語義などを含むものであり、文書
の分類や検索に利用できるものである。In step F12, the control unit 11 of the document processing apparatus 1 extracts the features of the documents sent from the receiving unit 21 and creates feature information, that is, an index, for each document. The control unit 11 stores the created indexes in, for example, the RAM 14 or the HDD 34. As described later, an index contains proper nouns, word senses other than proper nouns, and the like that are characteristic of the document, and can be used for classifying and searching documents.

【００５５】ステップＦ１３の文書閲覧は、ユーザーの
必要に応じて実行される処理である。つまりユーザーの
操作に応じて行われる。なお、このステップＦ１３や次
のステップＦ１４は、ユーザ操作に基づく処理である。
入力された文書データに対しては、ユーザーは所要の操
作を行うことにより、表示部３０の画面上で、その文書
内容を閲覧することができる。そして文書閲覧中は、ユ
ーザーは画面上のアイコン等に対する操作により、例え
ば後述する要約作成などの各種処理を指示できるが、こ
の手動分類処理に関しては、ステップＦ１４として示す
ように、分類項目の作成及び分類操作としての処理に進
むことになる。ステップＦ１４では、ユーザーが分類項
目（なお本明細書では、分類項目のことをカテゴリとも
いう）を設定する操作を行うことに応じて、制御部１１
は分類項目を生成／表示していく。またユーザーが文書
データを、設定された分類項目に振り分けていく操作も
行うことになり、それに応じて制御部１１は文書データ
の振り分け／表示を行うことになる。The document browsing in step F13 is performed as needed by the user, that is, in response to user operations. Both step F13 and the following step F14 are processes based on user operations. By performing the required operation, the user can view the content of input document data on the screen of the display unit 30. While browsing a document, the user can instruct various processes, such as the summary creation described later, by operating icons and the like on the screen. For the manual classification processing, however, the process proceeds to the creation of classification items and the classification operation shown as step F14. In step F14, as the user performs operations to set classification items (classification items are also called categories in this specification), the control unit 11 generates and displays the classification items. The user also performs operations to sort the document data into the classification items that have been set, and the control unit 11 sorts and displays the document data accordingly.

【００５６】ステップＦ１５では、制御部１１は、ステ
ップＦ１４でユーザーが行った分類項目作成及び分類操
作に応じて、分類モデルを作成する。分類モデルは、文
書を分類する複数の分類項目（カテゴリ）から構成され
るとともに、各カテゴリに対して各文書のインデックス
（ステップＦ１２で作成した各文書のインデックス）を
対応づけることで、分類状態を規定するデータである。
このような分類モデルを生成したら、ステップＦ１６
で、その分類モデルを登録する。即ち制御部１１は、分
類モデルをたとえばＲＡＭ１４に記憶させることで登録
を行う。以上の図５の処理により、文書処理状態１が初
期状態にある時に入力された１又は複数の各文書データ
について、手動分類及び分類モデルの作成が行われたこ
とになる。この図５のステップＦ１２以下の処理につい
て詳しく述べていく。In step F15, the control unit 11 creates a classification model in accordance with the classification item creation and classification operations performed by the user in step F14. The classification model is data that consists of a plurality of classification items (categories) for classifying documents and that defines the classification state by associating the index of each document (the index created for each document in step F12) with a category. Once such a classification model has been generated, it is registered in step F16. That is, the control unit 11 registers the classification model by storing it in, for example, the RAM 14. Through the processing of FIG. 5 described above, manual classification and creation of a classification model are carried out for the one or more pieces of document data that were input while the document processing apparatus 1 was in the initial state. The processing from step F12 onward in FIG. 5 is described in detail below.
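As a rough illustration of steps F14 to F16, a classification model can be thought of as a mapping from each category to the indexes of the documents the user sorted into it. The sketch below is an assumption about one possible representation, not the patent's actual data layout: `build_classification_model`, the document IDs, and the use of word sets as stand-in indexes are all hypothetical.

```python
def build_classification_model(assignments, indexes):
    """Group document indexes under the category each document was sorted into.

    assignments: dict mapping document ID -> category chosen by the user (step F14)
    indexes:     dict mapping document ID -> index created in step F12
    """
    model = {}
    for doc_id, category in assignments.items():
        model.setdefault(category, []).append(indexes[doc_id])
    return model

# Stand-in indexes: here simply sets of characteristic words.
indexes = {
    "doc1": {"prime minister", "tax cut"},
    "doc2": {"newspaper", "photo coverage"},
    "doc3": {"tax cut", "budget"},
}
# The user's manual sorting of documents into categories.
assignments = {"doc1": "politics", "doc2": "media", "doc3": "politics"}

model = build_classification_model(assignments, indexes)
print(sorted(model))
```

Registering the model (step F16) then amounts to keeping `model` in storage so that later documents can be compared against the indexes collected under each category.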

【００５７】４−２インデックス作成ステップＦ１４では、制御部１１は入力された文書デー
タについてインデックスの作成を行う。まず、或る１つ
の文書データに対して作成されたインデックスの具体例
を示す。4-2 Index creation. In step F12, the control unit 11 creates an index for the input document data. First, a specific example of an index created for one piece of document data is shown.

【００５８】
＜インデックス日付＝“AAAA/BB/CC” 時刻＝“DD:EE:FF” 文書アドレス＝ “1234”＞
＜ユーザの操作履歴最大要約サイズ＝“100”＞
＜選択エレメントの数＝“10”＞ピクチャーテル＜／選択＞
・・・
＜／ユーザの操作履歴＞
＜要約＞減税規模、触れず−Ｘ首相の会見＜／要約＞
＜語語義＝“0003” 中心活性値＝“140.6”＞触れず＜／語＞
＜語語義＝“0105” 識別子＝“Ｘ” 中心活性値＝“67.2”＞首相＜／語＞
＜人名識別子＝“Ｘ” 語語義＝“6103” 中心活性値＝“150.2”＞Ｘ首相＜／語／人名＞
＜語語義＝“5301” 中心活性値＝“120.6”＞求めた＜／語＞
＜語語義＝“2350” 識別子＝“Ｘ” 中心活性値＝“31.4”＞首相＜／語＞
＜語語義＝“9582” 中心活性値＝“182.3”＞強調した＜／語＞
＜語語義＝“2595” 中心活性値＝“93.6”＞触れる＜／語＞
＜語語義＝“9472” 中心活性値＝“12.0”＞予告した＜／語＞
＜語語義＝“4934” 中心活性値＝“46.7”＞触れなかった＜／語＞
＜語語義＝“0178” 中心活性値＝“175.7”＞釈明した＜／語＞
＜語語義＝“7248” 識別子＝“Ｘ” 中心活性値＝“130.6”＞私＜／語＞
＜語語義＝“3684” 識別子＝“Ｘ” 中心活性値＝“121.9”＞首相＜／語＞
＜語語義＝“1824” 中心活性値＝“144.4”＞訴えた＜／語＞
＜語語義＝“7289” 中心活性値＝“176.8”＞見せた＜／語＞
＜／インデックス＞
<index date="AAAA/BB/CC" time="DD:EE:FF" document address="1234">
<user operation history maximum summary size="100">
<selection number of elements="10">Picturetel</selection>
...
</user operation history>
<summary>Scale of tax cut not mentioned - Prime Minister X's press conference</summary>
<word word sense="0003" central activation value="140.6">not mentioning</word>
<word word sense="0105" identifier="X" central activation value="67.2">Prime Minister</word>
<person name identifier="X" word sense="6103" central activation value="150.2">Prime Minister X</word/person name>
<word word sense="5301" central activation value="120.6">requested</word>
<word word sense="2350" identifier="X" central activation value="31.4">Prime Minister</word>
<word word sense="9582" central activation value="182.3">emphasized</word>
<word word sense="2595" central activation value="93.6">mention</word>
<word word sense="9472" central activation value="12.0">gave advance notice</word>
<word word sense="4934" central activation value="46.7">did not mention</word>
<word word sense="0178" central activation value="175.7">explained</word>
<word word sense="7248" identifier="X" central activation value="130.6">I</word>
<word word sense="3684" identifier="X" central activation value="121.9">Prime Minister</word>
<word word sense="1824" central activation value="144.4">appealed</word>
<word word sense="7289" central activation value="176.8">showed</word>
</index>

【００５９】このインデックスにおいては、＜インデッ
クス＞および＜／インデックス＞は、インデックスの始
端および終端を、＜日付＞および＜時刻＞はこのインデ
ックスが作成された日付および時刻を、＜要約＞および
＜／要約＞はこのインデックスの内容の要約の始端およ
び終端を、それぞれ示している。また、＜語＞および＜
／語＞は語の始端および終端を示している。さらに例え
ば、語義＝“0003”は、第３番目の語義であることを示
している。他についても同様である。上述したように、
同じ語でも複数の意味を持つ場合があるので、それを区
別するために語義ごとに番号が予め決められており、そ
の該当する語義が番号で表されているものである。In this index, <index> and </index> mark the start and end of the index, <date> and <time> give the date and time at which the index was created, and <summary> and </summary> mark the start and end of a summary of the contents of the index. <word> and </word> mark the start and end of a word. Further, for example, word sense="0003" indicates the third word sense. The same applies to the others. As described above, the same word may have multiple meanings, so a number is assigned in advance to each word sense in order to distinguish them, and the applicable word sense is expressed by its number.

【００６０】また、＜ユーザの操作履歴＞および＜／ユ
ーザの操作履歴＞は、ユーザの操作履歴の始端および終
端を、＜選択＞および＜／選択＞は、選択されたエレメ
ントの始端および終端を、それぞれ示している。最大要
約サイズ＝“100”は、要約の最大のサイズが１００文
字であることを、エレメントの数＝“10”は、選択され
たエレメントの数が１０であることを示している。<user operation history> and </user operation history> mark the start and end of the user's operation history, and <selection> and </selection> mark the start and end of a selected element. maximum summary size="100" indicates that the maximum size of the summary is 100 characters, and number of elements="10" indicates that the number of selected elements is 10.
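To make the shape of an index entry concrete, the sketch below holds a few of the word entries from the example as Python records. It is an illustration under stated assumptions: the `IndexEntry` class and its field names are hypothetical, the English glosses of the Japanese words are approximate, and ranking entries by central activation value is an assumed use of the stored values rather than something the description prescribes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class IndexEntry:
    word: str                         # surface form (English gloss of the Japanese word)
    sense: str                        # word sense number, e.g. "0003"
    activation: float                 # central activation value
    identifier: Optional[str] = None  # set when the entry refers to a named entity

# A subset of the entries in the index example above.
entries = [
    IndexEntry("not mentioning", "0003", 140.6),
    IndexEntry("Prime Minister", "0105", 67.2, identifier="X"),
    IndexEntry("emphasized", "9582", 182.3),
    IndexEntry("explained", "0178", 175.7),
    IndexEntry("showed", "7289", 176.8),
]

def top_entries(entries, n):
    """Return the n entries with the highest central activation value."""
    return sorted(entries, key=lambda e: e.activation, reverse=True)[:n]

for entry in top_entries(entries, 3):
    print(entry.word, entry.sense, entry.activation)
```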

【００６１】この例のように、インデックスは、その文
書に特徴的な、固有名詞、固有名詞以外の語義などを含
むものである。例えばこのようなインデックスを作成す
るステップＦ１２の処理を、図６〜図９で説明する。な
お、図６は１つの文書データに対するインデックス作成
処理を示しており、従って複数の文書データについて処
理を行う場合は、各文書データについてこの図６の処理
が行われることになる。また図６のステップＦ３１の詳
細な処理を図８に示し、さらに図８のステップＦ４３の
詳細な処理を図９に示している。As in this example, the index includes proper nouns, meanings other than proper nouns, and the like, which are characteristic of the document. For example, the process of step F12 for creating such an index will be described with reference to FIGS. FIG. 6 shows an index creation process for one document data. Therefore, when a process is performed for a plurality of document data, the process of FIG. 6 is performed for each document data. FIG. 8 shows the detailed processing of step F31 in FIG. 6, and FIG. 9 shows the detailed processing of step F43 in FIG.

【００６２】上述した図５のステップＦ１２のインデッ
クス作成処理としては、まず図６のステップＦ３１の活
性拡散が行われる。この活性拡散とは、文書データにつ
いて、エレメントの中心活性値を文書の内部構造に基づ
いて拡散することで、中心活性値の高いエレメントと関
わりのあるエレメントにも高い中心活性値を与えるよう
な処理である。即ち、文書を構成する各エレメントに対
して初期値としての中心活性値を与えた後、その中心活
性値を、文書の内部構造、具体的にはリンク構造に基づ
いて拡散する。この中心活性値は、タグ付けによる内部
構造に応じて決定されるので、文書の特徴の抽出等に利
用されるものである。制御部１１は、このステップＦ３
１として、活性拡散を行い、活性拡散の結果として得ら
れた各エレメントの中心活性値を、たとえばＲＡＭ１４
に記憶させることになる。In the index creation processing of step F12 in FIG. 5 described above, the active diffusion of step F31 in FIG. 6 is performed first. Active diffusion is processing that spreads the central activation values of elements in the document data according to the internal structure of the document, so that elements related to an element with a high central activation value are also given high central activation values. That is, each element making up the document is first given a central activation value as an initial value, and that central activation value is then spread according to the internal structure of the document, specifically its link structure. Because the central activation value is determined according to the internal structure expressed by tagging, it is used for purposes such as extracting the features of the document. In step F31, the control unit 11 performs active diffusion and stores the resulting central activation value of each element in, for example, the RAM 14.

【００６３】ステップＦ３１の活性拡散について、図７
〜図９で詳しく説明していく。まずエレメントとエレメ
ントのリンク構造の例を図７に示す。図７においては、
文書を構成するエレメントとリンクの構造の一部とし
て、エレメントＥ１、Ｅ２の周辺を示している。Ｅ１〜
Ｅ８はエレメントの例であり、この中でエレメントＥ
１、Ｅ２に注目して説明する。The active diffusion in step F31 is described in detail with reference to FIGS. 7 to 9. First, FIG. 7 shows an example of a link structure between elements. FIG. 7 shows the area around elements E1 and E2 as part of the structure of elements and links that make up a document. E1 to E8 are examples of elements, and the description focuses on elements E1 and E2.

【００６４】エレメントＥ１の中心活性値はｅ１である
とし、またエレメントＥ２の中心活性値はｅ２であると
する。このエレメントＥ１，Ｅ２は、リンクＬ１２（上
述した通常リンクもしくは参照リンク）にて接続されて
いる。リンクＬ１２のエレメントＥ１に接続する端点を
Ｔ１２、エレメントＥ２に接続する端点をＴ２１とす
る。エレメントＥ１は、さらにエレメントＥ３，Ｅ４，
Ｅ５と、それぞれリンクＬ１３，Ｌ１４，Ｌ１５で接続
されている。各リンクＬ１３，Ｌ１４，Ｌ１５における
エレメントＥ１側の端点をそれぞれＴ１３，Ｔ１４，Ｔ
１５とする。またエレメントＥ２は、エレメントＥ６，
Ｅ７，Ｅ８とも、それぞれリンクＬ２６，Ｌ２７，Ｌ２
８で接続されている。各リンクＬ２６，Ｌ２７，Ｌ２８
におけるエレメントＥ２側の端点をそれぞれＴ２６，Ｔ
２７，Ｔ２８とする。このようなリンク構造の例を用い
ながら、図８、図９の活性拡散処理を説明していく。Let the central activation value of element E1 be e1 and that of element E2 be e2. Elements E1 and E2 are connected by link L12 (a normal link or a reference link, as described above). The endpoint of link L12 that connects to element E1 is T12, and the endpoint that connects to element E2 is T21. Element E1 is further connected to elements E3, E4, and E5 by links L13, L14, and L15, respectively. The endpoints of links L13, L14, and L15 on the element E1 side are T13, T14, and T15, respectively. Element E2 is likewise connected to elements E6, E7, and E8 by links L26, L27, and L28, respectively. The endpoints of links L26, L27, and L28 on the element E2 side are T26, T27, and T28, respectively. The active diffusion processing of FIGS. 8 and 9 is described using this example link structure.

【００６５】図８のステップＦ４１で制御部１１は、イ
ンデックス作成対象としての文書データについて活性拡
散を開始するにあたり、まず文書データの全エレメント
について中心活性値の初期設定を行う。中心活性値の初
期値としては、例えば固有名詞や、ユーザーが選択（ク
リック）したエレメント等に高い値を与えるようにす
る。また制御部１１は、参照リンクと通常リンクに関し
て、エレメントを連結するリンクの端点Ｔ(xx)の端点活
性値を０に設定する。制御部１１は、このように付与し
た端点活性値の初期値を、たとえばＲＡＭ１４に記憶さ
せる。In step F41 of FIG. 8, before starting active diffusion on the document data to be indexed, the control unit 11 first initializes the central activation values of all elements of the document data. As initial central activation values, high values are given to, for example, proper nouns and elements selected (clicked) by the user. For reference links and normal links, the control unit 11 also sets the endpoint activation value of each endpoint T(xx) of the links connecting elements to 0. The control unit 11 stores the initial endpoint activation values assigned in this way in, for example, the RAM 14.
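The initialization in step F41 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the dictionary layout, and the concrete initial values `HIGH` and `DEFAULT` are assumptions, since the description only says that proper nouns and user-selected elements receive high values and that every endpoint starts at 0.

```python
HIGH, DEFAULT = 1.0, 0.1  # hypothetical initial values; the description gives no numbers

def init_activations(elements, links, proper_nouns=(), selected=()):
    """Return (central, endpoint) dictionaries in their initial state.

    central[e]       is the central activation value of element e.
    endpoint[(a, b)] is the activation of the endpoint on a's side of link a-b.
    """
    boosted = set(proper_nouns) | set(selected)
    central = {e: (HIGH if e in boosted else DEFAULT) for e in elements}
    endpoint = {}
    for a, b in links:
        endpoint[(a, b)] = 0.0  # endpoint T_ab on element a
        endpoint[(b, a)] = 0.0  # endpoint T_ba on element b
    return central, endpoint

central, endpoint = init_activations(
    elements=["E1", "E2", "E3"],
    links=[("E1", "E2"), ("E1", "E3")],
    proper_nouns=["E2"],
)
print(central)
print(endpoint)
```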

【００６６】ステップＦ４２においては、制御部１１
は、文書を構成するエレメントＥｉを計数するカウンタ
の初期化をおこなう。すなわち、エレメントを計数する
カウンタのカウント値ｉを１に設定する。ｉ＝１の場
合、このカウンタは、第１番目のエレメント（例えば図
７のエレメントＥ１）を参照することになる。In step F42, the control unit 11 initializes a counter that counts the elements Ei making up the document. That is, the count value i of the element counter is set to 1. When i = 1, the counter refers to the first element (for example, element E1 in FIG. 7).

【００６７】ステップＦ４３においては、制御部１１
は、カウンタが参照するエレメントについて、新たな中
心活性値を計算する中心活性値更新処理を実行する。こ
の中心活性値更新処理について、エレメントＥ１につい
ての処理を例に挙げながら、図９で詳しく説明する。こ
の中心活性値更新処理は、エレメントについての端点活
性値を更新し、さらに更新された端点活性値と現在の中
心活性値を用いて、新たな中心活性値を算出する処理と
なる。In step F43, the control unit 11 executes central activation value update processing, which calculates a new central activation value for the element referenced by the counter. This processing is described in detail with reference to FIG. 9, taking the processing for element E1 as an example. The central activation value update processing updates the endpoint activation values of the element and then calculates a new central activation value from the updated endpoint activation values and the current central activation value.

【００６８】図９のステップＦ５１では、制御部１１
は、文書を構成するエレメントＥｉ（例えばこの場合Ｅ
１）に一端が接続されたリンクの数を計数するカウンタ
の初期化をおこなう。すなわち、リンクを計数するカウ
ンタのカウント値ｊを１に設定する。ｊ＝１の場合、こ
のカウンタは、エレメントＥｉと接続された第１番目の
リンクＬ（yy）を参照することになる。図７の例では、
エレメントＥ１についての第１のリンクとして例えばリ
ンクＬ１２を参照する。In step F51 of FIG. 9, the control unit 11 initializes a counter that counts the links having one end connected to the element Ei that makes up the document (E1 in this case). That is, the count value j of the link counter is set to 1. When j = 1, the counter refers to the first link L(yy) connected to the element Ei. In the example of FIG. 7, link L12, for example, is referred to as the first link of element E1.

【００６９】ステップＦ５２で制御部１１は、参照中の
リンク、つまりエレメントＥ１とＥ２を接続するリンク
Ｌ１２について、関係属性のタグを参照することにより
通常リンクであるか否かを判断する。制御部１１は、リ
ンクＬ１２が通常リンクであればステップＦ５３に、一
方リンクＬ１２が参照リンクであればステップＦ５４に
処理を進める。In step F52, the control unit 11 determines whether the link being referred to, that is, link L12 connecting elements E1 and E2, is a normal link by referring to its relation attribute tag. The control unit 11 proceeds to step F53 if link L12 is a normal link and to step F54 if link L12 is a reference link.

【００７０】リンクＬ１２が通常リンクと判断されてス
テップＦ５３に進んだ場合は、制御部１１は、エレメン
トＥ１の通常リンクＬ１２に接続された端点Ｔ１２の新
たな端点活性値を計算する処理をおこなう。端点Ｔ１２
の端点活性値ｔ１２は、リンク先のエレメントＥ２の端
点活性値のうち、リンクＬ１２以外のリンクに接続する
すべての端点の各端点活性値（この場合Ｔ２６、Ｔ２
７、Ｔ２８の各端点活性値ｔ２６、ｔ２７，ｔ２８）
と、エレメントＥ２の中心活性値ｅ２を加算し、この加
算で得た値を、文書に含まれるエレメントの総数で除す
ることにより求められる。制御部１１は、この様な演算
を、ＲＡＭ１４から読み出した各端点活性値および各中
心活性値を用いて行うことで、通常リンクと接続された
端点についての新たな端点活性値を算出し、算出した端
点活性値を、ＲＡＭ１４に記憶させる。つまり端点Ｔ１
２の端点活性値ｔ１２を更新する。If link L12 is determined to be a normal link and the process proceeds to step F53, the control unit 11 calculates a new endpoint activation value for the endpoint T12 of element E1, which is connected to the normal link L12. The endpoint activation value t12 of the endpoint T12 is obtained by adding the central activation value e2 of the link-destination element E2 to the endpoint activation values of all endpoints of E2 that connect to links other than L12 (in this case, the endpoint activation values t26, t27, and t28 of T26, T27, and T28), and dividing the resulting sum by the total number of elements in the document. The control unit 11 performs this calculation using the endpoint activation values and central activation values read from the RAM 14, thereby computing a new endpoint activation value for the endpoint connected to the normal link, and stores the computed value in the RAM 14. That is, the endpoint activation value t12 of the endpoint T12 is updated.
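The normal-link calculation of step F53 reduces to one line of arithmetic. The sketch below is illustrative only: the function name and the numeric values for e2, t26, t27, t28, and the element count are made up for the example.

```python
def normal_link_endpoint(e_target, other_endpoints, total_elements):
    """Step F53: endpoint activation for a normal link.

    e_target:        central activation value of the linked element (e2)
    other_endpoints: endpoint activations of the linked element's other links (t26, t27, t28)
    total_elements:  total number of elements in the document
    """
    return (e_target + sum(other_endpoints)) / total_elements

# Hypothetical values: e2 = 2.0, t26 = 0.5, t27 = 0.25, t28 = 0.25, 8 elements in the document.
t12 = normal_link_endpoint(2.0, [0.5, 0.25, 0.25], total_elements=8)
print(t12)  # (2.0 + 1.0) / 8
```

Dividing by the element count keeps the contribution of any single normal link small relative to the activation already held by the element.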

【００７１】一方、ステップＦ５２でリンクＬ１２が参
照リンクであると判断され、ステップＦ５４に進んだ場
合は、同じく制御部１１は、通常リンクＬ１２に接続さ
れたエレメントＥ１の端点Ｔ１２の新たな端点活性値を
計算する処理をおこなうことになるが、端点活性値の算
出のための演算は次のようになる。即ちこの場合は、端
点Ｔ１２の端点活性値ｔ１２は、リンク先のエレメント
Ｅ２の端点活性値のうち、リンクＬ１２以外のリンクに
接続するすべての端点の各端点活性値（この場合Ｔ２
６、Ｔ２７、Ｔ２８の各端点活性値ｔ２６、ｔ２７，ｔ
２８）と、エレメントＥ２の中心活性値ｅ２を加算した
値とする。（つまり除算がない点が上記通常リンクの場
合と異なるものとなる）そして制御部１１は、この様な演算を、ＲＡＭ１４から
読み出した各端点活性値および各中心活性値を用いて行
うことで、参照リンクと接続された端点についての新た
な端点活性値を算出し、算出した端点活性値を、ＲＡＭ
１４に記憶させる。つまり端点Ｔ１２の端点活性値ｔ１
２を更新する。On the other hand, if link L12 is determined in step F52 to be a reference link and the process proceeds to step F54, the control unit 11 likewise calculates a new endpoint activation value for the endpoint T12 of element E1 connected to link L12, but the calculation is as follows. In this case, the endpoint activation value t12 of the endpoint T12 is the sum of the central activation value e2 of the link-destination element E2 and the endpoint activation values of all endpoints of E2 that connect to links other than L12 (in this case, the endpoint activation values t26, t27, and t28 of T26, T27, and T28). (That is, it differs from the normal link case in that there is no division.) The control unit 11 performs this calculation using the endpoint activation values and central activation values read from the RAM 14, thereby computing a new endpoint activation value for the endpoint connected to the reference link, and stores the computed value in the RAM 14. That is, the endpoint activation value t12 of the endpoint T12 is updated.
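The branch taken in step F52 and the two calculations of steps F53 and F54 can be put side by side. This is a sketch with a hypothetical function name and made-up input values; the only difference between the two link kinds is whether the sum is divided by the total number of elements.

```python
def endpoint_activation(e_target, other_endpoints, total_elements, is_reference):
    """Return the new endpoint activation value for one link of an element."""
    total = e_target + sum(other_endpoints)
    if is_reference:               # step F54: reference link, the sum is used as is
        return total
    return total / total_elements  # step F53: normal link, the sum is divided

# Same hypothetical inputs for both cases: e2 = 2.0, t26..t28 = 0.5, 0.25, 0.25, 8 elements.
reference_t12 = endpoint_activation(2.0, [0.5, 0.25, 0.25], 8, is_reference=True)
normal_t12 = endpoint_activation(2.0, [0.5, 0.25, 0.25], 8, is_reference=False)
print(reference_t12, normal_t12)
```

With these inputs a reference link passes on eight times as much activation as a normal link, which shows how strongly coreference ties pull related elements together under this rule.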

【００７２】このようなステップＦ５３又はＦ５４の処
理を行なったら、制御部１１はステップＦ５５での判別
処理を介して（判別結果がＮＯであれば）ステップＦ５
７に進み、カウント値ｊをインクリメントしてステップ
Ｆ５２に戻る。即ち続いて、カウント値ｊ＝２とされる
ことにより、エレメントＥ１についての第２のリンク
（例えばリンクＬ１３）が参照されることになるため、
上記同様にステップＦ５２以降の処理でリンクＬ１３に
接続される端点Ｔ１３の端点活性値ｔ１３が算出／更新
されることになる。After performing step F53 or F54, the control unit 11 passes through the determination in step F55 and, if the result is NO, proceeds to step F57, increments the count value j, and returns to step F52. With the count value j = 2, the second link of element E1 (for example, link L13) is then referred to, so the endpoint activation value t13 of the endpoint T13 connected to link L13 is calculated and updated by the processing from step F52 onward in the same way as above.

【００７３】ステップＦ５５では、制御部１１は、現在
カウント値ｉで参照中のエレメントＥｉ（Ｅ１）につい
て、全てのリンクについての新たな端点活性値が計算さ
れたか否かを判別して処理を分岐するものであるため、
端点活性値の更新処理は、参照中のエレメントＥｉの全
ての端点活性値が更新されるまで行われる。つまりステ
ップＦ５７でカウント値ｊがインクリメントされながら
処理が繰り返されることで、例えばエレメントＥ１につ
いては、端点Ｔ１２，Ｔ１３，Ｔ１４，Ｔ１５について
それぞれ端点活性値ｔ１２，ｔ１３，ｔ１４，ｔ１５が
更新されていき、その全てが更新された時点で、処理は
ステップＦ５５からＦ５６に進むことになる。In step F55, the control unit 11 determines whether new endpoint activation values have been calculated for all links of the element Ei (E1) currently referenced by the count value i, and branches accordingly. The endpoint activation value update therefore continues until all endpoint activation values of the referenced element Ei have been updated. That is, as the processing is repeated with the count value j incremented in step F57, the endpoint activation values t12, t13, t14, and t15 of the endpoints T12, T13, T14, and T15 of element E1, for example, are updated in turn, and once all of them have been updated the process proceeds from step F55 to F56.

【００７４】エレメントＥｉについての全ての端点活性
値が求められたことに応じて、ステップＦ５６では、更
新された端点活性値を用いて、エレメントＥｉの新たな
中心活性値ｅｉを算出する。エレメントＥｉの新たな中
心活性値ｅｉは、エレメントＥｉの現在の中心活性値ｅ
ｉとエレメントＥｉのすべての端点の新たな端点活性値
の和で求められる。例えば図７のエレメントＥ１の場合
は、新たな中心活性値ｅ１(new)は、ｅ１(new)＝ｅ１＋ｔ１２＋ｔ１３＋ｔ１４＋ｔ１５となる。Once all endpoint activation values for element Ei have been obtained, step F56 calculates a new central activation value ei for element Ei using the updated endpoint activation values. The new central activation value ei of element Ei is the sum of the current central activation value ei of element Ei and the new endpoint activation values of all endpoints of element Ei. For element E1 in FIG. 7, for example, the new central activation value e1(new) is e1(new) = e1 + t12 + t13 + t14 + t15.
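The whole of FIG. 9 (steps F51 to F57) for a single element can be sketched as one function. This is an illustration under assumptions, not the patent's code: the dictionary layout, the function name, and the three-element example graph are hypothetical, and the sketch reads whatever values are currently stored, which is one reading of "read from the RAM 14".

```python
def update_element(name, central, endpoint, links, total_elements):
    """Update the endpoint activations of one element and return its new central value.

    central[e]       central activation value of element e
    endpoint[(a, b)] activation of the endpoint on a's side of link a-b
    links[e]         list of (neighbor, is_reference) pairs for element e
    """
    for neighbor, is_reference in links[name]:        # F51, F55, F57: loop over the links
        # Endpoints of the neighbor that belong to links other than the one back to `name`.
        others = sum(v for (src, dst), v in endpoint.items()
                     if src == neighbor and dst != name)
        value = central[neighbor] + others
        if not is_reference:                          # F53: normal link divides by the count
            value /= total_elements
        endpoint[(name, neighbor)] = value            # F53/F54: store the new endpoint value
    # F56: current central value plus all of the element's new endpoint values.
    return central[name] + sum(endpoint[(name, nb)] for nb, _ in links[name])

# Hypothetical three-element document: E1-E2 is a normal link, E1-E3 a reference link.
central = {"E1": 1.0, "E2": 3.0, "E3": 4.0}
links = {
    "E1": [("E2", False), ("E3", True)],
    "E2": [("E1", False)],
    "E3": [("E1", True)],
}
endpoint = {("E1", "E2"): 0.0, ("E2", "E1"): 0.0, ("E1", "E3"): 0.0, ("E3", "E1"): 0.0}

new_e1 = update_element("E1", central, endpoint, links, total_elements=len(central))
print(new_e1)  # 1.0 + 3.0 / 3 + 4.0
```

The function returns the new value instead of overwriting `central[name]`, so the caller can keep the old value for the later comparison of old and new central activation values.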

【００７５】制御部１１は、このようにして現在カウン
ト値ｉで参照中のエレメントＥｉの中心活性値ｅｉを算
出する。そして、制御部１１は、計算した新たな中心活
性値ｅｉをＲＡＭ１４に記憶させる。つまりエレメント
Ｅｉの中心活性値ｅｉを更新する。（但しこの時点で
は、後述するステップＦ４５の処理で用いるため、旧中
心活性値も保持しておく）In this way, the control unit 11 calculates the central activation value ei of the element Ei currently referenced by the count value i. The control unit 11 then stores the newly calculated central activation value ei in the RAM 14. That is, the central activation value ei of element Ei is updated. (At this point, however, the old central activation value is also retained, because it is used in the processing of step F45 described later.)

【００７６】図８のステップＦ４３の中心活性値更新処
理として、以上図９に示したような処理が行われるた
ら、制御部１１の処理は図８のステップＦ４４に進み、
制御部１１は、文書中のすべてのエレメントについて中
心活性値更新処理が完了したか否かを判断する。具体的
には、制御部１１は、カウント値ｉが、文書に含まれる
エレメントの総数に達したか否かを判断する。制御部１
１は、すべてのエレメントについて中心活性値更新処理
が完了していないときは、ステップＦ４７に処理を進
め、カウント値ｉをインクリメントしてステップＦ４３
に戻る。例えば上記のようにエレメントＥ１についての
処理が終わった後であれば、カウント値ｉ＝２とされ
て、今度はエレメントＥ２が参照されることになる。そ
してエレメントＥ２について、ステップＦ４３の中心活
性値更新処理（即ち図９の処理）が上記同様に行われ
る。重複説明となるため詳細は述べないが、図７のリン
ク例でいえば、エレメントＥ２の場合は、図９の処理に
おいて端点Ｔ２１，Ｔ２６，Ｔ２７，Ｔ２８の各端点活
性値ｔ２１，ｔ２６，ｔ２７，ｔ２８が更新された後、
新たな中心活性値ｅ２(new)が、ｅ２(new)＝ｅ２＋ｔ２１＋ｔ２６＋ｔ２７＋ｔ２８として算出され、更新されることになる。When the processing shown in FIG. 9 has been performed as the central activation value update processing of step F43 in FIG. 8, the control unit 11 proceeds to step F44 in FIG. 8 and determines whether the central activation value update processing has been completed for all elements in the document. Specifically, the control unit 11 determines whether the count value i has reached the total number of elements in the document. If the update processing has not been completed for all elements, the control unit 11 proceeds to step F47, increments the count value i, and returns to step F43. For example, after the processing for element E1 described above has finished, the count value i is set to 2 and element E2 is now referred to. The central activation value update processing of step F43 (that is, the processing of FIG. 9) is then performed for element E2 in the same way. The details are omitted to avoid repetition, but in the link example of FIG. 7, for element E2 the endpoint activation values t21, t26, t27, and t28 of the endpoints T21, T26, T27, and T28 are updated in the processing of FIG. 9, after which the new central activation value e2(new) is calculated as e2(new) = e2 + t21 + t26 + t27 + t28 and updated.

【００７７】図８の処理においては、このようにステッ
プＦ４７でカウント値ｉがインクリメントされて参照エ
レメントが変更されながらステップＦ４３の中心活性値
更新処理が繰り返されることで、文書に含まれる全ての
エレメントの中心活性値が更新されていくことになる。In the processing of FIG. 8, the central activation value update processing of step F43 is thus repeated while the count value i is incremented in step F47 and the referenced element changes, so that the central activation values of all elements in the document are updated.

【００７８】文書中のすべてのエレメントについて中心
活性値の更新が完了したときは、処理はステップＦ４４
からＦ４５に進むことになる。ステップＦ４５において
は、制御部１１は、文書に含まれるすべてのエレメント
の中心活性値の変化分、すなわち新たに計算された中心
活性値の元の中心活性値に対する変化分について平均値
を計算する。例えば制御部１１は、ＲＡＭ１４に記憶さ
れた旧中心活性値と、更新した新たな中心活性値を、文
書に含まれるすべてのエレメントについて読み出す。そ
して各エレメントについて新中心活性値と旧中心活性値
の差分を求め、その差分の総和をエレメントの総数で除
することにより、すべてのエレメントの中心活性値の変
化分の平均値を計算する。制御部１１は、このように計
算したすべてのエレメントの中心活性値の変化分の平均
値を、たとえばＲＡＭ１４に記憶させる。When the central activation values of all elements in the document have been updated, the process proceeds from step F44 to F45. In step F45, the control unit 11 calculates the average of the changes in the central activation values of all elements in the document, that is, the changes of the newly calculated central activation values relative to the original ones. For example, the control unit 11 reads the old central activation values stored in the RAM 14 and the updated new central activation values for all elements in the document. It then obtains the difference between the new and old central activation values for each element and divides the sum of the differences by the total number of elements to calculate the average change in the central activation values of all elements. The control unit 11 stores this average in, for example, the RAM 14.

【００７９】続いてステップＦ４６において制御部１１
は、ステップＦ４５で計算した平均値が、あらかじめ設
定された閾値以内であるか否かを判断する。そして、制
御部１１は、上記平均値が閾値以内である場合は、活性
拡散処理としての一連の行程を終了するが、上記平均値
が閾値以内でないときには、ステップＦ４２にもどっ
て、上述した一連の行程を再び実行する。Subsequently, in step F46, the control unit 11 determines whether the average value calculated in step F45 is within a preset threshold. If the average value is within the threshold, the control unit 11 ends the series of steps of the activation diffusion processing; if it is not, the control unit 11 returns to step F42 and executes the series of steps again.
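
The loop formed by steps F42 to F46 might look like the sketch below. Several things here are assumptions rather than part of the description: the per-pass update rule of FIG. 9 is not reproduced, so it is passed in as a caller-supplied function `update_pass`; the comparison uses the absolute value of the average change; and `max_passes` is a safety limit added for illustration.

```python
def diffuse_until_stable(values, update_pass, threshold, max_passes=100):
    """Repeat activation diffusion passes until the average change is within threshold."""
    passes = 0
    while passes < max_passes:  # max_passes is an illustrative safety limit
        new_values = update_pass(values)  # one full pass over all elements
        passes += 1
        # Step F45: average change between new and old central activation values.
        mean_change = sum(n - o for o, n in zip(values, new_values)) / len(values)
        values = new_values
        # Step F46: stop once the average change is within the threshold.
        if abs(mean_change) <= threshold:
            break
    return values, passes


def toy_pass(vs):
    # Toy stand-in for the FIG. 9 update: move each value halfway toward 10.
    return [v + (10 - v) / 2 for v in vs]


final_values, n_passes = diffuse_until_stable([0.0, 10.0], toy_pass, threshold=0.5)
print(final_values, n_passes)
```

Stopping on the size of the change, rather than after a fixed number of passes, is what lets the processing end before all values flatten out to nearly the same level.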

【００８０】この一連の活性拡散処理は、中心活性値が
高いエレメントに関連のある（リンクする）エレメント
について、その中心活性値を引き上げていく処理といえ
るものである。ところが、この活性拡散を１回行うのみ
では、インデックス作成処理の目的を考えたときに、本
来中心活性値を引き上げられるべきエレメントの中で、
中心活性値が十分に引き上げられないものが発生する場
合もありうる。例えば、１回の活性拡散では、中心活性
値の初期値が高く設定されたエレメントに直接リンクす
るエレメントについては、或る程度中心活性値が引き上
げられるが、直接リンクしていないエレメントは、それ
がインデックスとして重要なエレメントであっても十分
に中心活性値が引き上げられないことが生ずる。そこ
で、ステップＦ４６の判断を介して、必要に応じて活性
拡散処理を複数回行うようにすることで、全体的に中心
活性値が収束されるようにし、中心活性値が引き上げら
れない重要なエレメントがなるべく生じないようにする
ものである。なお、複数回の活性拡散で、全体的に中心
活性値が収束されていくのは、活性拡散処理で更新され
た各エレメントの中心活性値に基づいて、さらに次の活
性拡散処理で各エレメントの中心活性値が更新されてい
くためである。但し、このような活性拡散処理が多数回
行われすぎると、全エレメントの中心活性値が収束しき
ってほぼ同値となるような事態となり、不適切である。
このため、ステップＦ４５，Ｆ４６の処理として、中心
活性値の変化分の平均値を求めるように、その変化分に
基づいて活性拡散処理の終了タイミングを判断すること
で、インデックス作成に好適な活性拡散が実現されるこ
とになる。This series of activation diffusion processing can be described as raising the central activation values of elements that are related (linked) to elements with high central activation values. If the activation diffusion is performed only once, however, some elements whose central activation values ought to be raised, given the purpose of index creation, may not be raised sufficiently. For example, a single pass raises to some extent the central activation values of elements directly linked to an element whose initial central activation value is set high, but elements that are not directly linked may not be raised sufficiently even if they are important for the index. Therefore, through the determination of step F46, the activation diffusion processing is performed multiple times as needed so that the central activation values converge overall and important elements whose central activation values are not raised occur as rarely as possible. The central activation values converge overall over multiple passes because each pass updates the central activation value of each element based on the values updated in the preceding pass. If the activation diffusion processing is repeated too many times, however, the central activation values of all elements converge completely to nearly the same value, which is inappropriate. For this reason, steps F45 and F46 obtain the average change in the central activation values and decide when to end the activation diffusion processing based on that change, which realizes activation diffusion suitable for index creation.

【００８１】以上の図８、図９のような活性拡散処理
（即ち図６のステップＦ３１）が完了したら、制御部１
１の処理は図６のステップＦ３２に進むことになる。ス
テップＦ３２においては、制御部１１は、ステップＦ３
１で得られた各エレメントの中心活性値に基づいて、中
心活性値があらかじめ設定された閾値を超えるエレメン
トを抽出する。制御部１１は、このように抽出したエレ
メントをＲＡＭ１４に記憶させる。When the activation diffusion processing of FIGS. 8 and 9 (that is, step F31 of FIG. 6) is completed, the processing of the control unit 11 proceeds to step F32 of FIG. 6. In step F32, the control unit 11 extracts, based on the central activation value of each element obtained in step F31, the elements whose central activation value exceeds a preset threshold. The control unit 11 stores the extracted elements in the RAM 14.
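
As a minimal sketch of the extraction in step F32, assuming the central activation values are held in a mapping from element name to value (the names and numbers below are hypothetical):

```python
def extract_salient(activations, threshold):
    """Return the elements whose central activation value exceeds the threshold."""
    # "Exceeds" is read as a strict comparison.
    return [name for name, value in activations.items() if value > threshold]


# Hypothetical central activation values after diffusion.
activations = {"E1": 2.5, "E2": 0.4, "E3": 1.0}
print(extract_salient(activations, threshold=1.0))  # ['E1']
```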

【００８２】続いてステップＦ３３においては、制御部
１１は、ステップＦ３２にて抽出したエレメントをたと
えばＲＡＭ１４から読み出す。そして制御部１１は、こ
の抽出したエレメントの中からすべての固有名詞を取り
出してインデックスに加える。固有名詞は語義を持た
ず、辞書に載っていないなどの特殊の性質を有するので
固有名詞以外の語とは別に扱うものである。なお語義と
は、前述したように、語の有する複数の意味のうちの各
意味に対応したものである。各エレメントが固有名詞で
あるか否かは、文書に付されたタグに基づいて判断する
ことができる。たとえば、図４に示したタグ付けによる
内部構造においては、“Ａ氏”、“Ｂ会”および“Ｃ
市”は、タグによる関係属性がそれぞれ“人名”、“組
織名”および“地名”であるので固有名詞であることが
分かる。そして、制御部１１は、取り出した固有名詞を
インデックスに加え、その結果をＲＡＭ１４に記憶させ
る。Subsequently, in step F33, the control unit 11 reads out the elements extracted in step F32 from, for example, the RAM 14. The control unit 11 then takes all proper nouns out of the extracted elements and adds them to the index. Proper nouns have special properties, such as having no word meaning and not appearing in dictionaries, so they are handled separately from other words. As described above, a word meaning corresponds to one of the several meanings that a word can have. Whether an element is a proper noun can be determined from the tags attached to the document. For example, in the tagged internal structure shown in FIG. 4, "Mr. A", "B Association", and "C City" are known to be proper nouns because their relational attributes given by the tags are "person name", "organization name", and "place name", respectively. The control unit 11 adds the extracted proper nouns to the index and stores the result in the RAM 14.

【００８３】次のステップＦ３４においては、制御部１
１は、ステップＦ３２にて抽出したエレメントの中か
ら、固有名詞以外の語義を取り出してインデックスに加
え、その結果をＲＡＭ１４に記憶させる。In the next step F34, the control unit 11 takes the word meanings other than proper nouns out of the elements extracted in step F32, adds them to the index, and stores the result in the RAM 14.

【００８４】以上の処理により、例えば上記した具体例
のようなインデックスが生成される。即ちインデックス
は、タグ付けされた文書の特徴を発見して、その特徴を
配列したものとなり、その文書の特徴は、文書の内部構
造に応じて拡散処理された中心活性値に基づいて判断さ
れるものとなる。そしてこのようなインデックスは、文
書を代表するような特徴を表す語義および固有名詞を含
むので、所望の文書を参照する際に用いることができ
る。なお、インデックスには、文書の特徴を表す語義お
よび固有名詞とともに、その文書がＲＡＭ１４（又はＨ
ＤＤ３４）において記憶された位置を示す文書アドレス
を含めておく。The above processing generates an index such as the specific example given earlier. That is, the index is an arrangement of the features found in a tagged document, and those features are judged from the central activation values diffused according to the internal structure of the document. Because such an index contains the word meanings and proper nouns that represent the document's characteristic features, it can be used to look up a desired document. In addition to these word meanings and proper nouns, the index includes a document address indicating where the document is stored in the RAM 14 (or the HDD 34).
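
Steps F33 and F34 together with the document address can be pictured with the sketch below. The dictionary layout of an element (`text`, `relation`, `sense_id`) and the function name are assumptions made for illustration; only the three tag attributes and the split into proper nouns, word meanings, and a document address come from the description.

```python
# Tag attributes that mark an element as a proper noun (from the FIG. 4 example).
PROPER_NOUN_TAGS = {"person name", "organization name", "place name"}


def build_index(elements, address):
    """Build an index from the extracted elements of one document."""
    index = {"proper_nouns": [], "senses": [], "address": address}
    for element in elements:
        if element.get("relation") in PROPER_NOUN_TAGS:
            # Step F33: proper nouns are added as the words themselves.
            index["proper_nouns"].append(element["text"])
        elif "sense_id" in element:
            # Step F34: other words are added by word meaning (sense ID).
            index["senses"].append(element["sense_id"])
    return index


elements = [
    {"text": "Mr. A", "relation": "person name"},
    {"text": "baseball", "sense_id": 4546},
    {"text": "C City", "relation": "place name"},
]
print(build_index(elements, address="SP1"))
```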

【００８５】４−３文書閲覧／分類作成／分類操作以上の図６〜図９で説明したインデックス作成処理は図
５のステップＦ１２で行われるものとなる。従って図５
の手動分類処理としては、続いてステップＦ１３，Ｆ１
４の処理、即ち上述したようにユーザーによる閲覧及び
手動分類の処理に移る。
4-3 Document Browsing / Classification Creation / Classification Operation
The index creation processing described above with reference to FIGS. 6 to 9 is performed in step F12 of FIG. 5. The manual classification processing of FIG. 5 therefore continues with steps F13 and F14, that is, browsing and manual classification by the user as described above.

【００８６】上述のように、図５のステップＦ１３にお
いては、ユーザーは表示部３０に表示される文書を閲覧
することができる。またステップＦ１４においては、ユ
ーザーが分類項目を設定する操作や、文書データを、設
定された分類項目に振り分けていく操作を行うことがで
きる。このステップＦ１３，Ｆ１４で行われる操作や、
それに対応する制御部１１の処理及び表示部３０の表示
例は以下のようになる。As described above, in step F13 of FIG. 5 the user can browse the documents displayed on the display unit 30. In step F14, the user can set classification items and sort document data into the classification items that have been set. The operations performed in steps F13 and F14, the corresponding processing of the control unit 11, and display examples on the display unit 30 are as follows.

【００８７】図１０、図１１は表示部３０における表示
の具体例を示している。まず図１０は、詳しくは後述す
る分類モデルに対応した分類ウインドウ２０１の表示例
である。即ち、文書分類の表示に用いられるグラフィッ
クユーザインターフェース（graphic user interface；
GUI）の具体例となる。この分類ウィンドウ２０１に
は、操作用のボタン表示２０２として、画面のウィンド
ウの状態を初期の位置にもどすポジションリセット（po
sition reset）ボタン２０２ａと、文書の内容を閲読す
るブラウザ（browser）を呼び出すブラウザボタン２０
２ｂと、このウィンドウからの脱出（exit）ボタン２０
２ｃとが表示される。FIGS. 10 and 11 show specific examples of the display on the display unit 30. FIG. 10 is a display example of a classification window 201 corresponding to a classification model described in detail later; that is, it is a specific example of the graphical user interface (GUI) used to display the document classification. As the operation button display 202, the classification window 201 shows a position reset button 202a that returns the windows on the screen to their initial positions, a browser button 202b that calls a browser for reading the contents of a document, and an exit button 202c for leaving this window.

【００８８】また、この分類ウィンドウ３０１には、分
類モデルに対応する分類項目に応じた小ウインドウとし
て、文書分類エリア２０３，２０４，２０５・・・が形
成される。文書分類エリア２０３は、“他のトピック
ス”を表示するエリアとされる。この”他のトピック
ス”の文書分類エリア２０３は、まだ分類されていない
文書が提示される領域となる。例えば図５のステップＦ
１１で受信された各文書（つまりこれから分類しようと
する文書）は、この”他のトピックス”の文書分類エリ
ア２０３に提示される。文書分類エリア２０４は、例え
ば”ビジネスニュース”に分類された文書が提示される
領域となる。文書分類エリア２０５は、例えば”政治ニ
ュース”に分類された文書が提示される領域となる。こ
れら以外にも、図中で符号を付していない文書分類エリ
アは、それぞれ特定の分類項目に応じた文書が提示され
る領域となる。In the classification window 201, document classification areas 203, 204, 205, ... are formed as small windows, one for each classification item of the classification model. The document classification area 203 displays "other topics" and is the area in which documents that have not yet been classified are presented. For example, each document received in step F11 of FIG. 5 (that is, each document about to be classified) is presented in the "other topics" document classification area 203. The document classification area 204 presents documents classified as, for example, "business news", and the document classification area 205 presents documents classified as, for example, "political news". The remaining document classification areas, which carry no reference numerals in the figure, likewise present the documents of their respective classification items.

【００８９】これらの各文書分類エリア２０３，２０４
・・・では、その各文書分類エリアに設定された分類項
目（カテゴリ）に分類された文書が、その文書のアイコ
ンと文書のタイトルにより提示される。タイトルがない
場合には、一文の要約が表示される。また各文書分類エ
リア２０３，２０４・・・の大きさは固定的ではなく、
ユーザーがドラッグ操作などにより各文書分類エリアを
区切る区切枠２１１，２１２，２１３・・・を移動させ
ることにより、各文書分類エリア２０３，２０４・・・
の面積を任意に変更させることができる。文書分類エリ
アの数もユーザーが任意に増減できる。In each of the document classification areas 203, 204, ..., the documents classified into the classification item (category) set for that area are presented by a document icon and the document title. If a document has no title, a one-sentence summary is displayed instead. The sizes of the document classification areas 203, 204, ... are not fixed: the user can change the area of each one as desired by moving the partition frames 211, 212, 213, ... that separate the areas, for example by dragging. The user can also increase or decrease the number of document classification areas as desired.

【００９０】また各文書分類エリア２０３，２０４・・
・のタイトル（例えば「政治ニュース」など）は、ユー
ザーが任意に設定、変更できるものである。なお、この
文書分類エリアの数及び各タイトルは、後述する分類モ
デルの分類項目に応じたものとなる。言い換えれば、ユ
ーザーがこの分類ウインドウ２０１においてマウスやキ
ーボード等による入力部２０からの操作で、文書分類エ
リアの設定や削除、或いはタイトル設定を行うことで、
分類モデルの分類項目の数やタイトルが設定されること
になる。The title of each of the document classification areas 203, 204, ... (for example, "political news") can be set and changed by the user as desired. The number of document classification areas and their titles correspond to the classification items of the classification model described later. In other words, when the user sets or deletes a document classification area or sets a title in the classification window 201 by operating the input unit 20 with a mouse, keyboard, or the like, the number and titles of the classification items of the classification model are set accordingly.

【００９１】図１１は、ユーザーが文書データの内容を
閲覧する閲覧ウインドウ３０１の例を示している。例え
ばユーザーが、図１０の分類ウインドウ２０１において
或る文書をクリックして選択した状態としたうえで、ブ
ラウザボタン２０２ｂをクリックすることで、制御部１
１は図１１のように選択された文書を表示する閲覧ウイ
ンドウ３０１を開くようにする。FIG. 11 shows an example of a browsing window 301 in which the user browses the contents of document data. For example, when the user clicks a document in the classification window 201 of FIG. 10 to select it and then clicks the browser button 202b, the control unit 11 opens the browsing window 301, which displays the selected document as shown in FIG. 11.

【００９２】この閲覧ウインドウ３０１には、文書デー
タファイルのファイル名を表示するファイル名表示部３
０２、そのファイル名の文書データを表示する文書表示
部３０３、文書表示部３０３に表示された文書の要約文
を表示する要約表示部３０４、キーワードの入力／表示
を行うキーワード表示部３０５が設けられる。また操作
用のボタン表示３０６として、要約文の作成を指示する
ための要約作成ボタン３０６ａ、アンドゥ操作（操作取
消）を行うためのアンドゥボタン３０６ｂ、読み上げ動
作を実行させるための読み上げボタン３０６ｃなどが表
示される。The browsing window 301 is provided with a file name display section 302 that displays the file name of the document data file, a document display section 303 that displays the document data of that file, a summary display section 304 that displays a summary of the document shown in the document display section 303, and a keyword display section 305 for entering and displaying keywords. As the operation button display 306, the window shows a summary creation button 306a for instructing creation of a summary, an undo button 306b for undoing an operation, a read-aloud button 306c for starting the read-aloud operation, and so on.

【００９３】この様な閲覧ウインドウ３０１において、
ユーザーは文書表示部３０３に表示される文書を閲覧す
ることができる。なお、文書の全体を表示しきれないと
きは、文書の一部が表示される。もちろんスクロール操
作を行うことで、全文を閲覧できる。また、ユーザーは
要約作成ボタン３０６ａをクリックすることで、文書表
示部３０３に表示される文書についての要約文を作成さ
せ、要約表示部３０４に表示させることができる。な
お、要約文作成のための制御部１１の処理については後
述する。さらにユーザーは、読み上げボタン３０６ｃを
クリックすることで、文書表示部３０３に表示されてい
る文書の本文又は要約文についての読み上げを実行させ
ることができる。この読み上げ動作についても後述す
る。In the browsing window 301, the user can browse the document displayed in the document display section 303. If the whole document does not fit, part of it is displayed, and the full text can be read by scrolling. By clicking the summary creation button 306a, the user can have a summary of the document displayed in the document display section 303 created and shown in the summary display section 304. The processing of the control unit 11 for creating the summary is described later. By clicking the read-aloud button 306c, the user can also have the body text or the summary of the document displayed in the document display section 303 read aloud. This read-aloud operation is also described later.

【００９４】以上のような分類ウインドウ２０１、閲覧
ウインドウ３０１は、図５の手動分類処理の際に限ら
ず、ユーザーの操作に応じて随時表示部２０に表示され
るものであるが、図５の手動分類処理に関していえば、
ユーザーは受信した文書の種類や内容を、分類ウインド
ウ２０１、閲覧ウインドウ３０１で確認することができ
るものである。具体的には、図５のステップＦ１１で受
信された１又は複数の文書は、ステップＦ１２でのイン
デックス作成処理の後、図１０のような分類ウインドウ
２０１における”他のトピックス”の文書分類エリア２
０３に表示される。この分類ウインドウ２０１におい
て、ユーザーは、文書分類エリア２０３に表示された各
文書を手動で分類していくことになるが、例えば文書の
タイトルだけ等では内容がわからない場合は、図１１の
閲覧ウインドウ３０１により文書内容を確認する。その
ようにユーザの必要に応じて行われる閲覧が図５のステ
ップＦ１３の処理となる。The classification window 201 and the browsing window 301 described above are displayed on the display unit 30 at any time in response to user operations, not only during the manual classification processing of FIG. 5. In the manual classification processing of FIG. 5, they let the user check the type and contents of the received documents. Specifically, the one or more documents received in step F11 of FIG. 5 are displayed, after the index creation processing of step F12, in the "other topics" document classification area 203 of the classification window 201 shown in FIG. 10. In the classification window 201, the user manually classifies each document displayed in the document classification area 203; when the contents cannot be determined from, for example, the title alone, the user checks them in the browsing window 301 of FIG. 11. Browsing performed as needed by the user in this way constitutes the processing of step F13 of FIG. 5.

【００９５】ステップＦ１４としては、ユーザーは分類
ウインドウ２０１上において分類項目の追加、更新、削
除等を任意に行うことができ、その操作に応じて、制御
部１１は表示される文書分類エリア２０３、２０４・・
・の表示態様（数、面積、タイトル等）を変更させてい
く。なお、ユーザーによる分類項目（文書分類エリアの
タイトル）の設定／変更は、それが後述する分類モデル
に反映されることになる。In step F14, the user can add, update, or delete classification items as desired in the classification window 201, and the control unit 11 changes the display of the document classification areas 203, 204, ... (their number, area, titles, and so on) in response. Classification items (titles of document classification areas) set or changed by the user are reflected in the classification model described later.

【００９６】ユーザーは必要に応じて分類項目の設定を
行った後、文書分類エリア２０３に表示されている各文
書を、各文書分類エリアに振り分けていく。つまりユー
ザーの手動により、文書を分類する。具体的には、”他
のトピックス”の文書分類エリア２０３に表示されてい
る文書のアイコンを、例えば入力部２０のマウスを用
い、所望の分類項目（カテゴリ）に対応する文書分類エ
リアにドラッグすることによりおこなう。例えばユーザ
ーは、「スポーツ」というタイトルの文書分類エリアを
設定したうえで、”他のトピックス”の文書分類エリア
２０３に表示されているスポーツ関連の文書のアイコン
を、“スポーツ”の文書分類エリアにドラッグするよう
な操作を行う。このようにして手動で分類された各文書
のアイコンやタイトルは、以降、そのドラッグされた先
の文書分類エリア内で表示される。After setting classification items as needed, the user sorts the documents displayed in the document classification area 203 into the individual document classification areas; that is, the user classifies the documents manually. Specifically, the user drags the icon of a document displayed in the "other topics" document classification area 203 to the document classification area of the desired classification item (category), using, for example, the mouse of the input unit 20. For example, the user sets up a document classification area titled "sports" and then drags the icons of sports-related documents displayed in the "other topics" document classification area 203 to the "sports" document classification area. From then on, the icon and title of each manually classified document are displayed in the document classification area to which it was dragged.

【００９７】４−４分類モデル作成／登録以上のようにユーザーによる手動分類操作が行われた
ら、制御部１１は図５のステップＦ１５において、ユー
ザの分類操作に基づいた複数のカテゴリからなる分類モ
デルを作成する。すなわち制御部１１は、各カテゴリに
分類された上記複数の文書のインデックスを集めて、分
類モデルを生成する。そして、分類モデルの各カテゴリ
に上記複数の文書を分類する。
4-4 Creation / Registration of Classification Model
When the user has performed the manual classification operation as described above, the control unit 11 creates, in step F15 of FIG. 5, a classification model made up of a plurality of categories based on the user's classification operation. That is, the control unit 11 collects the indexes of the documents classified into each category and generates the classification model. The documents are then classified into the categories of the classification model.

【００９８】分類モデルは、文書を分類する複数の分類
項目（カテゴリ）から構成される。そして各カテゴリに
ついて、分類された文書が示されるデータ形態となる。
各文書については、上記ステップＦ１２などでインデッ
クスが形成されるが、分類モデルは例えば図１２（ａ）
に示すように、各カテゴリについて分類された文書のイ
ンデックスが対応づけられたようなデータ構造となる。
この図１２（ａ）では、カテゴリとして「スポーツ」
「会社」「コンピュータ」・・・等が設定されている
が、これらは上記のように分類ウインドウ２０１におい
てユーザーが設定した分類項目となる。なお、もちろん
ユーザーが設定しなくとも、予め設定されている（つま
り分類ウインドウで文書分類エリアとして表示される）
カテゴリがあってもよい。そして各分類項目にはインデ
ックスＩＤＸ１、ＩＤＸ２・・・が対応づけられるが、
即ち各分類項目には、ユーザーが上記のように分類した
文書のインデックスが対応づけられるものとなる。The classification model consists of a plurality of classification items (categories) into which documents are classified, and its data form indicates, for each category, the documents classified into it. An index is formed for each document in step F12 and the like, and the classification model has a data structure in which, as shown for example in FIG. 12(a), each category is associated with the indexes of the documents classified into it. In FIG. 12(a), the categories "sports", "company", "computer", and so on are set; these are the classification items set by the user in the classification window 201 as described above. Of course, there may also be categories that are set in advance without any user setting (that is, categories displayed as document classification areas in the classification window). Indexes IDX1, IDX2, ... are associated with the classification items; that is, each classification item is associated with the indexes of the documents that the user classified into it as described above.

【００９９】各分類項目に対応づけられるインデックス
は、分類ウインドウ２０１においてその分類項目の文書
分類エリアに表示されている文書のインデックスであ
る。例えばインデックスＩＤＸ１がカテゴリ「スポー
ツ」に対応づけられているのは、ユーザーが、分類ウイ
ンドウ２０１において「スポーツ」をタイトルとする文
書分類エリアを作成し、さらにインデックスＩＤＸ１の
文書のアイコンを、その「スポーツ」をタイトルとする
文書分類エリアにドラッグするという手動分類を行った
ことに基づくものとなる。The indexes associated with a classification item are the indexes of the documents displayed in the document classification area of that classification item in the classification window 201. For example, the index IDX1 is associated with the category "sports" because the user created a document classification area titled "sports" in the classification window 201 and then manually classified the document of index IDX1 by dragging its icon to that "sports" document classification area.

【０１００】ところで上述のように各文書のインデック
スは、固有名詞、固有名詞以外の語義や文書アドレス等
を含んでいる。そして、例えば図１２（ａ）のように１
つの分類項目には１又は複数のインデックスが対応づけ
られるが、インデックスとして固有名詞、語義、文書ア
ドレス等が含まれるため、分類モデルは図１２（ｂ）の
ようにも表すことができる。As described above, the index of each document contains proper nouns, word meanings other than proper nouns, a document address, and so on. One or more indexes are associated with each classification item as shown, for example, in FIG. 12(a), and because each index contains proper nouns, word meanings, a document address, and so on, the classification model can also be represented as shown in FIG. 12(b).

【０１０１】即ち図１２（ｂ）に示すように、分類モデ
ルは、各カテゴリに対応するカテゴリインデックスとし
て、固有名詞、固有名詞以外の語義、文書アドレスの欄
を有する構造となる。そして分類モデルにおいては、各
カテゴリ「スポーツ」「社会」「コンピュータ」「植
物」「美術」「イベント」に対して、固有名詞“Ａ氏、
・・・”、“Ｂ氏、・・・”、“Ｃ社、Ｇ社、・・
・”、“Ｄ種、・・・”、“Ｅ氏、・・・”および“Ｆ
氏”等の固有名詞が割り当てられる。また、“野球（４
５４６）、グランド（２３４３）、・・・”、“労働
（３１１２）、固有（９８２１）、・・・”、“モバイ
ル（２１０２）、・・・”、“桜１（１１１１１）、オ
レンジ１（９９１１）”、“桜２（１１１１２）、オレ
ンジ２（９９１２）”および“桜３（１１１１３）”等
の語義も各カテゴリに割り当てられる。さらに文書アド
レス“ＳＰ１、ＳＰ２、ＳＰ３、・・・”、“Ｓ０１、
Ｓ０２、Ｓ０３、・・・”、“ＣＯ１、ＣＯ２、ＣＯ
３、・・・”、“ＰＬ１、ＰＬ２、ＰＬ３、・・・”、
“ＡＲ１、ＡＲ２、ＡＲ３、・・・”および“ＥＶ１、
ＥＶ２、ＥＶ３、・・・”も各カテゴリに割り当てられ
る。That is, as shown in FIG. 12(b), the classification model has a structure with columns for proper nouns, word meanings other than proper nouns, and document addresses as the category index corresponding to each category. In the classification model, the categories "sports", "society", "computer", "plant", "art", and "event" are assigned the proper nouns "Mr. A, ...", "Mr. B, ...", "Company C, Company G, ...", "Species D, ...", "Mr. E, ...", and "Mr. F", respectively. They are likewise assigned the word meanings "baseball (4546), ground (2343), ...", "labor (3112), unique (9821), ...", "mobile (2102), ...", "cherry blossom 1 (11111), orange 1 (9911)", "cherry blossom 2 (11112), orange 2 (9912)", and "cherry blossom 3 (11113)", and the document addresses "SP1, SP2, SP3, ...", "S01, S02, S03, ...", "CO1, CO2, CO3, ...", "PL1, PL2, PL3, ...", "AR1, AR2, AR3, ...", and "EV1, EV2, EV3, ...", respectively.
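
One way to picture the structure of FIG. 12(b) is the sketch below, in which each category holds the proper nouns, word meanings (as sense IDs), and document addresses gathered from the indexes of its documents. The class and method names are hypothetical, and the choice of sets and lists is an assumption made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CategoryIndex:
    """Category index: the three columns of FIG. 12(b) for one category."""
    proper_nouns: set = field(default_factory=set)
    senses: set = field(default_factory=set)
    addresses: list = field(default_factory=list)


@dataclass
class ClassificationModel:
    categories: dict = field(default_factory=dict)

    def add_document(self, category, index):
        """Merge one document's index into the named category."""
        entry = self.categories.setdefault(category, CategoryIndex())
        entry.proper_nouns.update(index["proper_nouns"])
        entry.senses.update(index["senses"])
        entry.addresses.append(index["address"])


model = ClassificationModel()
model.add_document(
    "sports",
    {"proper_nouns": ["Mr. A"], "senses": [4546, 2343], "address": "SP1"},
)
print(model.categories["sports"])
```

Storing sense IDs rather than surface words keeps, for example, the two meanings of "orange" apart within a category.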

【０１０２】なお、“桜１”“桜２”“桜３”は、
“桜”の第１の語義（１１１１１）、第２の語義（１１
１１２）、第３の語義（１１１１３）を示している。ま
た、“オレンジ１”“オレンジ２”は、“オレンジ”の
第１の語義（９９１１）、第２の語義（９９１２）を示
している。たとえば“オレンジ１”は植物のオレンジを
表し、“オレンジ２”はオレンジ色を表す。固有名詞以
外の場合に語そのものではなく語義を用いるのは、この
様に、同じ語でも複数の意味を有することがあるからで
ある。Here, "cherry blossom 1", "cherry blossom 2", and "cherry blossom 3" denote the first word meaning (11111), second word meaning (11112), and third word meaning (11113) of "cherry blossom" (sakura). Similarly, "orange 1" and "orange 2" denote the first word meaning (9911) and second word meaning (9912) of "orange"; for example, "orange 1" is the orange plant and "orange 2" is the color orange. Word meanings rather than the words themselves are used for everything other than proper nouns because, as this shows, the same word can have several meanings.

【０１０３】図５のステップＦ１５では、ユーザーの手
動分類操作に応じて例えばこの様な分類モデルが生成さ
れる。そしてステップＦ１６として分類モデルが登録、
即ちＲＡＭ１５（又はＨＤＤ３４）に記録される。この
ように分類モデルが生成／登録されることにより、文書
の分類が行われたことになる。In step F15 of FIG. 5, a classification model such as this is generated in response to the user's manual classification operation. In step F16, the classification model is registered, that is, recorded in the RAM 14 (or the HDD 34). Generating and registering the classification model in this way completes the classification of the documents.

【０１０４】なお、このように図５におけるステップＦ
１５、Ｆ１６として分類モデルの作成／登録が行われた
後は、後述する自動分類処理や、ユーザーの分類項目の
編集、或いは手動分類操作などに応じて、分類モデルは
逐次更新されていくことになる。分類モデルが更新され
ると、分類モデルに更新日時が記録される。図１２に
は、更新日時として“１９９８年１２月１０日１９時５
６分１０秒”が記録されている。After the classification model has been created and registered in steps F15 and F16 of FIG. 5, it is updated successively in response to the automatic classification processing described later, the user's editing of classification items, manual classification operations, and so on. Whenever the classification model is updated, the update date and time are recorded in it. In FIG. 12, "December 10, 1998, 19:56:10" is recorded as the update date and time.

【０１０５】５．文書データに対する自動分類処理５−１処理手順本例の文書処理装置１では、上記のように一旦分類モデ
ルが作成された後は、例えば通信部２１により外部から
取り込まれた文書データを、自動的に分類していく自動
分類処理が可能となる。即ち以下説明する自動分類処理
とは、文書処理装置１が外部から送られた文書データを
受信した際に、その文書データを分類モデルに対して分
類していく処理となる。なお、この例では、一つの文書
を受信する毎に以下説明する自動分類処理をおこなうこ
ととするか、複数の所定数の文書を受信する度におこな
ってもよいし、ユーザが図９の画面を開く操作をしたと
きにそれまでに受信した全文書に対して自動分類処理を
おこなうようにしてもよい。
5. Automatic Classification Processing for Document Data
5-1 Processing Procedure
Once a classification model has been created as described above, the document processing apparatus 1 of this example can perform automatic classification processing, in which document data taken in from outside by, for example, the communication unit 21 is classified automatically. That is, the automatic classification processing described below classifies document data against the classification model when the document processing apparatus 1 receives the document data from outside. In this example, the automatic classification processing is performed each time one document is received, but it may instead be performed each time a predetermined number of documents have been received, or it may be performed on all documents received so far when the user opens the screen of FIG. 9.

【０１０６】自動分類処理としての全体の処理手順を図
１３に示す。図１３のステップＦ２１は、文書処理装置
１の受信部２１による文書受信処理を示している。この
ステップＦ２１では、受信部２１は、たとえば通信回線
を介して送信された１又は複数の文書を受信する。受信
部２１は、受信した文書を文書処理装置の本体１０に送
る。制御部１１は供給された１又は複数の文書データを
ＲＡＭ１４又はＨＤＤ３４に格納する。FIG. 13 shows the overall procedure of the automatic classification processing. Step F21 of FIG. 13 is document reception by the receiving unit 21 of the document processing apparatus 1. In step F21, the receiving unit 21 receives one or more documents transmitted over, for example, a communication line, and sends them to the main body 10 of the document processing apparatus. The control unit 11 stores the supplied document data in the RAM 14 or the HDD 34.

【０１０７】続いてステップＦ２２に進み、制御部１１
は、ステップＦ２１で取り込まれた文書についてインデ
ックスを作成する。The process then proceeds to step F22, in which the control unit 11 creates an index for each document taken in at step F21.

【０１０８】ステップＦ２３では、制御部１１は、分類
モデルに基づいて、インデックスを付された各文書を、
分類モデルのいずれかのカテゴリに自動分類する。そし
て、制御部１１は、分類の結果をたとえばＲＡＭ１４に
記憶させる。自動分類の詳細については後述する。In step F23, the control unit 11 automatically classifies each indexed document into one of the categories of the classification model, based on the classification model. The control unit 11 stores the classification result in, for example, the RAM 14. The automatic classification is described in detail later.

【０１０９】ステップＦ２４では、制御部１１は、ステ
ップＦ２３での新たな文書の自動分類の結果に基づい
て、分類モデルを更新する。そしてステップＦ２５で
は、制御部１１は、ステップＦ２４で更新された分類モ
デルを登録する。例えば分類モデルをＲＡＭ１４に記憶
させる。In step F24, the control unit 11 updates the classification model based on the result of the automatic classification of a new document in step F23. Then, in step F25, the control unit 11 registers the classification model updated in step F24. For example, the classification model is stored in the RAM 14.

【０１１０】以上の図１３の処理により、文書処理状態
１に入力された文書データが、分類モデル上で分類され
るように自動分類処理が行われることになる。すなわち
この自動分類処理においては、受信した文書に対しては
インデックスが作成され、さらに自動分類が行われた
後、そのインデックスを構成している固有名詞、語義、
文書アドレス等が、上記図１２のように分類モデル上で
或るカテゴリーに対応づけられることになる（分類モデ
ルが更新される）。Through the processing of FIG. 13 described above, document data input to the document processing apparatus 1 is automatically classified on the classification model. That is, in this automatic classification processing, an index is created for each received document, automatic classification is performed, and the proper nouns, word meanings, document address, and so on that make up the index are then associated with a category on the classification model as in FIG. 12 (the classification model is updated).

【０１１１】ステップＦ２１、Ｆ２２の処理は、上述し
た手動分類処理におけるステップＦ１１，Ｆ１２と同様
である。即ちステップＦ２２のインデックス作成処理と
しては、図６〜図９で説明した処理が行われるものであ
り、ここでの繰り返しの説明は避ける。また、ステップ
Ｆ２４の分類モデルの更新は、ステップＦ２３の自動分
類の分類結果に応じてものとなる。以下、上述の手動分
類処理とは異なる処理として、ステップＦ２３の自動分
類について詳細に説明する。The processes in steps F21 and F22 are the same as steps F11 and F12 in the above-described manual classification process. That is, as the index creation processing in step F22, the processing described with reference to FIGS. 6 to 9 is performed, and the repeated description will be omitted. The update of the classification model in step F24 depends on the classification result of the automatic classification in step F23. Hereinafter, automatic classification in step F23 will be described in detail as processing different from the above-described manual classification processing.

【０１１２】５−２自動分類図１３のステップＦ２３での自動分類の詳しい処理を図
１４に示す。図１４のステップＦ６１では、制御部１１
は、分類モデルのカテゴリＣｉに含まれる固有名詞の集
合と、ステップＦ２１で受信した文書から抽出されイン
デックスに入れられた語のうちの固有名詞の集合とにつ
いて、これらの共通集合の数をＰ（Ｃｉ）とする。そし
て制御部１１は、このようにして算出した数Ｐ（Ｃｉ）
をＲＡＭ１４に記憶させる。
5-2 Automatic Classification
FIG. 14 shows the detailed processing of the automatic classification in step F23 of FIG. 13. In step F61 of FIG. 14, the control unit 11 takes the set of proper nouns contained in category Ci of the classification model and the set of proper nouns among the words extracted from the document received in step F21 and placed in its index, and sets P(Ci) to the number of proper nouns common to both sets. The control unit 11 stores the calculated number P(Ci) in the RAM 14.
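
The count P(Ci) of step F61 is the size of a set intersection, which can be sketched as follows (the function name and sample proper nouns are hypothetical):

```python
def shared_proper_nouns(category_nouns, document_nouns):
    """P(Ci): number of proper nouns common to category Ci and the document index."""
    return len(set(category_nouns) & set(document_nouns))


category_nouns = {"Mr. A", "Company C", "Company G"}
document_nouns = {"Company C", "Mr. F", "Company G"}
print(shared_proper_nouns(category_nouns, document_nouns))  # 2
```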

【０１１３】ステップＦ６２においては、制御部１１
は、その文書のインデックス中に含まれる全語義と、各
カテゴリＣｉに含まれる全語義との語義間関連度を、後
述する図１６に示す語義間関連度の表を参照して、語義
間関連度の総和Ｒ（Ｃｉ）を演算する。すなわち制御部
１１は、分類モデルにおける固有名詞以外の語につい
て、全語義間関連度の総和Ｒ（Ｃｉ）を演算する。そし
て制御部１１は、演算した語義間関連度の総和Ｒ（Ｃ
ｉ）をＲＡＭ１４に記憶させる。In step F62, the control unit 11 looks up, in the table of relatedness between word meanings shown in FIG. 16 and described later, the relatedness between every word meaning contained in the index of the document and every word meaning contained in each category Ci, and calculates the sum R(Ci) of these relatedness values. That is, the control unit 11 calculates, for the words other than proper nouns in the classification model, the sum R(Ci) of the relatedness between all word meanings. The control unit 11 stores the calculated sum R(Ci) in the RAM 14.
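
The sum R(Ci) of step F62 can be sketched as a double loop over word meanings with a table lookup. The table layout is an assumption made for illustration: pairs of sense IDs are treated as unordered, pairs missing from the table contribute 0, and the relatedness values shown are invented.

```python
def total_relatedness(document_senses, category_senses, table):
    """R(Ci): sum of relatedness over all (document sense, category sense) pairs."""
    total = 0.0
    for doc_sense in document_senses:
        for cat_sense in category_senses:
            # Unordered pair lookup; pairs absent from the table count as 0.
            total += table.get(frozenset((doc_sense, cat_sense)), 0.0)
    return total


# Hypothetical relatedness table keyed by unordered pairs of sense IDs.
table = {
    frozenset((4546, 2343)): 0.5,
    frozenset((4546, 9911)): 0.25,
}
print(total_relatedness([4546], [2343, 9911, 3112], table))  # 0.75
```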

【０１１４】ここで語義間関連度について説明してお
く。語義間関連度は、図１５の処理により文書処理装置
１が備える電子辞書に含まれる語義について予め算出
し、その結果を図１６のように保持しておけばよい。つ
まり、制御部１１が予め一度だけ図１５の処理を実行し
ておくようにすることで、図１４の自動分類処理の際に
用いることができる。Here, the degree of association between meanings will be described. The degree of association between meanings may be calculated in advance for the meanings included in the electronic dictionary provided in the document processing apparatus 1 by the processing in FIG. 15 and the result may be stored as shown in FIG. That is, the control unit 11 executes the process of FIG. 15 only once in advance, so that it can be used in the automatic classification process of FIG.

【０１１５】制御部１１が予め実行しておく図１５の処
理は次のようになる。まずステップＦ７１において、制
御部１１は、電子辞書内の語の語義の説明を用いて、こ
の辞書を使って語義のネットワークを作成する。すなわ
ち、辞書における各語義の説明とこの説明中に現れる語
義との参照関係から、語義のネットワークを作成する。
ネットワークの内部構造は、上述したようなタグ付けに
より記述される。文書処理装置の制御部１１は、たとえ
ばＲＡＭ１４に記憶された電子辞書について、語義とそ
の説明を順に読み出して、ネットワークを作成する。制
御部１４は、このようにして作成した語義のネットワー
クをＲＡＭ１４に記憶させる。The processing of FIG. 15 which is executed in advance by the control unit 11 is as follows. First, in step F71, the control unit 11 uses the description of the meaning of a word in the electronic dictionary and creates a meaning network using this dictionary. That is, a meaning network is created from the reference relation between the explanation of each meaning in the dictionary and the meaning appearing in the explanation.
The internal structure of the network is described by tagging as described above. For example, the control unit 11 of the document processing apparatus sequentially reads the meanings and their explanations from the electronic dictionary stored in the RAM 14 and creates the network. The control unit 11 then stores the meaning network created in this manner in the RAM 14.

【０１１６】なお、上記ネットワークは、文書処理装置
の制御部１１が辞書を用いて作成する他に、受信部２１
にて外部から受信したリ、記録／再生部３１にて記録媒
体３２から再生したりすることにより得ることもでき
る。また上記電子辞書は、受信部２１にて外部から受信
したり、記録／再生部３１にて記録媒体３２から再生し
たりすることにより得ることができる。Besides being created by the control unit 11 of the document processing apparatus using a dictionary, the above network can also be obtained by receiving it from outside via the receiving unit 21 or by reproducing it from the recording medium 32 via the recording/reproducing unit 31. Likewise, the electronic dictionary can be obtained by receiving it from outside via the receiving unit 21 or by reproducing it from the recording medium 32 via the recording/reproducing unit 31.

【０１１７】ステップＦ７２においては、ステップＦ７
１で作成された語義のネットワーク上で、各語義のエレ
メントに対応する中心活性値の拡散処理をおこなう。こ
の活性拡散により、各語義に対応する中心活性値は、上
記辞書により与えられたタグ付けによる内部構造に応じ
て与えられる。中心活性値の拡散処理は、図８で説明し
た処理となる。In step F72, diffusion processing of the central activity value corresponding to the element of each meaning is performed on the meaning network created in step F71. Through this activity diffusion, the central activity value corresponding to each meaning is given according to the internal structure, expressed by tagging, that the dictionary provides. The diffusion processing of the central activity value is the processing described with reference to FIG. 8.

【０１１８】ステップＦ７３においては、ステップＦ７
１で作成された語義のネットワークを構成するある一つ
の語義Ｓｉを選択し、続くステップＦ７４においては、
この語義Ｓｉに対応する語彙エレメントＥｉの中心活性
値ｅｉの初期値を変化させ、このときの中心活性値の差
分△ｅｉを計算する。In step F73, one meaning Si constituting the meaning network created in step F71 is selected. In the subsequent step F74, the initial value of the central activity value ei of the vocabulary element Ei corresponding to this meaning Si is changed, and the resulting difference Δei in the central activity value is calculated.

【０１１９】さらにステップＦ７５においては、ステッ
プＦ７４におけるエレメントＥｉの中心活性値ｅｉの差
分△ｅｉに対応する、他の語義Ｓｊに対応するエレメン
トＥｊの中心活性値ｅｊの差分△ｅｊを求める。ステッ
プＦ７６においては、ステップＦ７５で求めた差分△ｅ
ｊを、ステップＦ７４で求めた△ｅｉで除した商△ｅｊ
／△ｅｉを、語義Ｓｉの語義ｓｊに対する語義間関連度
とする。Further, in step F75, the difference Δej in the central activity value ej of the element Ej corresponding to another meaning Sj, which results from the difference Δei in the central activity value ei of the element Ei in step F74, is obtained. In step F76, the quotient Δej/Δei, obtained by dividing the difference Δej found in step F75 by the difference Δei found in step F74, is taken as the degree of association of the meaning Si with respect to the meaning Sj.

【０１２０】ステップＦ７７においては、一の語義Ｓｉ
と他の語義Ｓｊとのすべての対について語義間関連度の
演算が終了したか否かについて判断する。すべての語義
の対について語義間関連度の演算が終了していないとき
には、ステップＦ７３にもどり、語義間関連度の演算が
終了していない対について語義間関連度の演算を継続す
る。このようなステップＦ７３からステップＦ７７のル
ープにおいて、制御部１１は、必要な値をたとえばＲＡ
Ｍ１４から順に読み出して、上述したように語義間関連
度を計算する。制御部１１は、計算した語義間関連度を
たとえばＲＡＭ１４に順に記憶させる。そして、すべて
の語義の対について語義間関連度の演算が終了したとき
には、ステップＦ７７から、この一連の処理を終了す
る。In step F77, it is determined whether the calculation of the degree of association between meanings has been completed for all pairs of one meaning Si and another meaning Sj. If it has not been completed for all pairs, the process returns to step F73 and the calculation continues for the pairs that remain. In the loop from step F73 to step F77, the control unit 11 sequentially reads the necessary values from, for example, the RAM 14 and calculates the degree of association between meanings as described above. The control unit 11 sequentially stores the calculated degrees of association in, for example, the RAM 14. When the calculation has been completed for all pairs of meanings, the series of processing ends at step F77.
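
The loop of steps F73 to F77 can be sketched in Python as below. The function `spread` is only a toy stand-in for the activity diffusion of FIG. 8: it uses a simple averaging update and not the endpoint-based update rule of the apparatus. The graph layout, the starting value of 1.0, the parameter names, and the perturbation size are likewise assumptions for illustration. What the sketch does follow from the text is the definition of the degree of association as the quotient Δej/Δei.

```python
def spread(initial, links, weight=0.5, rounds=20):
    """Toy stand-in for activity diffusion (NOT the update rule of FIG. 8).

    Each round, a node's value becomes its initial value plus `weight`
    times the mean of its neighbours' current values.
    """
    values = dict(initial)
    for _ in range(rounds):
        nxt = {}
        for node, base in initial.items():
            nbrs = links.get(node, ())
            mean = sum(values[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
            nxt[node] = base + weight * mean
        values = nxt
    return values


def relatedness(links, senses, si, sj, delta=1.0):
    """Steps F74 to F76: perturb the initial value of si and return d_ej / d_ei."""
    base = {s: 1.0 for s in senses}
    bumped = dict(base)
    bumped[si] += delta                      # F74: change the initial value of ei
    before = spread(base, links)
    after = spread(bumped, links)
    d_ei = after[si] - before[si]            # F74: resulting difference in ei
    d_ej = after[sj] - before[sj]            # F75: resulting difference in ej
    return d_ej / d_ei                       # F76: quotient as degree of association


def relatedness_table(links, senses):
    """Loop F73 to F77: one value for every ordered pair of distinct meanings."""
    return {(si, sj): relatedness(links, senses, si, sj)
            for si in senses for sj in senses if si != sj}
```

With a chain of links a, b, c and an unlinked meaning d, the value for (a, b) comes out larger than the value for (a, c), and the value for (a, d) is zero, which matches the intuition that directly linked meanings are more strongly associated.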

【０１２１】このような語義間関連度の算出は、或る１
つの語義の中心活性値を変化させた時に、それにつられ
て中心活性値が変化する語義を、関連度が高いものとす
る処理といえる。つまりステップＦ７４で或る語義の中
心活性値を変化させると、それに応じて関連する（リン
クされた）語義の中心活性値が変化するものとなるた
め、その変化の度合いを調べれば、或る語義に対する他
の各語義の関連度がわかるものである。（或るエレメン
トＥｉの中心活性値は、上述した活性拡散の説明におい
て述べたように、リンク先のエレメントの中心活性値と
端点活性値が反映されて、そのエレメントＥｉ端点活性
値が更新されたうえで、そのエレメントＥｉの端点活性
値と現在の中心活性値の和から求められるため、リンク
先との関連度が大きいほど中心活性値の変化量は大きく
なる）このような処理を各語義から他の全ての語義に対して行
っていくことで、すべての語義の対（組み合わせ）につ
いて、関連度を算出することができる。This calculation of the degree of association between meanings can be described as follows: when the central activity value of one meaning is changed, the meanings whose central activity values change along with it are regarded as highly related. That is, when the central activity value of a certain meaning is changed in step F74, the central activity values of the related (linked) meanings change accordingly, so examining the degree of that change reveals how strongly each other meaning is related to that meaning. (As stated in the description of activity diffusion above, the central activity value of an element Ei is obtained as follows: the endpoint activity value of Ei is updated to reflect the central activity value and endpoint activity value of the linked element, and the central activity value is then found from the sum of the updated endpoint activity value of Ei and its current central activity value. Consequently, the stronger the association with the link destination, the larger the change in the central activity value.) By performing this processing from each meaning to all other meanings, the degree of association can be calculated for every pair (combination) of meanings.

【０１２２】このように計算された語義間関連度は、図
１６に示すように、それぞれの語義と語義の間に定義さ
れる。この図１６の表においては、語義間関連度は０か
ら１までの値をとるように正規化されている。そしてこ
の表においては一例として“コンピュータ”、“テレ
ビ”、“ＶＴＲ”の間の相互の語義間関連度が示されて
いる。“コンピュータ”と“テレビ”の語義間関連度は
０．５５、“コンピュータ” と“ＶＴＲ”の語義間関
連度は０．２５、“テレビ”と“ＶＴＲ”の語義間関連
度は０．６０である。The degree of association between meanings calculated in this way is defined between each pair of meanings, as shown in FIG. 16. In the table of FIG. 16, the degree of association is normalized to take a value from 0 to 1. As an example, the table shows the mutual degrees of association among "computer", "television", and "VTR": 0.55 between "computer" and "television", 0.25 between "computer" and "VTR", and 0.60 between "television" and "VTR".

【０１２３】以上のように予め算出されていた語義間関
連度を用いて図１４のステップＦ６２の処理が行われた
ら、続いて制御部１１は、ステップＦ６３として、カテ
ゴリＣｉに対する文書の文書分類間関連度Ｒｅｌ（Ｃ
ｉ）をＲｅｌ（Ｃｉ）＝ｍ１Ｐ（Ｃｉ）＋ｎ１Ｒ（Ｃｉ）として算出する。ここで、係数ｍ１、ｎ１は定数で、そ
れぞれの値の文書分類間関連度への寄与の度合いを表す
ものである。制御部１１は、ステップＦ６１で算出した
共通集合の数Ｐ（Ｃｉ）およびステップＦ６２で算出し
た語義間関連度の総和Ｒ（Ｃｉ）を用いて、上記式の演
算を行い、文書分類間関連度Ｒｅｌ（Ｃｉ）を算出す
る。制御部１１は、このように算出した文書分類間関連
度Ｒｅｌ（Ｃｉ）をＲＡＭ１４に記憶させる。After the processing of step F62 in FIG. 14 has been performed using the degrees of association between meanings calculated in advance as described above, the control unit 11, in step F63, calculates the inter-document-class relevance Rel(Ci) of the document with respect to the category Ci as Rel(Ci) = m1 P(Ci) + n1 R(Ci). Here, the coefficients m1 and n1 are constants that represent how much each term contributes to the inter-document-class relevance. The control unit 11 evaluates this expression using the number P(Ci) of common proper nouns calculated in step F61 and the total sum R(Ci) of the degrees of association between meanings calculated in step F62, and thereby obtains Rel(Ci). The control unit 11 stores the inter-document-class relevance Rel(Ci) calculated in this way in the RAM 14.

【０１２４】なお、これらの係数ｍ１、ｎ１の値として
は、たとえばｍ１＝１０、ｎ１＝１とすることができ
る。また係数ｍ１、ｎ１の値は、統計的手法を使って推
定することもできる。すなわち、制御部１１は、複数の
係数ｍおよびｎの対について文書分類間関連度Ｒｅｌ
（Ｃｉ）が与えられることで、上記係数を最適化により
求めることができる。These coefficients can be set, for example, to m1 = 10 and n1 = 1. The values of m1 and n1 can also be estimated using a statistical method. That is, given the inter-document-class relevance Rel(Ci) for a plurality of pairs of coefficients m and n, the control unit 11 can obtain the coefficients by optimization.
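
One possible reading of the statistical estimation mentioned above is an ordinary least-squares fit. The Python sketch below assumes a set of training samples, each giving P(Ci), R(Ci), and a desired relevance value. The training data, the function name, and the choice of least squares are all assumptions for illustration, since the text does not specify the optimization method.

```python
def fit_coefficients(samples):
    """Least-squares estimate of (m1, n1) in Rel = m1 * P + n1 * R.

    samples: iterable of (P, R, target_rel) triples (hypothetical training data).
    Solves the 2x2 normal equations directly.
    """
    a = b = c = u = v = 0.0
    for p, r, y in samples:
        a += p * p      # sum of P^2
        b += p * r      # sum of P*R
        c += r * r      # sum of R^2
        u += p * y      # sum of P*target
        v += r * y      # sum of R*target
    det = a * c - b * b
    if det == 0:
        raise ValueError("samples do not determine m1 and n1 uniquely")
    m1 = (u * c - v * b) / det
    n1 = (a * v - b * u) / det
    return m1, n1
```

When the samples are generated exactly from m1 = 10 and n1 = 1, the fit recovers those values up to floating-point error.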

【０１２５】ステップＦ６４においては、制御部１１
は、カテゴリＣｉに対する文書分類間関連度Ｒｅｌ（Ｃ
ｉ）が最大で、その文書分類間関連度Ｒｅｌ（Ｃｉ）の
値がある閾値を越えているとき、そのカテゴリＣｉに文
書を分類する。すなわち制御部１１は、複数のカテゴリ
に対してそれぞれ文書分類間関連度を作成し、最大の文
書分類間関連度が閣値を越えているときには、文書を最
大の文書分類間関連度を有する上記カテゴリＣｉに分類
する。これにより文書が自動的に所要のカテゴリに分類
されることになる。なお最大の文書分類間関連度が閾値
を越えていないときには、文書の分類はおこなわない。In step F64, when the inter-document-class relevance Rel(Ci) for a category Ci is the maximum and its value exceeds a certain threshold, the control unit 11 classifies the document into that category Ci. That is, the control unit 11 calculates the inter-document-class relevance for each of the plurality of categories and, when the maximum relevance exceeds the threshold, classifies the document into the category Ci that has the maximum relevance. The document is thereby automatically classified into the appropriate category. When the maximum inter-document-class relevance does not exceed the threshold, the document is not classified.
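
Steps F63 and F64 together amount to a weighted sum followed by a thresholded maximum. The following Python sketch shows this under stated assumptions: the dictionary layout, the function name, and the threshold values are hypothetical, while the default coefficients m1 = 10 and n1 = 1 are the example values given in the text.

```python
def classify(scores, m1=10.0, n1=1.0, threshold=0.0):
    """Return the category with the highest Rel(Ci), or None if it does not
    exceed the threshold.

    scores: {category name: (P, R)} with P from step F61 and R from step F62.
    """
    if not scores:
        return None
    # F63: Rel(Ci) = m1 * P(Ci) + n1 * R(Ci)
    rel = {cat: m1 * p + n1 * r for cat, (p, r) in scores.items()}
    # F64: take the maximum and classify only if it exceeds the threshold
    best = max(rel, key=rel.get)
    return best if rel[best] > threshold else None
```

For example, with `{"business": (2, 3.5), "sports": (0, 4.0)}` the relevance values are 23.5 and 4.0, so the document goes to "business" when the threshold is 5.0 and is left unclassified when the threshold is 30.0.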

【０１２６】以上のような図１４の処理として、図１３
のステップＦ２３の自動分類が行われたら、ステップＦ
２４、Ｆ２５で、それに応じて分類モデルを更新し、登
録することで、一連の自動分類が完了する。即ち文書処
理装置１に受信された文書データは、自動的に分類され
たことになり、ユーザーは例えば図１０の分類ウインド
ウ２０１において、所要の文書分類エリアにおいて、受
信された文書データを確認できることになる。Once the automatic classification of step F23 in FIG. 13 has been performed through the processing of FIG. 14 described above, the classification model is updated and registered accordingly in steps F24 and F25, which completes the series of automatic classification. That is, the document data received by the document processing apparatus 1 has been classified automatically, and the user can confirm the received document data in the relevant document classification area of, for example, the classification window 201 of FIG. 10.

【０１２７】６．要約作成／表示処理続いて、文書データについての要約文を作成し、表示出
力する処理について述べる。上述したようにユーザー
は、文書を選択して図１１のような閲覧ウインドウ３０
１を開くことにより、文書の本文を閲覧することができ
る。例えば上述した手動分類処理におけるステップＦ１
３の時点や、その他任意の時点において、図１０で説明
した分類ウインドウ２０１から、閲覧ウインドウ３０１
を開くことができる。6. Summary Creation / Display Process Next, the process of creating a summary sentence for document data and displaying it will be described. As described above, the user can browse the text of a document by selecting it and opening the browsing window 301 shown in FIG. 11. For example, the browsing window 301 can be opened from the classification window 201 described with reference to FIG. 10 at step F13 of the manual classification process described above, or at any other time.

【０１２８】例えば分類ウインドウ２０１において或る
文書を選択した状態でブラウザボタン２０２ｂをクリッ
クすることで、図１７のように、文書表示部３０３に選
択された文書の本文が表示された閲覧ウインドウ３０１
が開かれる。なお文書表示部３０３に文書全文が表示で
きないときには、その文書の一部が表示される。また要
約文が作成されていない時点では、図１７のように要約
表示部３０４は空白とされる。For example, when the browser button 202b is clicked while a certain document is selected in the classification window 201, the browsing window 301 opens with the text of the selected document displayed in the document display unit 303, as shown in FIG. 17. When the entire text of the document cannot be displayed in the document display unit 303, a part of the document is displayed. While no summary sentence has yet been created, the summary display unit 304 is blank, as shown in FIG. 17.

【０１２９】この閲覧ウインドウ３０１において要約作
成ボタン３０６ａがクリックされると、文書表示部３０
３に表示されている文書についての要約文が作成され、
図１８に示すように要約表示部３０４に表示される。つ
まり制御部１１は、ユーザーの要約作成操作に応じて、
以下説明するような要約文作成処理を行い、作成後、そ
れを表示する制御を行うものとなる。文書から要約を作
成する処理は、文書のタグ付けによる内部構造に基づい
て実行される。なお要約文は、要約表示部３０４のサイ
ズに応じて生成される。そして本文表示部３０３と要約
表示部３０４の面積は、ユーザーが仕切枠３１２を移動
させることで変化させることができる。つまり要約文
は、要約作成が指示された時点での要約表示部３０４の
サイズに応じたサイズ（文書長）で作成されることにな
る。When the summary creation button 306a is clicked in the browsing window 301, a summary sentence for the document displayed in the document display unit 303 is created and displayed in the summary display unit 304, as shown in FIG. 18. That is, in response to the user's summary creation operation, the control unit 11 performs the summary creation process described below and then controls the display of the result. The process of creating a summary from a document is executed based on the internal structure given to the document by tagging. The summary sentence is generated according to the size of the summary display unit 304. The user can change the areas of the text display unit 303 and the summary display unit 304 by moving the partition frame 312. In other words, the summary sentence is created with a size (document length) that matches the size of the summary display unit 304 at the time summary creation is instructed.

【０１３０】ここで説明上の例として、以下のような英
語の文書にタグ付けがなされたタグファイルが受信され
た場合を想定する。タグファイルの元となる文書（プレ
ーンテキスト）は次のような文書とする。「During its centennial year, The ABC Journal will
report events of the past century that stand as m
ilestones of American business history. THREE COMP
UTERS THAT CHANGED the face of personal computing
were Iaunchedin 1977. That year the PC A II, PC B
and PC C came to market. The computers were crude
by today's standerds. PC A ll owners, for example,
had touse their television sets as screens and sto
red data on audiocassettes.But PC A ・・・・・・・
・・・・・・・（以下略）」Here, as an example for explanation, assume that a tag file in which the following English document has been tagged is received. The source document (plain text) of the tag file is as follows. "During its centennial year, The ABC Journal will report events of the past century that stand as milestones of American business history. THREE COMPUTERS THAT CHANGED the face of personal computing were launched in 1977. That year the PC A II, PC B and PC C came to market. The computers were crude by today's standards. PC A II owners, for example, had to use their television sets as screens and stored data on audiocassettes. But PC A ・・・・・・・ (rest omitted)"

【０１３１】このような文書に対しては、図１で説明し
たオーサリング装置２においてオーサリング処理が施さ
れ、タグファイルとされる。そしてサーバ３から文書処
理装置１に提供されるものである。そして文書処理装置
１は、このような文書についてのタグファイルを受信す
ると、分類処理や、図１７に示すように本文を表示する
ことができ、また以下に説明するように、要約文を作成
して表示することができる。Such a document is subjected to authoring in the authoring apparatus 2 described with reference to FIG. 1 and made into a tag file, which is then provided from the server 3 to the document processing apparatus 1. Upon receiving the tag file for such a document, the document processing apparatus 1 can perform the classification process, display the text as shown in FIG. 17, and, as described below, create and display a summary sentence.

【０１３２】上記の英語の文書のタグファイルは、例え
ば図２２又は図２３に示すように構成されている。即ち
図３、図４等により説明した文書構造を示す各種のタグ
＜＊＊＊＞〜＜／＊＊＊＞が付されたものとなってい
る。図２２の例と図２３の例の違いは、この文書につい
ての著作権情報を示すタグ＜著作権＞〜＜／著作権＞が
付されているか否かである。図２２はこのようなタグ
（以下、著作権タグという）が付されていない例である
が、図２３の例では＜著作権著者＝“T.YAMADA”＞〜
＜／著作権＞というように著作権タグが付されている。The tag file of the above English document is structured as shown in FIG. 22 or FIG. 23, for example. That is, it carries the various tags <***> to </***> that indicate the document structure described with reference to FIGS. 3, 4, and so on. The difference between the example of FIG. 22 and the example of FIG. 23 is whether the tag <copyright> to </copyright>, which indicates copyright information on this document, is attached. FIG. 22 is an example without such a tag (hereinafter referred to as a copyright tag), whereas in the example of FIG. 23 a copyright tag is attached in the form <copyright author="T.YAMADA"> to </copyright>.
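
To make the role of the copyright tag concrete, the Python sketch below looks for such a tag in the text of a tag file and returns its attributes. The exact tag-file syntax of FIG. 23 is not reproduced here, so the English tag name `copyright`, the double-quoted attribute syntax, and the function name are assumptions made for this example only.

```python
import re

# Hypothetical syntax: <copyright author="T.YAMADA"> ... </copyright>
COPYRIGHT_RE = re.compile(r'<copyright\b([^>]*)>(.*?)</copyright>', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def find_copyright(tag_file_text):
    """Return the attributes of the first copyright tag as a dict,
    or None when the tag file carries no copyright tag."""
    match = COPYRIGHT_RE.search(tag_file_text)
    if match is None:
        return None
    return dict(ATTR_RE.findall(match.group(1)))
```

A file like the FIG. 23 example would yield `{"author": "T.YAMADA"}`, while a file like the FIG. 22 example would yield `None`.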

【０１３３】この著作権タグは、文書についての著作権
情報として例えば著者、著作権者、著作権内容など、各
種の情報を示すものである。そしてこの著作権タグは、
文書プロバイダ４からの著作権情報に基づいて、オーサ
リング装置２がオーサリングを行う際に、タグファイル
内に付加するものである。なお、例えばサーバ３などに
おいてタグファイルに著作権タグを付加できるようにす
ることも当然可能である。いずれにしても、ユーザサイ
ドである文書処理装置１に提供されるタグファイルとし
ては、著作権タグが付されているものと、付されていな
いものが存在する。This copyright tag indicates various kinds of copyright information about the document, such as the author, the copyright holder, and the content of the copyright. The authoring apparatus 2 adds the copyright tag to the tag file when it performs authoring, based on copyright information from the document provider 4. It is of course also possible to allow the copyright tag to be added to the tag file at, for example, the server 3. In either case, the tag files provided to the document processing apparatus 1 on the user side include both files with a copyright tag and files without one.

【０１３４】著作権タグが付されているタグファイルを
表示する場合は、文書処理装置１の制御部１１は、著作
権タグに示される著作権情報を提示するようにしてもよ
い。例えば上述したように、分類ウインドウ２０１にお
いて或る文書を選択した状態でブラウザボタン２０２ｂ
をクリックすることで、図１７のように、文書表示部３
０３に選択された文書の本文が表示された閲覧ウインド
ウ３０１が開かれるが、この選択された文書が図２３の
ようなタグファイルであった場合は、図１９に示すよう
に、本文とともに著作権情報３２０を表示させる。When displaying a tag file that has a copyright tag, the control unit 11 of the document processing apparatus 1 may present the copyright information indicated by the copyright tag. For example, as described above, clicking the browser button 202b while a certain document is selected in the classification window 201 opens the browsing window 301 with the text of the selected document displayed in the document display unit 303, as shown in FIG. 17. If the selected document is a tag file like that of FIG. 23, the copyright information 320 is displayed together with the text, as shown in FIG. 19.

【０１３５】そして本例では、以下説明する要約文の作
成／表示の際にも、元となる本文タグファイルに著作権
タグが付加されている場合は、その著作権タグに示され
る著作権情報を表示するようにするものである。In this example, when a summary sentence is created and displayed as described below, the copyright information indicated by the copyright tag is also displayed if a copyright tag has been added to the original text tag file.

【０１３６】例えば図１７又は図１９のような閲覧ウイ
ンドウ３０１において要約作成ボタン３０６ａがクリッ
クされることにより開始される、制御部１１の要約作成
及び表示処理を図２１に示す。FIG. 21 shows the summary creation and display process of the control unit 11, which is started when the summary creation button 306a is clicked in the browsing window 301 as shown in FIG. 17 or FIG. 19.

【０１３７】図２１のステップＦ８１では、制御部１１
は活性拡散を行う。本例においては、活性拡散により得
られた中心活性値を重要度として採用することにより、
文書の要約を行うものである。タグ付けによる内部構造
を与えられた文書においては、活性拡散を行うことによ
り、各エレメントにタグ付けによる内部構造に応じた中
心活性値を付与することができる。ステップＦ８１で行
う活性拡散処理は、図７〜図９で説明したものと同様の
処理となるが、上述したように活性拡散は、中心活性値
の高いエレメントと関わりのあるエレメントにも高い中
心活性値を与えるような処理である。すなわち、活性拡
散は、照応（共参照）表現とその先行詞の間で中心活性
値が等しくなり、それ以外では中心活性値が減衰するよ
うな中心活性値についての演算である。この中心活性値
は、タグ付けによる内部構造に応じて決定されるので、
タグ付けによる内部構造を考慮した文書の分析に利用す
ることができる。In step F81 of FIG. 21, the control unit 11 performs activity diffusion. In this example, the document is summarized by adopting the central activity value obtained through activity diffusion as the degree of importance. For a document that has been given an internal structure by tagging, performing activity diffusion assigns each element a central activity value that reflects that internal structure. The activity diffusion performed in step F81 is the same processing as that described with reference to FIGS. 7 to 9. As described above, activity diffusion is processing that also gives a high central activity value to elements related to an element with a high central activity value. That is, activity diffusion is an operation on central activity values in which the central activity value becomes equal between an anaphoric (coreference) expression and its antecedent, and attenuates otherwise. Since the central activity value is determined according to the internal structure given by tagging, it can be used to analyze a document in a way that takes that internal structure into account.

【０１３８】次にステップＦ８２では、制御部１１は、
表示部３０に表示されている閲覧ウィンドウ３０１の要
約表示部３０４のサイズ、具体的にはこの要約表示部３
０４に表示可能な最大文字数をｗｓと設定する。また制
御部１１は、要約の文字列（要約文を保持する内部レジ
スタ）ｓを初期化して初期値ｓ（０）＝””と設定す
る。制御部１１は、このように設定した、最大文字数ｗ
ｓおよび文字列ｓの初期値ｓ（０）を、ＲＡＭ１４に記
録する。Next, in step F82, the control unit 11 sets ws to the size of the summary display unit 304 of the browsing window 301 displayed on the display unit 30, specifically the maximum number of characters that can be displayed in the summary display unit 304. The control unit 11 also initializes the summary character string S (an internal register that holds the summary sentence), setting the initial value S(0) = "". The control unit 11 records the maximum number of characters ws and the initial value S(0) of the character string in the RAM 14.

【０１３９】ステップＦ８３では、制御部１１は、文の
骨格の抽出処理をカウントするカウンタのカウント値ｉ
を「１」に設定する。そしてステップＦ８４で制御部１
１は、カウンタのカウント値ｉに基づいて、文章からｉ
番目に平均中心活性値の高い文の骨格を抽出する。平均
中心活性値とは、一つの文を構成する各エレメントの中
心活性値を平均したものである。制御部１１は、たとえ
ばＲＡＭ１４に記録した文字列ｓ（ｉ−１）を読み出
し、この文字列ｓ（ｉ−１）に対して、抽出した文の骨
格の文字列を加えて、Ｓ（ｉ）とする。そして制御部１
１は、このようにして得た文字列ｓ（ｉ）をＲＡＭ１４
に記録する。初回は、文字列ｓ（ｉ−１）は初期値ｓ
（０）であるので、今回抽出した文の骨格が文字列Ｓ
（ｉ）としてＲＡＭ１４に記憶されることになる。また
以降においてステップＦ８４の処理が行われる場合は、
抽出された文の骨格が文字列Ｓ（ｉ）に、それまでの文
字列Ｓ（ｉ）（つまりその時点では文字列Ｓ（ｉ−
１））に追加されていくものとなる。また同時に、制御
部１１はこのステップＦ８４において、上記文の骨格に
含まれないエレメントの中心活性値順のリストＬ（ｉ）
を作成し、このリストＬ（ｉ）をＲＡＭ１４に記録す
る。In step F83, the control unit 11 sets the count value i of the counter that counts the sentence skeleton extraction processing to "1". In step F84, based on the count value i, the control unit 11 extracts from the text the skeleton of the sentence with the i-th highest average central activity value. The average central activity value is the average of the central activity values of the elements that make up one sentence. The control unit 11 reads, for example, the character string S(i-1) recorded in the RAM 14, appends the character string of the extracted sentence skeleton to it to obtain S(i), and records the resulting character string S(i) in the RAM 14. The first time, S(i-1) is the initial value S(0), so the skeleton extracted this time is stored in the RAM 14 as the character string S(i). When step F84 is performed subsequently, the extracted skeleton is appended to the character string held up to that point (that is, S(i-1) at that time) to form S(i). At the same time, in step F84 the control unit 11 creates a list L(i) of the elements not included in the skeleton of the sentence, ordered by central activity value, and records this list L(i) in the RAM 14.

【０１４０】すなわち、このステップＦ８４において
は、要約のアルゴリズムは、活性拡散の結果を用いて、
平均中心活性値の大きい順に文を選択し、選択された文
の骨格の抽出する。文の骨格は、文から抽出した必須要
素により構成される。必須要素になりうるのは、エレメ
ントの主辞（head）と、主語（subject）、目的語（obj
ect）、間接目的語（indirect object）、所有者（poss
essor）、原因（cause）、条件（condition）または比
較（comparison）の関係属性を有する要素と、等位構造
が必須要素のときにはそれに直接含まれるエレメントと
が必須要素を構成するものである。そして、文の必須要
素をつなげて文の骨格を生成し、要約に加える。That is, in step F84 the summarization algorithm uses the result of activity diffusion to select sentences in descending order of average central activity value and extracts the skeleton of each selected sentence. The skeleton of a sentence is composed of the essential elements extracted from the sentence. The essential elements are the head of an element; elements having the relational attribute subject, object, indirect object, possessor, cause, condition, or comparison; and, when a coordinate structure is an essential element, the elements directly included in it. The essential elements of the sentence are then connected to generate the skeleton of the sentence, which is added to the summary.
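
The selection and skeleton extraction of step F84 can be sketched in Python as follows. The element representation (dictionaries with `text`, `activation`, `relation`, and `head` keys) and the function names are assumptions for illustration, and the special handling of coordinate structures is omitted for brevity. The set of relational attributes is the one listed in the text.

```python
ESSENTIAL_RELATIONS = {
    "subject", "object", "indirect object", "possessor",
    "cause", "condition", "comparison",
}


def average_activation(sentence):
    """Average central activity value of the elements of one sentence."""
    return sum(e["activation"] for e in sentence) / len(sentence)


def is_essential(element):
    """True for a head or an element with one of the essential relations."""
    return bool(element.get("head")) or element.get("relation") in ESSENTIAL_RELATIONS


def split_skeleton(sentence):
    """Return (skeleton, rest): the skeleton in sentence order, and the
    remaining elements sorted by descending central activity value, as L(i)."""
    skeleton = [e for e in sentence if is_essential(e)]
    rest = sorted((e for e in sentence if not is_essential(e)),
                  key=lambda e: e["activation"], reverse=True)
    return skeleton, rest


def rank_sentences(sentences):
    """Sentences in descending order of average central activity value."""
    return sorted(sentences, key=average_activation, reverse=True)
```

Keeping the non-skeleton elements in a list sorted by activity value mirrors the list L(i) that the control unit 11 records alongside the skeleton.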

【０１４１】、ステップＦ８５では制御部１１は、文字
列ｓ（ｉ）の長さが、閲覧ウィンドウ３０１の要約表示
部１０４の最大文字数ｗｓより大きいか否かを判断す
る。このステップＦ８５は、要約表示部３０４のサイズ
に応じた要約文を作成するための判断処理となる。In step F85, the control unit 11 determines whether the length of the character string S(i) is greater than the maximum number of characters ws of the summary display unit 304 of the browsing window 301. Step F85 is the determination that allows a summary sentence matching the size of the summary display unit 304 to be created.

【０１４２】制御部１１は、文字列ｓ（ｉ）の長さが最
大文字数ｗｓに達していないときは、処理をステップＦ
８６に進める。ステップＦ８６では制御部１１は、文書
中で、（ｉ＋１）番目に平均中心活性値が高い文のエレ
メントの中心活性値と、上記ステップＦ８４で作成した
リストＬ（ｉ）の最も中心活性値が高いエレメントの中
心活性値を比較する。つまり、上記ステップＦ８４にお
いて要約として採用された文の次に平均中心活性値が高
い文（即ち次に要約文に付加する候補となる文）と、ス
テップＦ８４において要約として採用された文の中で骨
格ではないとして要約からは排除されたエレメントの中
心活性値を比較する。If the length of the character string S(i) has not reached the maximum number of characters ws, the control unit 11 advances the process to step F86. In step F86, the control unit 11 compares the central activity value of the elements of the sentence with the (i+1)-th highest average central activity value in the document against the central activity value of the element with the highest central activity value in the list L(i) created in step F84. In other words, it compares the sentence whose average central activity value is next highest after the sentence adopted into the summary in step F84 (that is, the next candidate sentence to be added to the summary) with the elements of the adopted sentence that were excluded from the summary because they are not part of the skeleton.

【０１４３】このステップＦ８６の処理は、要約文とし
ての文字列に次に加える部位を、その直前のステップＦ
８４で採用した文において骨格として採用されなかった
ものから選ぶか、或いは他の文から選ぶかを判断する処
理となる。The processing of step F86 decides whether the part to be added next to the summary character string is chosen from the elements of the sentence adopted in the immediately preceding step F84 that were not adopted as its skeleton, or from another sentence.

【０１４４】（ｉ＋１）番目に平均中心活性値が高い文
におけるエレメントの中心活性値よりも、リストＬ
（ｉ）における最も高い中心活性値の方が、中心活性値
が高い値であった場合は、要約文としての文字列に次に
加える部位を、その直前のステップＦ８４で採用した文
において骨格として採用されなかったものから選ぶよう
にする。このため制御部１１の処理はステップＦ８８に
進み、リストＬ（ｉ）における最も中心活性値が高いエ
レメントを、その時点で記憶されている文字列Ｓ（ｉ）
に加え、文字列ＳＳ（ｉ）とする。またこのとき、文字
列ＳＳ（ｉ）に加えたエレメントをリストＬ（ｉ）から
削除する。そして、ステップＦ８９において、文字列Ｓ
Ｓ（ｉ）が、最大文字数ｗｓより大きいか否かを判断
し、大きくなければステップＦ８６に戻る。If the highest central activity value in the list L(i) is higher than the central activity value of the elements of the sentence with the (i+1)-th highest average central activity value, the part to be added next to the summary character string is chosen from the elements of the sentence adopted in the immediately preceding step F84 that were not adopted as its skeleton. The control unit 11 therefore proceeds to step F88, where the element with the highest central activity value in the list L(i) is appended to the character string S(i) stored at that time to form the character string SS(i). The element appended to form SS(i) is deleted from the list L(i) at this time. Then, in step F89, it is determined whether the length of the character string SS(i) is greater than the maximum number of characters ws; if it is not, the process returns to step F86.

【０１４５】ステップＦ８６において、（ｉ＋１）番目
に平均中心活性値が高い文のエレメントとして、リスト
Ｌ（ｉ）における最も高い中心活性値よりも中心活性値
が高いエレメントがあった場合は、要約文としての文字
列に次に加える部位を、その直前のステップＦ８４で採
用した文とは別の文から選ぶこととしてステップＦ８７
でカウント値ｉをインクリメントしてステップＦ８４に
戻ることになる。つまりステップＦ８６で、（ｉ＋１）
番目に平均中心活性値が高い文とされた文について、ス
テップＦ８４で骨格を抽出し、それを文字列Ｓ（ｉ）に
加えるようにする。If, in step F86, the sentence with the (i+1)-th highest average central activity value contains an element whose central activity value is higher than the highest central activity value in the list L(i), the part to be added next to the summary character string is chosen from a sentence other than the one adopted in the immediately preceding step F84. The count value i is therefore incremented in step F87 and the process returns to step F84. That is, for the sentence identified in step F86 as having the (i+1)-th highest average central activity value, the skeleton is extracted in step F84 and appended to the character string S(i).

【０１４６】以上のように、ステップＦ８４又はステッ
プＦ８８で文の骨格となるエレメントやその他のエレメ
ントとして、中心活性値の高いものを基準として文字列
に加えていきながら、ステップＦ８５又はステップＦ８
９で、文字列Ｓ（ｉ）又はＳＳ（ｉ）を最大文字数ｗｓ
と比較していくことで、最大文字数ｗｓに近いが最大文
字数ｗｓを越えない文字列を作成していくことになる。As described above, elements with high central activity values, whether skeleton elements or other elements, are appended to the character string in step F84 or step F88, while the character string S(i) or SS(i) is compared with the maximum number of characters ws in step F85 or step F89. In this way, a character string is built up that is close to, but does not exceed, the maximum number of characters ws.

【０１４７】例えばステップＦ８５で文字列Ｓ（ｉ）が
最大文字数ｗｓを越えた場合は、制御部１１の処理はス
テップＦ９０に進み、直前のステップＦ８４で骨格を加
える前の文字列Ｓ（ｉ−１）を、要約文とする。つま
り、これはステップＦ８４で文の骨格を加えたことによ
り、最大文字数ｗｓを越えてしまったことになるため、
その骨格を加える前の文字列Ｓ（ｉ−１）が、最大文字
数ｗｓに近いが最大文字数ｗｓを越えない文字列である
と判断して、それを要約文とするものである。For example, if the character string S(i) exceeds the maximum number of characters ws in step F85, the control unit 11 proceeds to step F90 and takes the character string S(i-1), as it was before the skeleton was appended in the immediately preceding step F84, as the summary sentence. In other words, appending the sentence skeleton in step F84 caused the maximum number of characters ws to be exceeded, so the character string S(i-1) before the skeleton was appended is judged to be the character string that is close to, but does not exceed, ws, and is used as the summary sentence.

【０１４８】なお、このため初めてステップＦ８４で文
字列Ｓ（ｉ）を生成した時点（ｉ＝１の時点）で、ステ
ップＦ８５で、文字列Ｓ（ｉ）が最大文字数ｗｓを越え
た場合は、文字列Ｓ（ｉ−１）は、ステップＦ８２で設
定した初期値としての文字列Ｓ（０）となるため、実質
的に要約文は作成できなかったことになる。これは、要
約表示部３０４のサイズが小さすぎたことに起因するた
め、ユーザーは画面上で要約表示部３０４の面積を広げ
た上で、再度、要約作成ボタン３０６ａをクリックし
て、図１９の処理が開始されるようにすればよい。Consequently, if the character string S(i) exceeds the maximum number of characters ws in step F85 the first time a character string S(i) is generated in step F84 (that is, when i = 1), the character string S(i-1) is the initial value S(0) set in step F82, and in effect no summary sentence has been created. This happens because the summary display unit 304 is too small, so the user should enlarge the summary display unit 304 on the screen and click the summary creation button 306a again to restart the summary creation process.

【０１４９】ステップＦ８５で文字列Ｓ（ｉ）が最大文
字数ｗｓを越えていない場合は、上述のように制御部１
１の処理はステップＦ８６に進み、次に文字列に加える
部分を判断することになる。そして上記のようにステッ
プＦ８９に進んだ場合は、文字列ＳＳ（ｉ）が最大文字
数ｗｓを越えたか否かを判別する。ここで文字列ＳＳ
（ｉ）が最大文字数ｗｓを越えた場合は、制御部１１の
処理はステップＦ９１に進み、直前のステップＦ８８で
或るエレメントを加える前の文字列Ｓ（ｉ）を、要約文
とすることになる。つまり、これはステップＦ８８でエ
レメントを加えたことにより、最大文字数ｗｓを越えて
しまったことになるため、そのエレメントを加える前の
文字列Ｓ（ｉ）が、最大文字数ｗｓに近いが最大文字数
ｗｓを越えない文字列であると判断して、それを要約文
とするものである。If the character string S(i) does not exceed the maximum number of characters ws in step F85, the control unit 11 proceeds to step F86, as described above, and determines the part to be added to the character string next. When the process then proceeds to step F89 as described above, it is determined whether the character string SS(i) exceeds the maximum number of characters ws. If SS(i) exceeds ws, the control unit 11 proceeds to step F91 and takes the character string S(i), as it was before the element was appended in the immediately preceding step F88, as the summary sentence. In other words, appending the element in step F88 caused the maximum number of characters ws to be exceeded, so the character string S(i) before that element was appended is judged to be the character string that is close to, but does not exceed, ws, and is used as the summary sentence.

[0150] By the above processing, a summary that fits the size of the summary display unit 304 at that time is created. The summary consists of the skeletons of one or more sentences with high average central activation values, together with elements other than the skeleton that have high central activation values.

[0151] Once a document serving as the summary has been generated in this way, the control unit 11 determines in step F92 whether a copyright tag is attached to the body tag file from which the summary was created, that is, whether the body tag file is like the example of FIG. 22 or like the example of FIG. 23. If no copyright tag is attached, as in FIG. 22, the control unit 11 stores the summary created as described above (the summary tag file) in the RAM 14, the HDD 34, or the like in step F96, and in step F97 displays the summary on the summary display unit 304, for example as shown in FIG. 18.

[0152] On the other hand, if a copyright tag is attached to the body tag file from which the summary was created, as in FIG. 23, the control unit 11 in step F93 adds the copyright tag attached to the body tag file to the summary created as described above (the summary tag file). In step F94, the control unit 11 stores the summary with the copyright tag (the summary tag file) in the RAM 14, the HDD 34, or the like. In step F95, the control unit 11 then displays the summary on the summary display unit 304, for example as shown in FIG. 20, and also displays the copyright information 320 in association with the summary, as illustrated.
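The branch on the copyright tag (steps F92 to F97) might be sketched as follows. The `<copyright>...</copyright>` syntax is purely hypothetical, since the actual tag format of FIG. 23 is not reproduced in this text, and the function name `add_copyright_to_summary` is likewise an assumption.

```python
import re
from typing import Optional, Tuple

# Hypothetical tag syntax; the real format of FIG. 23 is not shown here.
COPYRIGHT_TAG = re.compile(r"<copyright>(.*?)</copyright>", re.DOTALL)


def add_copyright_to_summary(
    body_tag_file: str, summary_tag_file: str
) -> Tuple[str, Optional[str]]:
    """Return (summary tag file, copyright information to display).

    If the body tag file carries a copyright tag, the tag is copied onto
    the summary tag file and its content is returned for display next to
    the summary. Otherwise the summary is returned unchanged with None.
    """
    match = COPYRIGHT_TAG.search(body_tag_file)
    if match is None:
        # No copyright tag: store and display the summary as it is.
        return summary_tag_file, None
    # Copy the whole tag onto the summary and extract the text to show.
    tagged_summary = match.group(0) + "\n" + summary_tag_file
    return tagged_summary, match.group(1).strip()
```

Returning the display text separately from the tagged file reflects the point that display and tagging are two distinct actions: the copyright information can be shown with the summary whether or not the tag is also written into the summary tag file.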

[0153] As described above, in this example, when a summary is generated and displayed, copyright information based on the copyright tag attached to the body is displayed along with the summary. The user can therefore check the original copyright information even for document data derived from the body tag file, such as a summary.

[0154] In the processing example of FIG. 21, the copyright tag is added to the summary tag file in step F94. As a result, even when only the summary tag file is transmitted from the document processing apparatus 1 to another apparatus and used for some kind of document processing, the copyright information of the original document data is made explicit, which is appropriate. However, for the sole purpose of displaying the copyright information 320 along with the summary as in FIG. 20, it is not strictly necessary to add a copyright tag to the summary tag file, so the processing of step F94 may be omitted.

[0155] If the user, looking at the summary displayed as in FIG. 18 or FIG. 20, wants to see a more detailed summary or a shorter one, the user only has to increase or decrease the size (area) of the summary display unit 304 in the browsing window 301 and click the summary creation button 306a again. The processing of FIG. 21 described above then creates and displays a summary whose length matches the size of the summary display unit 304 at that time.

[0156] 7. Functional Block Configuration of the Document Processing Apparatus
The classification processing and the summary creation/display processing realized in the document processing apparatus 1 of this example have been described above, together with the fact that copyright information is also displayed when a summary is displayed. In addition to these, the document processing apparatus of this example can, for example, display a body or a summary as a telop, read a body or a summary aloud, and output video data related to an electronic document. The following describes the functional blocks that realize these operations, formed for example as a software configuration (or a hardware configuration) and a file group configuration within the control unit 11, and gives examples of how each operation behaves when a copyright tag is attached to the tag file being processed.

[0157] FIG. 24 shows the functional blocks, implemented in software (or hardware), and the file group in the document processing apparatus 1. FIG. 24 shows only the parts related to output from the display unit 30 or the audio output unit 33; functional blocks that realize, for example, document reception processing and classification processing are omitted.

[0158] As illustrated, the functional blocks include a speech synthesis engine 601, a telop creation engine 602, a summary creation engine 603, and a video engine 604. An audio control unit 605 is provided as the processing unit that outputs audio signals to the audio output unit 33, and a display control unit 606 is provided as the processing unit that outputs image signals to the display unit 30. A user interface 607 is also provided to handle user operations on the buttons in the various displayed windows (operations such as mouse clicks via the input unit 20). Finally, a controller 600 is formed to control these functional blocks.

[0159] The files are a reading file 608 used for read-aloud processing, a body tag file 609, a summary tag file 610, a video file 611, and a video output file 612. The body tag file 609 and the video file 611 are taken in from the communication unit 21 or from the recording/reproducing unit 31 (recording medium 32). The summary tag file 610 is generated from the body tag file 609 by the summary creation engine 603; that is, the summary creation engine 603 executes the processing of FIG. 21 described above to generate the summary tag file 610.

[0160] The reading file 608 is generated by converting the body tag file 609 or the summary tag file 610. As described above, document data (a tag file) is generated in the authoring apparatus 2, and the authoring apparatus 2 can also attach the tags needed for speech synthesis. Alternatively, the document processing apparatus 1 can, after receiving a tag file, newly attach the tags needed for speech synthesis to that document. The reading file 608 is generated by deriving attribute information for reading aloud from the tags in the tag file and embedding that attribute information in the document data. Specifically, it is a file that holds information such as the breaks (pause periods) between the read-aloud portions of the document and reading (pronunciation) designations.

[0161] The video output file 612 is generated by converting the body tag file 609 or the summary tag file 610, specifically by extracting the video tags. A video tag is information that specifies the corresponding video data for the whole document of a tag file or for each part of it, such as a paragraph. In other words, the video output file is a file that specifies the video data corresponding to given document data.
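Extracting video tags into a video output file could look roughly like the sketch below. The real video tag format is not given in this text, so the sketch assumes a well-formed XML-like tag file in which a hypothetical `video` attribute names the video data for the whole document or for a part such as a paragraph. The function name `build_video_output_file` is also an assumption.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_video_output_file(tag_file: str) -> List[Tuple[str, str]]:
    """List (document part, video reference) pairs found in a tag file.

    Assumption: video tags are modelled as a `video` attribute on the
    element they apply to. Each part is identified by its element name
    plus its position in document order.
    """
    root = ET.fromstring(tag_file)
    entries: List[Tuple[str, str]] = []
    # root.iter() walks the root and all descendants in document order.
    for index, element in enumerate(root.iter()):
        video = element.get("video")
        if video is not None:
            entries.append((f"{element.tag}[{index}]", video))
    return entries
```

A video engine could then walk the returned list to decide which video data to read for the part of the document currently selected.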

[0162] The processing of the functional blocks of FIG. 24 that realizes each of the operations described above, and processing examples for the case where a copyright tag is attached, are described below.

[0163] ・Body display processing
As shown in FIGS. 17 and 19, when the body of certain document data is displayed, the information of the selected tag file, that is, a body tag file 609, is supplied to the display control unit 606. The display control unit 606 combines the character data of the body with the image of the browsing window 300 and displays it on the display unit 30 as in FIGS. 17 and 19. As described above, if a copyright tag is attached to the body tag file, the copyright information indicated by the copyright tag is displayed at the same time. A processing example in which the copyright information is not displayed is also conceivable.

[0164] ・Read-aloud processing for a body or summary
To read a body or a summary aloud, the reading file 608 is first generated from the body tag file 609 or the summary tag file 610. The speech synthesis engine 601 then refers to the reading file 608 according to an instruction from the controller 600 and performs speech synthesis based on it. The generated synthesized speech signal (read-aloud speech signal) Yout undergoes output level adjustment and the like in the audio control unit 605 and is supplied to the audio output unit 33 for output. The controller 600 also causes the display control unit 606 to output the required image signal as the window used during reading, and the window is displayed on the display unit 30. Information on user operations at that time is captured by the user interface 607 and passed to the controller 600, which controls the operation of the speech synthesis engine 601 according to the user operation. If a copyright tag is attached to the body tag file 609 or summary tag file 610 being read aloud, the display control unit 606 displays the copyright information indicated by that copyright tag on the window displayed during reading. The copyright information may also be presented by read-aloud speech.

[0165] ・Summary creation processing
In the summary creation processing described with reference to FIG. 21, the controller 600 instructs the summary creation engine 603 to create a summary of the body tag file 609. The summary tag file is formed as a result. The controller 600 passes the size information of the summary display unit 304 to the summary creation engine 603, so that summary generation matched to the size of the summary display unit 304 is performed as described above.

[0166] ・Fixed summary display processing
As explained for the summary creation processing of FIG. 21, the summary tag file 610 is generated with a document length that matches the size of the summary display unit 304, and it is normally displayed in a fixed manner. In this case, the generated summary tag file 610 is processed by the summary creation engine 603 as the display document output Sout and supplied to the display control unit 606. The display control unit 606 combines it with the image of the browsing window 300, and it is displayed on the display unit 30 as in FIGS. 18 and 20. As described above, if a copyright tag is attached to the body tag file from which the summary tag file was created, the copyright information indicated by the copyright tag is displayed at the same time.

[0167] ・Telop display processing for a summary or body
In this example, a body or a summary can also be displayed as a telop, regardless of factors such as the size of the summary display unit 304. In that case, the telop creation engine 602 converts the body tag file 609 or the summary tag file 610 into telop form and outputs it sequentially as the telop display document output Tout. The display control unit 606 combines the telop display document output Tout with the image of the browsing window 300 or another required window so that the telop display proceeds on the display unit 30. When a telop is displayed in this way, if a copyright tag is attached to the body tag file being displayed as a telop, or to the summary tag file being displayed as a telop (or the body tag file from which it was created), the copyright information indicated by the copyright tag is displayed on the window in which the telop is shown.

[0168] ・Video output processing
Video output processing is performed by the video engine 604 according to instructions from the controller 600 based on information from the user interface 607. The video engine 604 refers to the video output file 612 generated from the body tag file 609 or the summary tag file 610, determines the video data to be reproduced, and reads the video file 611. The video engine 604 processes the read video data into the output video signal Vout and supplies it to the display control unit 606, which combines it with the image of the video window for display on the display unit 30. The video engine 604 likewise processes the audio data contained in the video data into the output audio signal Aout and supplies it to the audio control unit 605, where level adjustment and the like are performed, and the audio output unit 33 outputs it as reproduced sound. During video output, information on user operations on the video window 501 is captured by the user interface 607 and passed to the controller 600, which controls the operation of the video engine 604 according to the user operation. When video data corresponding to certain document data is output in this way, if a copyright tag is attached to the tag file of that document data, the copyright information indicated by the copyright tag is displayed on the window in which the video data is output.

[0169] As described above, the document processing apparatus 1 of this example creates and displays summaries of document data, displays a body or summary as a telop, reads a body or summary aloud, and outputs video data, and it presents copyright information according to the copyright tag.

[0170] The functional blocks of FIG. 24 are only an example; the configuration and operation of the functional blocks that realize the above operations are not necessarily limited to this example.

[0171] The document processing apparatus 1 has been described above as an embodiment, but its hardware or software configuration and its processing can take various forms.

[0172] For example, copyright information based on a copyright tag, as illustrated in FIG. 20, can be presented in various ways. Taking display together with a summary as an example, the copyright information may be displayed continuously for as long as the summary is displayed, or it may be displayed temporarily. Conceivable display modes include showing the copyright information only for several seconds to several minutes when summary display starts, or showing it every few minutes. It is also suitable to display the copyright information when the user performs some operation on the body or summary, for example a click to designate a sentence or word, an editing operation, or data movement (copying or moving to another recording medium, or transmission).
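The time-based display modes mentioned above (continuous, only at the start, or at intervals) can be compared with a small sketch. The mode names, the function `copyright_visible`, and the default durations of 5 seconds and 180 seconds are illustrative assumptions; the text only speaks of "several seconds to several minutes" and "every few minutes".

```python
def copyright_visible(
    elapsed_s: float,
    mode: str,
    show_s: float = 5.0,
    period_s: float = 180.0,
) -> bool:
    """Decide whether copyright information is shown at a given moment.

    elapsed_s is the time in seconds since the summary display started.
    Modes (names are assumptions for this sketch):
      "continuous": shown for as long as the summary is displayed
      "at_start":   shown only for the first show_s seconds
      "periodic":   shown for show_s seconds at the start of every period_s
    """
    if mode == "continuous":
        return True
    if mode == "at_start":
        return elapsed_s < show_s
    if mode == "periodic":
        return (elapsed_s % period_s) < show_s
    raise ValueError(f"unknown display mode: {mode}")
```

Display triggered by a user operation (a click, an edit, a copy or transmission) would be handled by an event handler rather than by elapsed time, so it is left out of this sketch.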

[0173] The above example shows the author's name displayed as the copyright information, but the content of the copyright information can vary. For example, finer-grained copyright information may be shown, such as the right of reproduction, the right of public transmission, and rights concerning the use of derivative works, and information such as the license period for display, reproduction, transmission, and so on may be presented.

[0174] The specific devices that make up the document processing apparatus 1 can also vary. Taking the input unit 20 of the document processing apparatus 1 as an example, devices other than a keyboard and mouse are conceivable, such as a tablet, a light pen, or a wireless commander device using infrared rays or the like.

[0175] In the embodiment, document data and video data were described as being transmitted to the communication unit 21 from outside via a telephone line or the like, but the present invention is not limited to this. For example, the invention is also applicable when document data is transmitted via a satellite or the like; the document data may also be read from the recording medium 32 by the recording/reproducing unit 31, or document data and the like may be written in the ROM 15 in advance.

[0176] The embodiment showed one example of a method of tagging a document, but the present invention is of course not limited to that tagging method. Furthermore, although English text was used as the example in the embodiment described above, the present invention is not limited to English and can be applied to document data in Japanese and other languages. The present invention can thus be modified as appropriate without departing from its spirit.

[0177] Furthermore, the present invention provides, as the recording medium 32, a disk-shaped recording medium, a tape-shaped recording medium, or the like on which an operation control program that executes the summary creation/display processing described above is written. The recording medium 32 may of course be, besides a floppy disk, an optical disk, a magneto-optical disk, a magnetic tape, a memory card using flash memory or the like, a memory chip, and so on. The HDD 34 shown in FIG. 1 can likewise serve as the recording medium of the present invention. The operation control program can also be provided via network communication such as the Internet, so the present invention is also applicable to a recording medium on the program server side or in the communication path.

[0178] With such a recording medium 32 or the like, the operation control program is provided to the document processing apparatus 1, so that each document processing apparatus 1 on the user side can present copyright information according to the copyright tag as described above, for example by display or by voice. In addition, if the operation control program also covers the classification processing and the display of the various windows described above, the document processing apparatus 1 that realizes the document processing method described above can easily be implemented using, for example, a general-purpose personal computer.

[0179]

[Effects of the Invention] As can be understood from the above description, the present invention provides the following effects. According to the document processing apparatus and document processing method of the present invention, when a summary created by performing document processing on an electronic document is presented and output, the copyright information of the original electronic document is presented, so the user of the document processing apparatus can learn who holds the copyright of the electronic document. Conversely, the copyright holder can present the copyright of the original electronic document to the user even when an electronic document derived from it is presented. This is useful from the standpoint of both the effective use of electronic documents and copyright protection. Furthermore, adding the copyright information to the tag file of the summary makes it possible to state the copyright of the original electronic document explicitly even when, for example, only the summary file is transmitted and handled on its own.

[0180] According to the recording medium of the present invention, a program that realizes the document processing apparatus of the present invention can be provided, and the document processing apparatus of the present invention can easily be realized using, for example, a general-purpose personal computer.

[Brief description of the drawings]

FIG. 1 is an explanatory diagram of the configuration of a document processing system according to an embodiment of the present invention.

FIG. 2 is a block diagram of the document processing apparatus of the embodiment.

FIG. 3 is an explanatory diagram of the document structure used in the embodiment.

FIG. 4 is an explanatory diagram of a window that displays the text structure in the embodiment.

FIG. 5 is a flowchart of the manual classification processing of the embodiment.

FIG. 6 is a flowchart of the index creation processing of the embodiment.

FIG. 7 is an explanatory diagram of the activation values of elements in the embodiment.

FIG. 8 is a flowchart of the spreading activation processing of the embodiment.

FIG. 9 is a flowchart of the central activation value update processing of the embodiment.

FIG. 10 is an explanatory diagram of the classification window of the embodiment.

FIG. 11 is an explanatory diagram of the browsing window of the embodiment.

FIG. 12 is an explanatory diagram of the classification model of the embodiment.

FIG. 13 is a flowchart of the automatic classification processing of the embodiment.

FIG. 14 is a flowchart of the automatic classification of the embodiment.

FIG. 15 is a flowchart of the processing for calculating the degree of association between word senses in the embodiment.

FIG. 16 is an explanatory diagram of the degree of association between word senses in the embodiment.

FIG. 17 is an explanatory diagram of a display example of the browsing window of the embodiment.

FIG. 18 is an explanatory diagram of a display example of the browsing window of the embodiment that includes a summary.

FIG. 19 is an explanatory diagram of a display example of the browsing window of the embodiment that includes copyright information.

FIG. 20 is an explanatory diagram of a display example of the browsing window of the embodiment that includes a summary and copyright information.

FIG. 21 is a flowchart of the summary creation/display processing of the embodiment.

FIG. 22 is an explanatory diagram of an example tag file of the embodiment.

FIG. 23 is an explanatory diagram of an example tag file of the embodiment.

FIG. 24 is an explanatory diagram of the configuration of the document processing system of the embodiment.

[Explanation of symbols]

1 document processing apparatus, 2 authoring apparatus, 3 server, 3a database, 4 document provider, 6 communication line, 7 service providing unit, 11 control unit, 13 CPU, 14 RAM, 15 ROM, 12 interface, 21 communication unit, 20 input unit, 30 display unit, 31 recording/reproducing unit, 32 recording medium, 34 HDD

Claims

[Claims]

1. A document processing apparatus comprising: summary creation means capable of creating a summary of an electronic document; and output control means that performs control to present and output the summary created by the summary creation means and, when a tag indicating copyright information is added to the electronic document from which the summary was created, performs control so that the copyright information indicated by the tag is also presented and output.

2. The document processing apparatus according to claim 1, wherein, when a tag indicating copyright information is added to the electronic document from which the summary is created, the summary creation means adds the tag indicating the copyright information to the created summary.

3. A document processing method comprising: a summary creation procedure of creating a summary of an electronic document; and an output control procedure of presenting and outputting the summary created in the summary creation procedure and, when a tag indicating copyright information is added to the electronic document from which the summary was created, also presenting and outputting the copyright information indicated by the tag.

4. The document processing method according to claim 3, wherein, in the summary creation procedure, when a tag indicating copyright information is added to the electronic document from which the summary is created, the tag indicating the copyright information is added to the created summary.

5. A recording medium on which is recorded an operation control program that causes execution of: a summary creation procedure of creating a summary of an electronic document; and an output control procedure of presenting and outputting the summary created in the summary creation procedure and, when a tag indicating copyright information is added to the electronic document from which the summary was created, also presenting and outputting the copyright information indicated by the tag.

6. The recording medium according to claim 5, wherein the recorded operation control program is such that, in the summary creation procedure, when a tag indicating copyright information is added to the electronic document from which the summary is created, the tag indicating the copyright information is added to the created summary.