JP2000293533A

JP2000293533A - Method and device for processing document and recording medium

Info

Publication number: JP2000293533A
Application number: JP11100653A
Authority: JP
Inventors: Katashi Nagao; 確長尾
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1999-04-07
Filing date: 1999-04-07
Publication date: 2000-10-20
Anticipated expiration: 2019-04-07
Also published as: JP4345129B2

Abstract

PROBLEM TO BE SOLVED: To provide a document processing method and device that calculate a user's degree of interest in a document, and a recording medium on which a document processing program that calculates a user's degree of interest in a document is recorded. SOLUTION: In a document processing device that processes documents having a structure made up of a plurality of elements, when summary elements of a document displayed on a display part 30 are selected and entered through an input part 20, a control part 11 in a body 10 calculates the degree of real interest in the document. The calculation is based on the position of the selected element farthest from the beginning of the document, with each element position measured from the beginning of the document.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a document processing method and apparatus for processing electronic documents, and to a recording medium on which a document processing program for processing electronic documents is recorded.

[0002]

[Description of the Related Art] Conventionally, the WWW (World Wide Web) has been provided on the Internet as an application service that offers hypertext information in a window format.

[0003] The WWW is a system that performs document processing such as creating, publishing, and sharing documents, and it showed what a new style of document could be. From the standpoint of practical use, however, document processing more advanced than the WWW offers is required, such as classifying and summarizing documents on the basis of their content. Such advanced document processing depends on mechanical processing of document content.

[0004] Mechanical processing of document content nevertheless remains difficult, for the following reasons. First, HTML (Hyper Text Markup Language), the language used to describe hypertext, specifies how a document is presented but says almost nothing about its content. Second, the hypertext network formed among documents is not necessarily easy for a reader to use in understanding their content. Third, authors generally write without the reader's convenience in mind, and the reader's convenience is never reconciled with the author's.

[0005] Thus the WWW is a system that showed what a new kind of document could be, but because it does not process documents mechanically, it could not perform advanced document processing. In other words, advanced document processing requires that documents be processed mechanically.

[0006] Systems that support mechanical processing of documents have therefore been developed on the basis of natural language research. One form of document processing proposed in that research is mechanical processing that uses tags attached to a document, on the premise that the author or another person has attached attribute information about the internal structure of the document, that is, tags.

[0007]

[Problems to Be Solved by the Invention] With the recent spread of computers and the progress of networking, there is demand for more capable document processing that creates, labels, and modifies text documents, for example through text processing and indexing that depends on document content. Summarizing and classifying documents according to the user's wishes are examples of what is desired.

[0008] The present invention has been proposed in view of the above circumstances. It relates to a document processing method and apparatus that calculate a user's degree of interest in a document, and to a recording medium on which a document processing program that calculates a user's degree of interest in a document is recorded.

[0009]

[Means for Solving the Problems] To solve the above problems, a document processing method according to the present invention processes a plurality of electronic documents and has a real interest detection step of detecting the degree of real interest in each electronic document, and a priority setting step of setting a priority for each electronic document on the basis of the degree of real interest detected in the real interest detection step.
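The two steps named above, detecting a degree of real interest per document and then setting priorities from it, can be sketched as follows. This is only an illustration under stated assumptions: the `Document` class, the numeric `real_interest` field, and the rule "higher interest means higher priority" are hypothetical and are not specified in this part of the text.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    real_interest: float  # hypothetical score; assumed to be already detected


def set_priorities(documents):
    """Return (priority, document) pairs; priority 1 goes to the highest real interest."""
    ranked = sorted(documents, key=lambda d: d.real_interest, reverse=True)
    return list(enumerate(ranked, start=1))


docs = [Document("tax cut", 0.4), Document("merger", 0.9), Document("weather", 0.1)]
for priority, doc in set_priorities(docs):
    print(priority, doc.title)
```

Sorting is used here because a priority order is simply a ranking; any other mapping from interest to priority would fit the same two-step shape.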

[0010] A document processing apparatus according to the present invention processes a plurality of electronic documents and has real interest detecting means for detecting the degree of real interest in each electronic document, and priority setting means for setting a priority for each electronic document on the basis of the degree of real interest detected by the real interest detecting means.

[0011] A recording medium according to the present invention has recorded on it a document processing program for processing a plurality of electronic documents. The document processing program has a real interest detection process of detecting the degree of real interest in each electronic document, and a priority setting process of setting a priority for each electronic document on the basis of the degree of real interest detected in the real interest detection process.

[0012]

[Embodiments of the Invention] Embodiments of the document processing method and apparatus and of the recording medium according to the present invention are described below with reference to the drawings.

[0013] As shown in FIG. 1, a document processing apparatus according to an embodiment of the present invention has a main body 10 that includes a control unit 11 and an interface 12, an input unit 20 that receives input from the user and sends it to the main body 10, a receiving unit 21 that receives signals from outside and sends them to the main body 10, a display unit 30 that displays output from the main body 10, and a recording/reproducing unit 31 that records information on and reproduces information from a recording medium 32.

[0014] The main body 10 has the control unit 11 and the interface 12 and forms the principal part of the document processing apparatus. The control unit 11 has a CPU 13 that executes the processing of the apparatus, a RAM 14 that is volatile memory, and a ROM 15 that is nonvolatile memory. The CPU 13 performs control for executing programs, for example following procedures recorded in the ROM 15 and temporarily storing data in the RAM 14 when necessary. The interface 12 is connected to the control unit 11, the input unit 20, the receiving unit 21, the display unit 30, and the recording/reproducing unit 31. Under the control of the control unit 11, the interface 12 adjusts the timing of data transfer and converts data formats for data input from the input unit 20 and the receiving unit 21, data transmission to the display unit 30, and data exchange with the recording/reproducing unit 31.

[0015] The input unit 20 receives the user's input to the document processing apparatus. It consists, for example, of a keyboard and a mouse. Using the input unit 20, the user can enter keywords with the keyboard, or use the mouse to select and enter elements of the electronic document displayed on the display unit 30. In what follows, an electronic document is simply called a document. An element is a constituent of a document; elements include, for example, documents, sentences, and words.

[0016] The receiving unit 21 receives signals sent to the document processing apparatus from outside, for example over a communication line. It receives a plurality of documents sent from outside and passes the received data to the main body 10.

[0017] The display unit 30 displays text and image output from the document processing apparatus. It consists, for example, of a cathode ray tube (CRT) or a liquid crystal display (LCD), and displays, for example, one or more windows in which characters, figures, and the like are shown.

[0018] The recording/reproducing unit 31 records data on and reproduces data from a recording medium 32 such as a floppy disk. A document processing program for processing documents is recorded on the recording medium 32. The recording medium 32 is described further below.

[0019] Next, documents as used in this embodiment are described. In this embodiment, document processing refers to tags, which are attribute information attached to a document. The tags used in this embodiment are syntactic tags, which indicate the structure of a document, and semantic and pragmatic tags, which make mechanical understanding of document content possible across multiple languages.

[0020] Some syntactic tags describe the internal structure of a document. As shown in FIG. 2, the internal structure given by tagging consists of elements such as documents, sentences, and lexical elements connected by normal links and by reference links. In the figure, each white circle ("○") represents an element, and the white circles at the lowest level are lexical elements, which correspond to words, the smallest units in the document. Solid lines are normal links, which show connections between elements such as documents, sentences, and lexical elements. Broken lines are reference links, which show dependency relations between referring and referred-to elements. From top to bottom, the internal structure of a document consists of the document, subdivisions, paragraphs, sentences, subsentential segments, and so on down to lexical elements. Of these, subdivisions and paragraphs are optional.
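The internal structure described above, a hierarchy of elements joined by normal links plus reference links between referring and referred-to elements, could be modeled roughly as below. This is a minimal sketch, not the apparatus's data structure: the `Element` class, the `LEVELS` list, and the field names are assumptions made for illustration, and "lexical element" is used for the lowest-level (vocabulary) elements.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Levels from highest to lowest, as listed in the text (assumed labels).
LEVELS = ["document", "subdivision", "paragraph", "sentence",
          "subsentential segment", "lexical element"]


@dataclass(eq=False)  # compare elements by identity, not by field values
class Element:
    kind: str
    text: str = ""
    children: List["Element"] = field(default_factory=list)  # normal links
    refers_to: Optional["Element"] = None                     # reference link

    def add(self, child: "Element") -> "Element":
        # A child may not sit at a higher level than its parent.
        if LEVELS.index(child.kind) < LEVELS.index(self.kind):
            raise ValueError(f"{child.kind} cannot be placed under {self.kind}")
        self.children.append(child)
        return child

    def lexical_elements(self) -> List["Element"]:
        """Collect the lowest-level elements in document order."""
        if self.kind == "lexical element":
            return [self]
        found = []
        for child in self.children:
            found.extend(child.lexical_elements())
        return found


doc = Element("document")
s1 = doc.add(Element("sentence"))  # subdivisions and paragraphs are optional
meeting = s1.add(Element("lexical element", "meeting"))
s1.add(Element("lexical element", "ended"))
s2 = doc.add(Element("sentence"))
it = s2.add(Element("lexical element", "it"))
it.refers_to = meeting             # reference link to the referred-to element

print([e.text for e in doc.lexical_elements()])
```

Keeping the reference link as a separate field, rather than as another child, mirrors the distinction between solid and broken lines in the figure.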

[0021] Semantic and pragmatic tags, on the other hand, include tags that describe information such as meaning, for example which sense of a polysemous word is intended. Tagging in this embodiment uses the format of XML (Extensible Markup Language), which resembles HTML (Hyper Text Markup Language).

[0022] An example of tagging follows, although tagging of documents is not limited to this method. The examples below use English and Japanese documents, but the description of internal structure by tagging applies equally to other languages.

[0023] For example, the sentence "Time flies like an arrow." can be tagged as follows.

[0024]
<sentence><noun-phrase word-sense="time0">time</noun-phrase>
<verb-phrase><verb word-sense="fly1">flies</verb><adjective-verb-phrase><adjective-verb word-sense="like0">like</adjective-verb><noun-phrase>an <noun word-sense="arrow0">arrow</noun></noun-phrase></adjective-verb-phrase></verb-phrase>.</sentence>

Here <sentence>, <noun>, <noun-phrase>, <verb>, <verb-phrase>, <adjective-verb>, and <adjective-verb-phrase> express the syntactic structure of the sentence: respectively, a sentence; a noun; a noun phrase; a verb; a verb phrase; a prepositional or postpositional phrase containing an adjective, or an adjective phrase; and an adjective phrase or adjectival-verb phrase. A tag is placed immediately before the start of an element and immediately after its end. The tag placed immediately after the end of an element marks the end of the element with the symbol "/". Elements represent syntactic constituents, that is, phrases, clauses, and sentences. The attribute word-sense="time0" refers to sense number 0 among the several senses of the word "time". Specifically, "time" has at least noun, adjective, and verb senses, and the attribute indicates that here "time" is a noun. Similarly, the word "orange" has at least the senses of a plant name, a color, and a fruit, and these too can be distinguished by word sense.
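A short sketch of how such tagging could be read mechanically is given below, using Python's standard `xml.etree.ElementTree`. The English tag names (`noun-phrase`, `adjective-verb` and so on), the `word-sense` attribute name, and the spaces placed between words are assumptions made for this illustration; the text fixes only the general idea of paired tags with a word-sense attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical English rendering of the tagged sentence.
TAGGED = (
    '<sentence>'
    '<noun-phrase word-sense="time0">time</noun-phrase> '
    '<verb-phrase><verb word-sense="fly1">flies</verb> '
    '<adjective-verb-phrase>'
    '<adjective-verb word-sense="like0">like</adjective-verb> '
    '<noun-phrase>an <noun word-sense="arrow0">arrow</noun></noun-phrase>'
    '</adjective-verb-phrase></verb-phrase>.'
    '</sentence>'
)


def word_senses(xml_text):
    """List (word, sense) pairs for every element carrying a word-sense attribute."""
    root = ET.fromstring(xml_text)
    return [(el.text, el.get("word-sense"))
            for el in root.iter() if el.get("word-sense") is not None]


def plain_text(xml_text):
    """Recover the untagged sentence by concatenating all text in document order."""
    return "".join(ET.fromstring(xml_text).itertext())


print(word_senses(TAGGED))
print(plain_text(TAGGED))
```

The sense labels let a program tell the noun "time" from its other senses without re-analyzing the sentence.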

[0025] As shown in FIG. 3, the syntactic structure of a document in this embodiment can be displayed in a window 101 of the display unit 30. In the window 101, the lexical elements are displayed in the right half 103 and the internal structure of the sentence in the left half 102.

[0026] The window 101 shows part of the following document, whose internal structure is described by tagging: 「Ａ氏のＢ会が終わったＣ市で、一部の大衆紙と一般紙がその写真報道を自主規制する方針を紙面で明らかにした。」 ("In City C, where Mr. A's B meeting had ended, some popular newspapers and general newspapers announced in their pages a policy of voluntarily restricting their photo coverage of it."). An example of tagging for this document follows.

[0027]
<document><sentence>
<adjective-verb-phrase relation="position"><noun-phrase><adjective-verb-phrase place="Ｃ市"><adjective-verb-phrase relation="subject"><noun-phrase identifier="Ｂ会"><adjective-verb-phrase relation="affiliation"><person-name identifier="Ａ氏">Ａ氏</person-name>の</adjective-verb-phrase><organization-name identifier="Ｂ会">Ｂ会</organization-name></noun-phrase>が</adjective-verb-phrase>終わった</adjective-verb-phrase><place-name identifier="Ｃ市">Ｃ市</place-name></noun-phrase>で、</adjective-verb-phrase>
<adjective-verb-phrase relation="subject"><noun-phrase identifier="press" syntax="parallel"><noun-phrase><adjective-verb-phrase>一部の</adjective-verb-phrase>大衆紙</noun-phrase>と<noun>一般紙</noun></noun-phrase>が</adjective-verb-phrase>
<adjective-verb-phrase relation="object"><adjective-verb-phrase relation="content" subject="press"><adjective-verb-phrase relation="object"><noun-phrase><adjective-verb-phrase><noun coreference="Ｂ会">そ</noun>の</adjective-verb-phrase>写真報道</noun-phrase>を</adjective-verb-phrase>自主規制する</adjective-verb-phrase>方針を</adjective-verb-phrase>
<adjective-verb-phrase relation="position">紙面で</adjective-verb-phrase>明らかにした。</sentence></document>

[0028] In this document, "some popular newspapers and general newspapers" (一部の大衆紙と一般紙) is marked as parallel by the attribute syntax="parallel". Parallel elements are defined as elements that share a dependency relation. When nothing in particular is specified, <noun-phrase relation=x><noun>A</noun><noun>B</noun></noun-phrase>, for example, indicates that A depends on B. Here relation=x denotes a relation attribute.

[0029] A relation attribute describes syntactic, semantic, and rhetorical relations between elements. Grammatical functions such as subject, object, and indirect object; thematic roles such as agent, patient, and beneficiary; and rhetorical relations such as reason and result are described by relation attributes. In this embodiment, relation attributes are described for comparatively simple grammatical functions such as subject, object, and indirect object.

[0030] In this document, the attributes of proper nouns such as "Mr. A" (Ａ氏), "B meeting" (Ｂ会), and "City C" (Ｃ市) are also described by tags such as place name, person name, and organization name. The words given these tags are proper nouns.

[0031] The operation of the document processing apparatus according to the embodiment is described below. The apparatus detects the degree of real interest in a document and sets priorities for other documents on the basis of the detected degree of real interest. It displays a document and detects the degree of real interest from the displayed document; the degree of real interest is detected from the user's operations on the document. For documents that have not been given a degree of real interest, a predicted degree of interest is defined on the basis of their relevance to this degree of real interest. With the predicted degree of interest, priorities can be given to documents the user has not operated on.
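One way to picture the relationship between real and predicted interest is sketched below. The formula is an assumption made purely for illustration: this part of the text does not give the calculation, so the sketch uses a relevance-weighted mean of the known real-interest scores, and the document names and numbers are invented sample data.

```python
def predicted_interest(real_interest, relevance, doc):
    """Assumed rule: relevance-weighted mean of the real interest of operated documents."""
    weighted = 0.0
    total = 0.0
    for other, score in real_interest.items():
        r = relevance.get((doc, other), 0.0)
        weighted += r * score
        total += r
    return weighted / total if total else 0.0


def prioritize(all_docs, real_interest, relevance):
    """Order documents by real interest where known, predicted interest otherwise."""
    scores = {}
    for d in all_docs:
        if d in real_interest:
            scores[d] = real_interest[d]
        else:
            scores[d] = predicted_interest(real_interest, relevance, d)
    return sorted(all_docs, key=lambda d: scores[d], reverse=True)


# Hypothetical data: the user operated on documents "a" and "b" only.
real = {"a": 0.8, "b": 0.2}
rel = {("c", "a"): 0.9, ("c", "b"): 0.1, ("d", "a"): 0.1, ("d", "b"): 0.9}
print(prioritize(["a", "b", "c", "d"], real, rel))
```

Under this assumed rule, a document closely related to something the user showed interest in rises in priority even though the user never opened it.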

[0032] Before the degree of real interest is described, manual classification and automatic classification of documents are described. That is, the operation of the document processing apparatus is described in the following order: (1) manual classification of documents, (2) automatic classification of documents, (3) degree of real interest and predicted degree of interest.

[0033] In outline, the description runs as follows. (1) Manual classification of documents covers the operation in which the document processing apparatus receives documents sent from outside and the user classifies them manually. This manual classification creates a classification model for classifying documents. (2) Automatic classification of documents covers the operation of classifying documents on the basis of the classification model created by manual classification, using the degree of relevance between documents and categories. (3) Degree of real interest and predicted degree of interest covers processing based on the degree of real interest, which is detected from the user's operations, and on the predicted degree of interest, which is obtained from the degree of real interest and the degree of relevance between documents.

[0034] (1) Manual classification of documents

In this embodiment, no classification model exists in the initial state. To create one, documents sent from outside must first be classified manually. This manual classification operation of the document processing apparatus is described with reference to FIG. 4.

[0035] In step S11 of FIG. 4, the receiving unit 21 of the document processing apparatus receives a plurality of documents sent, for example, over a communication line. The receiving unit 21 passes the received documents to the main body 10 of the apparatus.

[0036] In step S12, the control unit 11 of the document processing apparatus extracts the features of the documents sent from the receiving unit 21 and creates feature information, that is, an index, for each document. The control unit 11 stores the received documents and the created indexes in, for example, the RAM 14. An index contains proper nouns, word senses of words other than proper nouns, and other items characteristic of the document.

[0037] A specific example of an index follows.

[0038]
<index date="AAAA/BB/CC" time="DD:EE:FF" document-address="1234">
<user-operation-history max-summary-size="100"><selection number-of-elements="10">PictureTel</selection> ... </user-operation-history>
<summary>No mention of the scale of the tax cut: Prime Minister X's press conference</summary>
<word word-sense="0003" central-activation-value="140.6">did not mention</word>
<word word-sense="0105" identifier="X" central-activation-value="67.2">Prime Minister</word>
<person-name identifier="X"><word word-sense="6103" central-activation-value="150.2">Prime Minister X</word></person-name>
<word word-sense="5301" central-activation-value="120.6">requested</word>
<word word-sense="2350" identifier="X" central-activation-value="31.4">Prime Minister</word>
<word word-sense="9582" central-activation-value="182.3">emphasized</word>
<word word-sense="2595" central-activation-value="93.6">mention</word>
<word word-sense="9472" central-activation-value="12.0">gave advance notice</word>
<word word-sense="4934" central-activation-value="46.7">did not mention</word>
<word word-sense="0178" central-activation-value="175.7">explained</word>
<word word-sense="7248" identifier="X" central-activation-value="130.6">I</word>
<word word-sense="3684" identifier="X" central-activation-value="121.9">Prime Minister</word>
<word word-sense="1824" central-activation-value="144.4">appealed</word>
<word word-sense="7289" central-activation-value="176.8">showed</word>
</index>

[0039] In this index, <index> and </index> mark the start and end of the index, date and time give the date and time at which the index was created, and <summary> and </summary> mark the start and end of the summary of the content of the index. <word> and </word> mark the start and end of a word. The attribute word-sense="0003" indicates the third word sense, and the other values are read in the same way. Because the same word can have several meanings, a number is assigned in advance to each word sense to tell them apart. One or more word senses therefore exist for the same word.

[0040] <user-operation-history> and </user-operation-history> mark the start and end of the user's operation history, and <selection> and </selection> mark the start and end of a selected element. The attribute max-summary-size="100" indicates that the maximum size of a summary is 100 characters, and number-of-elements="10" indicates that 10 elements were selected.
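As an illustration of how such an index might be consumed, the sketch below parses a shortened index and lists its words by central activation value. The English tag and attribute names are assumed renderings, the three sample entries are taken from the example index, and ranking by value is an illustrative choice rather than something the text prescribes.

```python
import xml.etree.ElementTree as ET

# Shortened sample index with hypothetical English tag and attribute names.
INDEX_XML = """
<index date="AAAA/BB/CC" time="DD:EE:FF" document-address="1234">
  <summary>No mention of the scale of the tax cut</summary>
  <word word-sense="0003" central-activation-value="140.6">did not mention</word>
  <word word-sense="0105" identifier="X" central-activation-value="67.2">Prime Minister</word>
  <word word-sense="9582" central-activation-value="182.3">emphasized</word>
</index>
"""


def top_words(index_xml, n):
    """Return the n words with the highest central activation value."""
    root = ET.fromstring(index_xml.strip())
    words = [(w.text, w.get("word-sense"), float(w.get("central-activation-value")))
             for w in root.iter("word")]
    return sorted(words, key=lambda w: w[2], reverse=True)[:n]


for text, sense, value in top_words(INDEX_XML, 2):
    print(sense, value, text)
```

Because each word carries a numbered sense, two indexes can be compared on senses rather than on surface strings.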

[0041] In step S13 of FIG. 4, the user browses the documents displayed on the display unit 30 of the document processing apparatus, as in the display example of FIG. 5. In FIG. 5, documents not yet classified by the user are placed under "Other Topics", and their icons and titles are displayed under "Other Topics" in the first display section 303 of the window 301. From the documents displayed in this way, the control unit 11 causes the document the user wants to be displayed on the display unit 30, selecting the document to display according to the user's input to the input unit 20. The document selected by the user is displayed on the display unit 30 in a resizable window. When the whole document does not fit in this window, part of the document is displayed.

[0042] Step S13, in which the user browses documents, is included as the user requires. In the figure, step S13 is drawn as a parallelogram to indicate that it is a user operation. The same applies to later steps.

[0043] The display example shown in FIG. 5 is now described in detail. In this example, the user can freely set and change the categories into which documents are classified. Categories are set and changed manually by the user.

[0044] A specific example of the graphical user interface (GUI) used to display document classification on the display unit 30 is shown in FIG. 6. The document classification window 301 has operation buttons 302, comprising a position reset button that returns the window layout on the screen to its initial state, a browser button that calls up a browser for reading the content of a document, and an exit button for leaving the window.

[0045] The document classification window 301 also shows a first classification display section 303 displaying "Other Topics", a second classification display section 304 displaying "Business News", a third classification display section 305 displaying "Political News", and so on. Each classification display section corresponds to a category and shows the icons and titles of the documents classified into that category. When a document has no title, a one-sentence summary is shown. The sections are not fixed in size; each can be resized as desired, for example by operating the mouse of the input unit 20. The title or label of each section can also be changed freely.

[0046] The "other topics" area of the first classification display section 303 shows, for example, the titles of documents that have not yet been classified into the categories corresponding to the second classification display section 304 and the sections after it. That is, in this manual classification process, a document received by the document processing apparatus is first displayed under "other topics" in the first classification display section 303. The user then classifies the documents displayed in the first classification display section 303 into categories as follows.

[0047] In step S14 of FIG. 4, the user creates a classification model consisting of a plurality of categories for classifying the documents browsed on the display unit 30 of the document processing apparatus in step S13, and then classifies those documents into the categories of the classification model.

[0048] The classification model consists of a plurality of classification items, that is, categories, into which documents are classified. Each category consists of a category index containing the proper nouns characteristic of that category, the word senses of words other than proper nouns, the addresses of the documents belonging to the category, and so on. The category index is built from the indexes of documents, which contain proper nouns and the word senses of words other than proper nouns.

[0049] For example, the classification model shown in FIG. 7 has, in the category index of each category, columns for proper nouns, word senses of words other than proper nouns, and document addresses. The entries for the six categories are as follows.

"Sports": proper nouns "Mr. A, ..."; word senses "baseball (4546), ground (2343), ..."; document addresses "SP1, SP2, SP3, ...".
"Society": proper nouns "Mr. B, ..."; word senses "labor (3112), unique (9821), ..."; document addresses "SO1, SO2, SO3, ...".
"Computer": proper nouns "Company C, Company G, ..."; word senses "mobile (2102), ..."; document addresses "CO1, CO2, CO3, ...".
"Plant": proper nouns "Species D, ..."; word senses "sakura 1 (11111), orange 1 (9911)"; document addresses "PL1, PL2, PL3, ...".
"Art": proper nouns "Mr. E, ..."; word senses "sakura 2 (11112), orange 2 (9912)"; document addresses "AR1, AR2, AR3, ...".
"Event": proper nouns "Mr. F"; word senses "sakura 3 (11113)"; document addresses "EV1, EV2, EV3, ...".

Here, "sakura 1", "sakura 2", and "sakura 3" denote the first (11111), second (11112), and third (11113) word senses of "sakura". Likewise, "orange 1" and "orange 2" denote the first (9911) and second (9912) word senses of "orange". For example, "orange 1" is the orange plant and "orange 2" is the color orange.

[0050] When the classification model is updated, the date and time of the update are recorded in the model. In the figure, "19:56:10 on December 10, 1998" is recorded as the update date and time.
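The structure of FIG. 7 can be sketched as a small data structure. This is only an illustrative sketch: the class names `CategoryIndex` and `ClassificationModel`, the field names, and the `touch` method are assumptions made for this example, and only two of the six categories are filled in with the values quoted above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional, Set


@dataclass
class CategoryIndex:
    """One row of the classification model (hypothetical layout)."""
    proper_nouns: Set[str] = field(default_factory=set)
    # Word-sense ID -> label; the ID, not the word, distinguishes the senses.
    senses: Dict[int, str] = field(default_factory=dict)
    doc_addresses: List[str] = field(default_factory=list)


@dataclass
class ClassificationModel:
    categories: Dict[str, CategoryIndex] = field(default_factory=dict)
    updated_at: Optional[datetime] = None

    def touch(self, when: datetime) -> None:
        """Record the update date and time of the model."""
        self.updated_at = when


model = ClassificationModel()
model.categories["sports"] = CategoryIndex(
    proper_nouns={"Mr. A"},
    senses={4546: "baseball", 2343: "ground"},
    doc_addresses=["SP1", "SP2", "SP3"],
)
model.categories["plant"] = CategoryIndex(
    proper_nouns={"Species D"},
    senses={11111: "sakura 1", 9911: "orange 1"},
    doc_addresses=["PL1", "PL2", "PL3"],
)
model.touch(datetime(1998, 12, 10, 19, 56, 10))
```

Keying the senses by ID rather than by surface word is what lets "orange 1" (9911) and "orange 2" (9912) fall into different categories.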

[0051] The user creates the categories of the classification model manually in the document classification window 301, by changing or deleting the classification display section corresponding to a category or by setting up a new classification display section.

[0052] A document is classified into a category by, for example, using the mouse of the input unit 20 to drag the icon corresponding to the document's title, shown in a classification display section of the document classification window 301, to the classification display section of the desired category. The titles of the documents classified into each category are displayed in the corresponding classification display section of the document classification window 301.

[0053] In step S15, the control unit 11 of the document processing apparatus creates the classification model from the categories created in step S14 and the indexes of the documents that the user manually classified into those categories. That is, the control unit 11 collects the indexes of the documents classified into each category and generates the classification model.

[0054] The category index of each category consists of the proper nouns characteristic of the category, the word senses of words other than proper nouns, and the addresses of the documents classified into the category. For words other than proper nouns, the word sense is used rather than the word itself because the same word can have more than one meaning. The control unit 11 stores the classification model created in this way in, for example, the RAM 14.
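The collection of document indexes in step S15 might look like the following sketch. The function name `build_classification_model`, the dictionary layout of an index, and the sample values are assumptions made for illustration and are not taken from the specification.

```python
from collections import defaultdict


def build_classification_model(doc_indexes, assignments):
    """Merge per-document indexes into one category index per category.

    doc_indexes: document address -> {"proper_nouns": set, "senses": set}
    assignments: document address -> category chosen manually by the user
    (both layouts are hypothetical).
    """
    model = defaultdict(
        lambda: {"proper_nouns": set(), "senses": set(), "addresses": []}
    )
    for address, category in assignments.items():
        index = doc_indexes[address]
        entry = model[category]
        entry["proper_nouns"] |= index["proper_nouns"]  # proper nouns of the category
        entry["senses"] |= index["senses"]              # word-sense IDs of the category
        entry["addresses"].append(address)              # documents in the category
    return dict(model)


doc_indexes = {
    "SP1": {"proper_nouns": {"Mr. A"}, "senses": {4546}},
    "SP2": {"proper_nouns": {"Mr. A"}, "senses": {2343}},
    "PL1": {"proper_nouns": {"Species D"}, "senses": {11111, 9911}},
}
assignments = {"SP1": "sports", "SP2": "sports", "PL1": "plant"}
model = build_classification_model(doc_indexes, assignments)
```

Calling the function again after every drag operation would correspond to rebuilding the model each time the user classifies a document.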

[0055] The classification model of step S15 may also be created each time a category is created or the user performs a manual classification operation in step S14.

[0056] In step S16, the control unit 11 of the document processing apparatus registers the classification model created in step S15. The control unit 11 stores the registered classification model in, for example, the RAM 14.

[0057] (2) Automatic classification of documents
Next, the automatic classification of documents that the document processing apparatus performs on the basis of the classification model is described with reference to FIG. 8. This classification is applied to documents received after the classification model has been created by the processing shown in FIG. 4. In this example, the processing shown in FIG. 8 is described as being performed each time one document is received. However, it may instead be performed each time a predetermined number of documents has been received, or, when the user opens the screen of FIG. 6, on all documents received up to that point.

[0058] In step S21, the receiving unit 21 of the document processing apparatus receives a document from outside. Reception of a document was described for step S11, so the description is omitted here.

[0059] The process proceeds to step S22, in which the control unit 11 of the document processing apparatus reads out the document stored in the RAM 14 in step S21 and creates an index for it. Index creation is described later.

[0060] In step S23, the control unit 11 of the document processing apparatus automatically classifies each indexed document into one of the categories of the classification model on the basis of that model, and stores the classification result in, for example, the RAM 14. Automatic classification is described in detail later.

[0061] In step S24, the control unit 11 of the document processing apparatus updates the classification model on the basis of the result of the automatic classification of the new document in step S23, which is stored in, for example, the RAM 14. In step S25, the control unit 11 registers the classification model updated in step S24 and stores the registered model in, for example, the RAM 14.

[0062] Next, index creation in step S12 of FIG. 4 and step S22 of FIG. 8 is described with reference to FIG. 9.

[0063] In step S31, the control unit 11 of the document processing apparatus performs spreading activation on the document received in step S11 of FIG. 4 or step S21 of FIG. 8, spreading the central activation values of the elements according to the internal structure of the document. The spreading of central activation values is described later. The control unit 11 stores the central activation value obtained for each element as a result of spreading activation in, for example, the RAM 14.

[0064] In step S32, the control unit 11 of the document processing apparatus extracts, on the basis of the central activation values obtained in step S31, the elements whose central activation value exceeds a preset threshold, and stores the extracted elements in, for example, the RAM 14.

[0065] In step S33, the control unit 11 of the document processing apparatus reads the elements extracted in step S32 from, for example, the RAM 14, takes all proper nouns out of these elements, and adds them to the index. Proper nouns are handled separately from other words because they have special properties: they have no word sense and are not listed in dictionaries. Here, a word sense is one of the several meanings that a word can have.

[0066] The control unit 11 of the document processing apparatus determines whether an element is a proper noun from the tags attached to the received document. For example, in the tagged internal structure shown in FIG. 3, "Mr. A", "Association B", and "City C" have the relational attributes "person name", "organization name", and "place name" respectively in their tags, and so are recognized as proper nouns. The control unit 11 adds the extracted proper nouns to the index and stores the result in, for example, the RAM 14.

[0067] In step S34, the control unit 11 of the document processing apparatus takes the word senses of words other than proper nouns out of the elements extracted in step S32, read from, for example, the RAM 14, adds them to the index, and stores the result in the RAM 14.
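Steps S32 to S34 can be illustrated with the following sketch. The element and word layout, the function name `build_index`, and the removal of duplicates are assumptions made for this example. The three tag names follow the relational attributes mentioned for FIG. 3.

```python
# Relational attributes that mark a word as a proper noun.
PROPER_NOUN_TAGS = {"person name", "organization name", "place name"}


def build_index(elements, threshold, address):
    """Build a document index from the elements above the activation threshold.

    Each element is {"activation": float, "words": [...]} and each word is
    {"text": str, "tag": str or None, "sense_id": int or None}.
    This layout is hypothetical.
    """
    proper_nouns, senses = [], []
    for element in elements:
        if element["activation"] <= threshold:   # S32: skip weakly activated elements
            continue
        for word in element["words"]:
            if word["tag"] in PROPER_NOUN_TAGS:  # S33: proper nouns go in as text
                if word["text"] not in proper_nouns:
                    proper_nouns.append(word["text"])
            elif word["sense_id"] is not None:   # S34: other words go in as sense IDs
                if word["sense_id"] not in senses:
                    senses.append(word["sense_id"])
    # The document address is stored alongside the features.
    return {"address": address, "proper_nouns": proper_nouns, "senses": senses}


elements = [
    {"activation": 0.9, "words": [
        {"text": "Mr. A", "tag": "person name", "sense_id": None},
        {"text": "baseball", "tag": None, "sense_id": 4546},
    ]},
    {"activation": 0.1, "words": [
        {"text": "City C", "tag": "place name", "sense_id": None},
    ]},
]
index = build_index(elements, threshold=0.5, address="SP1")
```

In this example "City C" is left out of the index because its element does not exceed the threshold.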

[0068] In this way, the index creation procedure finds the features of a tagged document and builds an index in which those features are arranged. The features of a document are determined from the central activation values that have been spread according to the document's internal structure.

[0069] The index contains, together with the word senses and proper nouns that represent the features of the document, a document address indicating where the document is stored in the RAM 14.

[0070] Because the index contains word senses and proper nouns that represent the characteristic features of a document, it can be used to look up a desired document.

[0071] Next, spreading activation, which spreads the central activation values of the elements according to the internal structure of the document, is described with reference to FIG. 10. Spreading activation is performed in step S31 of FIG. 9, among other places. It is a process that gives a high central activation value to elements related to elements that already have a high central activation value. Because the central activation value is determined by the tagged internal structure, it is used for purposes such as extracting the features of a document.

[0072] In step S81, the control unit 11 of the document processing apparatus sets to 0 the endpoint activation value of each endpoint of the links connecting elements, for both reference links (referencing and referenced) and normal links. The control unit 11 stores these initial endpoint activation values in, for example, the RAM 14.

[0073] Elements are connected to each other as shown, for example, in FIG. 11. The figure shows elements E_i and E_j as part of the structure of elements and links that makes up a document. Elements E_i and E_j have central activation values e_i and e_j respectively and are connected by link L_ij. The endpoint of link L_ij on element E_i is T_ij, and the endpoint on element E_j is T_ji. Besides element E_j, to which it is connected by link L_ij, element E_i is connected by links L_ik, L_il, and L_im to elements E_k, E_l, and E_m (not shown). Besides element E_i, to which it is connected by link L_ji (that is, link L_ij as seen from element E_j), element E_j is connected by links L_jp, L_jq, and L_jr to elements E_p, E_q, and E_r (not shown).

[0074] In step S82, the control unit 11 of the document processing apparatus initializes the counter that counts the elements E_i making up the document. That is, it sets the count value i to 1, so that the counter refers to the first element E_1.

[0075] In step S83, the control unit 11 of the document processing apparatus performs link processing, which calculates a new central activation value for the element referred to by the counter. Link processing is described later.

[0076] In step S84, the control unit 11 of the document processing apparatus determines whether new central activation values have been calculated for all elements in the document. If they have ("YES"), the process proceeds to step S85. If they have not ("NO"), the process proceeds to step S87.

[0077] Specifically, the control unit 11 determines whether the count value i of the counter has reached the total number of elements in the document. If it has, all elements are taken to have been calculated and the process proceeds to step S85. If it has not, the calculation is taken to be unfinished and the process proceeds to step S87.

[0078] In step S87, the control unit 11 of the document processing apparatus increments the count value i by 1 to i + 1, so that the counter refers to the (i + 1)-th element E_{i+1}, that is, the next element. The process then returns to step S83, and the calculation of endpoint activation values and the subsequent steps are carried out for the next element E_{i+1}.

[0079] In step S85, the control unit 11 of the document processing apparatus calculates the average, over all elements in the document, of the change in central activation value, that is, the change of the newly calculated central activation value from the original central activation value.

[0080] The control unit 11 of the document processing apparatus reads out, for every element in the document, the original central activation value stored in, for example, the RAM 14 and the newly calculated central activation value. It divides the sum of the changes from the original to the new central activation values by the total number of elements in the document to obtain the average change over all elements, and stores this average in, for example, the RAM 14.

[0081] In step S86, the control unit 11 determines whether the average change in central activation value over all elements, calculated in step S85, is within a preset threshold. If the change is within the threshold ("YES"), the control unit 11 ends this series of steps. If it is not ("NO"), the control unit 11 returns to step S82, sets the count value i to 1, and again carries out the series of steps that calculates the central activation values of the document's elements. Each time the loop formed by this series of steps, from step S82 to step S84, is repeated, the change gradually decreases.
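The outer loop of FIG. 10 (steps S82 to S87) can be sketched as follows. This is a minimal sketch under several assumptions: the link processing of step S83 is passed in as a callable `link_step`, new values are computed from a snapshot of the previous round, the change is measured as an absolute difference, and `max_rounds` is a safety limit added for this example. The `toy_step` function is only a stand-in that converges and is not the link processing of the specification.

```python
def spread_activation(central, link_step, threshold, max_rounds=100):
    """Repeat the per-element update until the mean change is within threshold."""
    central = list(central)
    if not central:
        return central
    for _ in range(max_rounds):
        # S82-S84, S87: visit every element in turn and compute its new value.
        new = [link_step(i, central) for i in range(len(central))]
        # S85: average change in central activation value over all elements.
        mean_change = sum(abs(n - o) for n, o in zip(new, central)) / len(central)
        central = new
        # S86: stop once the average change is within the threshold.
        if mean_change <= threshold:
            break
    return central


def toy_step(i, values):
    """Stand-in update (hypothetical): move each value halfway toward the mean."""
    mean = sum(values) / len(values)
    return 0.5 * values[i] + 0.5 * mean


result = spread_activation([1.0, 0.0], toy_step, threshold=1e-3)
```

With this stand-in, both values approach 0.5 and the average change halves on every round, so the loop ends after a few rounds.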

[0082] Next, the link processing performed in step S83 of FIG. 10 is described with reference to FIG. 12. The processing is described here for a single element E_i, but during the spreading of central activation values it is performed for every element.

[0083] In step S51, the control unit 11 of the document processing apparatus initializes the counter that counts the links having one end connected to element E_i. That is, it sets the count value j to 1, so that the counter refers to the first link L_i1 connected to element E_i.

[0084] In step S52, the control unit 11 of the document processing apparatus refers to the relational-attribute tag of link L_ij, which connects elements E_i and E_j, to determine whether it is a normal link. If L_ij is a normal link ("YES"), the process proceeds to step S53. If L_ij is a reference link ("NO"), the process proceeds to step S54.

[0085] In step S53, the control unit 11 of the document processing apparatus calculates a new endpoint activation value for endpoint T_ij, at which element E_i is connected to the normal link L_ij.

[0086] Here, the determination in step S52 has established that link L_ij is a normal link. The endpoint activation value t_ij of endpoint T_ij, at which element E_i is connected to the normal link L_ij, is obtained by adding the endpoint activation values t_jp, t_jq, and t_jr of all endpoints T_jp, T_jq, and T_jr of element E_j that are connected to links other than L_ij to the central activation value e_j of element E_j, and dividing the sum by the total number of elements in the document. That is, t_ij = (t_jp + t_jq + t_jr + e_j) / N, where N denotes the total number of elements.

[0087] The control unit 11 of the document processing apparatus reads the required endpoint activation values and central activation value from, for example, the RAM 14. From these values it calculates the new endpoint activation value of the endpoint connected to the normal link as described above, and stores the result in, for example, the RAM 14.

[0088] In step S54, the control unit 11 of the document processing apparatus calculates the endpoint activation value of endpoint T_ij, at which element E_i is connected to the reference link.

[0089] Here, the determination in step S52 has established that link L_ij is a reference link. The new endpoint activation value t_ij of endpoint T_ij, at which element E_i is connected to the reference link L_ij, is obtained by adding the endpoint activation values t_jp, t_jq, and t_jr of all endpoints T_jp, T_jq, and T_jr of element E_j that are connected to links other than L_ij to the central activation value e_j of element E_j. That is, t_ij = t_jp + t_jq + t_jr + e_j.

[0090] The control unit 11 of the document processing apparatus reads the required endpoint activation values and central activation value from those stored in, for example, the RAM 14. Using these values, it calculates the new endpoint activation value of the endpoint connected to the reference link as described above, and stores the result in, for example, the RAM 14.

[0091] The normal-link processing of step S53 and the reference-link processing of step S54 are performed for every link L_ij connected to the element E_i referred to by the count value i, as shown by the loop from step S52 to step S55.

[0092] In step S55, the control unit 11 of the document processing apparatus determines whether endpoint activation values have been calculated for all links connected to element E_i. If they have ("YES"), the process proceeds to step S56. If they have not ("NO"), the process proceeds to step S57.

[0093] In step S56, since step S55 has determined that the endpoint activation values t_ij have been obtained for all links L_ij of element E_i, the control unit 11 of the document processing apparatus updates the central activation value e_i of element E_i.

[0094] The new, updated central activation value of element E_i is the sum of its current central activation value e_i and the new endpoint activation values of all its endpoints: e_i' = e_i + Σ_j t_ij'. Here, the prime (') denotes a new value.

[0095] The control unit 11 of the document processing apparatus reads the required endpoint activation values from the endpoint and central activation values stored in, for example, the RAM 14. It performs the calculation described above to obtain the central activation value e_i of element E_i, and stores the new central activation value e_i in, for example, the RAM 14.
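The link processing of FIG. 12 for one element can be sketched as follows. The function name `link_process`, the dictionary-based graph layout, and the sample values are assumptions made for illustration. The arithmetic follows the description above: for a normal link the sum is divided by the total number of elements, for a reference link it is not.

```python
def link_process(i, central, endpoint, links, n_elements):
    """Recompute the endpoint activation values of element i and return e_i'.

    central:  list of central activation values e
    endpoint: dict (a, b) -> activation of endpoint T_ab; missing keys count
              as 0, matching the initialization in step S81
    links:    dict element -> list of (neighbor, kind), where kind is
              "normal" or "reference" (hypothetical layout)
    """
    total = 0.0
    for j, kind in links[i]:                      # S51, S55, S57: each link L_ij
        # Endpoints of E_j on links other than L_ij, plus e_j.
        others = sum(endpoint.get((j, k), 0.0) for k, _ in links[j] if k != i)
        value = others + central[j]
        if kind == "normal":                      # S52 -> S53
            value /= n_elements
        endpoint[(i, j)] = value                  # S53 / S54: new t_ij
        total += value
    return central[i] + total                     # S56: e_i' = e_i + sum of t_ij'


central = [1.0, 2.0, 4.0, 0.0]
links = {
    0: [(1, "normal"), (2, "reference")],
    1: [(0, "normal"), (3, "normal")],
    2: [(0, "reference")],
    3: [(1, "normal")],
}
endpoint = {(1, 3): 2.0}
new_e0 = link_process(0, central, endpoint, links, n_elements=4)
```

In this example t_01 = (2.0 + 2.0) / 4 = 1.0 over the normal link and t_02 = 0 + 4.0 = 4.0 over the reference link, so the new central activation value of element 0 is 1.0 + 1.0 + 4.0 = 6.0.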

[0096] Next, the automatic classification in step S23 of FIG. 8 is described with reference to FIG. 13.

[0097] In step S71, the control unit 11 of the document processing apparatus takes the set of proper nouns contained in category C_i of the classification model and the set of proper nouns among the words extracted from the document received in step S21 and entered in its index, and lets P(C_i) be the number of proper nouns that the two sets have in common. The control unit 11 stores the number P(C_i) in, for example, the RAM 14.

[0098] In step S72, the control unit 11 of the document processing apparatus looks up the relatedness between every word sense in the document's index and every word sense in each category C_i in the table of inter-sense relatedness shown in FIG. 15, described later, and calculates the sum R(C_i) of these relatedness values. That is, for the words other than proper nouns in the classification model, the control unit 11 calculates the total R(C_i) over all pairs of word senses. It then stores R(C_i) in, for example, the RAM 14.

[0099] In step S73, the control unit 11 of the document processing apparatus defines the document-category relevance of the document to category C_i as Rel(C_i) = m_1 P(C_i) + n_1 R(C_i). The coefficients m_1 and n_1 are constants that express how much each term contributes to the document-category relevance. The control unit 11 reads the number of common proper nouns P(C_i) calculated in step S71 and the sum of inter-sense relatedness R(C_i) calculated in step S72 from, for example, the RAM 14, substitutes them into the above formula, and calculates Rel(C_i). The coefficients can be set, for example, to m_1 = 10 and n_1 = 1. The control unit 11 stores the calculated document-category relevance Rel(C_i) in, for example, the RAM 14.

【０１００】 The values of the coefficients m_1 and n_1 can also be estimated by statistical methods. That is, given the document-category relevance Rel(C_i) for a plurality of pairs of coefficients m and n, the control unit 11 can obtain the coefficients by optimization.

【０１０１】 In step S74, the control unit 11 of the document processing apparatus classifies the document into category C_i when the document-category relevance Rel(C_i) for that category is the maximum and its value exceeds a certain threshold. That is, the control unit 11 calculates the document-category relevance for each of a plurality of categories and, when the largest relevance exceeds the threshold, classifies the document into the category C_i that has the largest relevance. When the largest document-category relevance does not exceed the threshold, the document is not classified.

【０１０２】 Next, the calculation of the inter-sense relevance used in step S72 of FIG. 13 will be described with reference to FIG. 14. The processing shown in FIG. 14 needs to be performed only once, before the processing shown in FIG. 4 is carried out.

【０１０３】 In step S61, the control unit 11 of the document processing apparatus creates a network of word senses from the electronic dictionary, using the explanations of the senses of the words in it. That is, the network is created from the reference relationships between the explanation of each word sense in the dictionary and the word senses that appear in that explanation. As a result, a tree-shaped network of word senses with the dictionary as its top-level vertex is constructed. The internal structure of the network is described by tagging as described above. The control unit 11 of the document processing apparatus reads the word senses and their explanations in order from the electronic dictionary stored in, for example, the RAM 14, and creates the network. The control unit 11 stores the network of word senses created in this way in, for example, the RAM 14.

【０１０４】 Besides being created by the control unit 11 of the document processing apparatus from a dictionary, the network can also be obtained by receiving it from outside through the receiving unit 21 or by reproducing it from the recording medium 32 with the recording/reproducing unit 31. The dictionary is likewise obtained by receiving it from outside through the receiving unit 21 or by reproducing it from the recording medium 32 with the recording/reproducing unit 31.

【０１０５】 In step S62, spreading activation of the central activation values corresponding to the element of each word sense is performed on the network of word senses created in step S61. Through this spreading activation, the central activation value corresponding to each word sense is assigned according to the internal structure given by the tagging of the dictionary. The spreading of central activation values is described further below.

【０１０６】 In step S63, one word sense s_i that forms part of the network of word senses created in step S61 is selected. In step S64, the initial value of the central activation value e_i of the vocabulary element E_i corresponding to this word sense s_i is changed, and the resulting difference Δe_i in the central activation value is calculated.

【０１０７】 In step S65, the difference Δe_j in the central activation value e_j of the element E_j corresponding to another word sense s_j is obtained in response to the difference Δe_i in the central activation value e_i of the element E_i in step S64. In step S66, the quotient Δe_j / Δe_i, obtained by dividing the difference Δe_j found in step S65 by the difference Δe_i found in step S64, is taken as the inter-sense relevance of word sense s_i to word sense s_j.

【０１０８】 In step S67, it is determined whether the calculation of inter-sense relevance has been completed for all pairs of one word sense s_i and another word sense s_j. When the calculation has been completed for all pairs of word senses, the result is "YES" and this series of processing ends. When it has not been completed for all pairs, the result is "NO", the processing returns to step S63, and the calculation continues for the pairs whose inter-sense relevance has not yet been calculated.

【０１０９】 In the loop from step S63 to step S67, the control unit 11 of the document processing apparatus reads the necessary values in order from, for example, the RAM 14 and calculates the inter-sense relevance as described above. The control unit 11 stores the calculated inter-sense relevance values in order in, for example, the RAM 14.

【０１１０】 The inter-sense relevance calculated in this way is defined between each pair of word senses, as shown in FIG. 15. In this table the inter-sense relevance is normalized to take values from 0 to 1. The table shows the mutual inter-sense relevance among "computer", "television", and "VTR". The relevance between "computer" and "television" is 0.55, between "computer" and "VTR" is 0.25, and between "television" and "VTR" is 0.60.

【０１１１】 (3) Actual interest level and predicted interest level

Next, the details of step S13 in FIG. 4 will be described with reference to FIG. 16. The actual interest level is detected by performing this processing.

【０１１２】 In step S101, the user selects a desired document from the document classification window 301 shown in FIG. 6. For example, the user uses the mouse of the input unit 20 to select the icon corresponding to the title of a document displayed in the classification display section of the document classification window 301. Selecting the "browser" button among the operation buttons 302 then advances the processing to the display step, step S102.

【０１１３】 In step S102, the control unit 11 of the document processing apparatus reads the document selected by the user in step S101 from, for example, the RAM 14. The control unit 11 displays the read document in the document display unit 53 of the window 51 on the display unit 30. As described above, when the whole document cannot be displayed in the document display unit 53 of the window 51, part of the document is displayed.

【０１１４】 In step S103, the user reads the document displayed in the document display unit 53 of the window 51 in step S102 and creates a summary of it. That is, the user reads the document in the document display unit 53 of the window 51 displayed in step S102. By selecting the "summarize" button among the operation buttons 56 of the window 51, the user also causes a summary of the document displayed in the document display unit 53 to be displayed in the summary display unit 54.

【０１１５】 The procedure by which a user operation raises the importance of an element that the user has selected in the document displayed in the document display unit 53, when a summary is created and displayed in the summary display unit 54, will now be described with reference to the flowchart shown in FIG. 17.

【０１１６】 In the first step, S91, the control unit 11 determines whether an element in the document has been selected by the user. This determination is based on a selection made through the graphical user interface (GUI) shown in FIG. 18, which accepts input from the user.

【０１１７】 The window 51 has a file name display unit 52 that displays the file name of a document, a document display unit 53 that displays the document whose file name is shown in the file name display unit 52, and a summary display unit 54 that displays a summary of the document shown in the document display unit 53. The document display unit 53 displays all or part of the document whose file name or opening portion is shown in the file name display unit 52. When only part of the document is displayed in the document display unit 53, the whole document can be browsed in sequence by, for example, scrolling the document in the document display unit 53. The summary display unit 54 displays a summary of the document shown in the document display unit 53, produced by the processing described later so as to match the size of the summary display unit 54. In FIG. 18 the summary display unit 54 is blank because no summary has been created yet. The sizes of the document display unit 53 and the summary display unit 54 can each be changed. The document handled in the window 51 is, for example, one that has been received by the receiving unit 21 of the document processing apparatus and recorded in the recording/reproducing unit 31 or the RAM 14.

【０１１８】 The window 51 also has a keyword input unit 55 for entering a keyword and a button unit 56 containing a plurality of buttons. Entering a keyword in the keyword input unit 55 raises the importance of those words displayed in the document display unit 53 that are highly related to the keyword. The button unit 56 has an "Undo" button, which reverts the result of an executed operation, and a "summarize" button, which executes the processing that summarizes the text displayed in the document display unit 53 and displays the result in the summary display unit 54. When the "summarize" button is selected after, for example, the size of the summary display unit 54 has been changed, a summary of the document displayed in the document display unit 53 is generated to match the new size of the summary display unit 54, and the generated summary is displayed in the summary display unit 54.

【０１１９】 In step S91 of FIG. 17, the control unit 11 determines whether an element in the text displayed in the document display unit 53 of the window 51, shown on the display unit 30 of the document processing apparatus, has been selected by the user. An element in the document display unit 53 can be selected through the input unit 20 of the document processing apparatus by using a pointing device to operate a cursor that is displayed on the display unit 30 and moves with the pointing device. For example, when a mouse is used as the pointing device, the user moves the cursor to the desired element in the document display unit 53 and clicks the mouse to select that element. When an element is selected in the document display unit 53, the selected element is, for example, highlighted so that it is clearly indicated. In FIG. 19, the vocabulary element "mainframe" 57, which is the smallest element and has been selected, is highlighted in the document display unit 53 of the window 51. The summary display unit 54 is blank because no summary has been created yet. When an element is selected in this way, the control unit 11 answers "YES" and advances the processing to the next step, S92. When no element is selected, for example when there is no input within a predetermined time or when the user clicks a part of the document display unit 53 other than where text is displayed, the control unit 11 answers "NO", returns the processing to step S91, and waits for an element to be entered. For convenience, the following description assumes that a mouse is used as the pointing device of the input unit 20.

【０１２０】 In step S92, the control unit 11 of the document processing apparatus determines whether the element selected in step S91 is one that was previously selected by a mouse click. When the element was previously selected by a mouse click, the control unit 11 answers "YES" and advances the processing to step S93. When it was not, the control unit 11 answers "NO" and advances the processing to step S94.

【０１２１】 In step S93, the control unit 11 of the document processing apparatus determines whether the currently selected element is a sentence element. When the level is the sentence element, the control unit 11 answers "YES" and returns the processing to step S91. When the level is not the sentence element, the control unit 11 answers "NO" and advances the processing to step S95.

【０１２２】 In step S94, the control unit 11 of the document processing apparatus sets the level to the vocabulary element, which is the smallest element of the document and the lowest element in the internal structure given by the tagging of the document. The control unit 11 then returns the processing to step S91.

【０１２３】 In step S95, the control unit 11 of the document processing apparatus increases the level by one. When the level is increased by one in this way, for the vocabulary element "mainframe" 57 selected in step S91, the next larger element that contains it, "Big mainframe computers" 59, becomes selected and is highlighted, as shown in FIG. 20. At the same time, the control unit 11 raises the weight of the selected higher-level element, that is, its central activation value, above that of unselected elements. The control unit 11 then returns the processing to step S91.

【０１２４】 When the "summarize" button displayed in the button unit 56 of the window 51 is selected by a mouse click, a summary of the text displayed in the document display unit 53 is displayed in the summary display unit 54. When the "summarize" button is selected, the control unit 11 leaves the series of steps shown in FIG. 17 by an interrupt and starts the processing that creates the summary. The summary is generated from the document displayed in the document display unit 53 so as to fill the area of the summary display unit 54, matching its size. As shown in FIG. 21, the summary displayed in the summary display unit 54 contains the element "Big mainframe computers" 60, which corresponds to the element "Big mainframe computers" 59 highlighted in the document display unit 53. In this way, selecting a desired element in the document display unit 53 of the window 51 and raising its importance increases the likelihood that the element is included in the summary. The details of summary generation are described further below.

【０１２５】 In the window 51 shown in FIG. 18, an element in the document displayed in the document display unit 53 can be selected not only by a mouse click but also by entering a keyword in the keyword input unit 55. The control unit 11 raises the importance of the elements related to the keyword entered in the keyword input unit 55. The relevance between the keyword and an element is obtained by, for example, referring to a table recorded in the ROM 15. This lookup is performed by using the tagging to refer to the elements that contain the keyword.

【０１２６】 In step S104 of FIG. 16, the control unit 11 of the document processing apparatus calculates the user's actual interest level in the document. The actual interest level is calculated from the user's operations in step S103 on the document displayed in the window 51.

【０１２７】 The actual interest level and the predicted interest level used in this embodiment will now be explained. The actual interest level is the value calculated in step S104: the user's actual level of interest in the document the user has operated on, as detected from the user's operations. The predicted interest level, in contrast, is a prediction of the user's level of interest in a document. The predicted interest level is predicted from, for example, the actual interest level.

【０１２８】 In step S105, the control unit 11 records the user's operation history in the index. In the specific example of the index given above, the user's operation history was illustrated as <user operation history max summary size="100"><selection number of elements="10">PictureTel</selection> ... </user operation history>. In step S105, the control unit 11 updates the operation history, such as the maximum summary size, the selected elements, and the number of selected elements. The control unit 11 stores the updated index in, for example, the RAM 14.

【０１２９】 The index can also contain the actual interest level of the document. For example, the actual interest level for each document can be included in the index category by category. In that case, the actual interest level contained in the index for the document is itself also updated in step S105.

【０１３０】 Next, the user's operations in step S103 of FIG. 16 will be described with reference to FIGS. 22, 23, and 24.

【０１３１】 A document whose title is displayed in the document classification window 301 can be displayed on the display unit 30 by, for example, selecting it on the display unit 30 with the mouse of the input unit 20. A specific example of the document display window that displays a document in this way was shown in FIG. 18, so its description is omitted here.

【０１３２】 Next, an example of the summary creation processing that includes more detailed control than that shown in FIG. 4 will be described in detail with reference to the flowchart shown in FIG. 22. This series of steps is started by turning on the "summarize" button 103.

【０１３３】 The processing that creates a summary from a document is executed on the basis of the internal structure given by the tagging of the document. As described above, the size of the display area 130 that displays the summary in the window 100 can be changed. When a window 101 is newly drawn in the window 100 on the display unit 30, or the size of the display area 130 is changed, and the execute button 103 is operated, the control unit 11 of the document processing apparatus executes processing that creates a summary from the document displayed in the display area 120 of the window 100 so that it fits the display area 130.

【０１３４】 In the first step of FIG. 22, S120, the control unit 11 of the document processing apparatus performs spreading activation. In this embodiment, the document is summarized by adopting the central activation values obtained by spreading activation as the measure of importance. That is, for a document given an internal structure by tagging, performing the processing called spreading activation assigns each element a central activation value that reflects that internal structure. Spreading activation is processing that gives a high central activation value to elements related to elements with high central activation values. More precisely, it is a computation on central activation values in which the value becomes equal between an anaphoric (coreference) expression and its antecedent and decays elsewhere. Because the central activation value is determined according to the internal structure given by tagging, it can be used for document analysis that takes that structure into account.

【０１３５】 In step S121, the control unit 11 of the document processing apparatus sets w_s to the size of the summary display unit 54 of the window 51 displayed on the display unit 30, specifically the maximum number of characters that can be displayed there. The control unit 11 also initializes s, which stores the character string of the summary, by setting the initial value s_0 = "". The control unit 11 records the maximum number of displayable characters w_s and the initial value s_0 of the summary string set in this way in, for example, the RAM 14.

【０１３６】 In step S122, the control unit 11 of the document processing apparatus sets to zero the count value i of the counter that counts the successive creation of summary skeletons. That is, the control unit 11 sets i = 0. The control unit 11 records the count value i set in this way in, for example, the RAM 14.

【０１３７】 In step S123, for the count value i of the counter, the control unit 11 of the document processing apparatus extracts from the text the skeleton of the sentence with the i-th highest average central activation value. The average central activation value is the average of the central activation values of the elements that make up one sentence. The control unit 11 reads s_{i-1}, which stores the summary, from, for example, the RAM 14, appends the character string of the extracted sentence skeleton to s_{i-1}, and takes the result as s_i. The control unit 11 then records the s_i obtained in this way in, for example, the RAM 14. At the same time, the control unit 11 creates a list l_i of the elements not included in the sentence skeleton, ordered by central activation value, and records the list l_i in, for example, the RAM 14.

【０１３８】 That is, in step S123 the summarization algorithm uses the result of spreading activation to select sentences in descending order of average central activation value and extracts the skeleton of each selected sentence. The skeleton of a sentence consists of the essential elements extracted from the sentence. The elements that can be essential are the head of an element; elements that have one of the relation attributes subject, object, indirect object, possessor, cause, condition, or comparison; and, when a coordinate structure is an essential element, the elements directly contained in it. The essential elements of the sentence are joined to generate the sentence skeleton, which is added to the summary.

【０１３９】 In step S124, the control unit 11 of the document processing apparatus determines whether the length of s_i is greater than the maximum number of characters w_s of the summary display unit 54 of the window 51. When the length of s_i is greater than w_s, the control unit 11 answers "YES" and ends this series of processing. When the length of s_i is not greater than w_s, the control unit 11 answers "NO" and advances the processing to step S125. That is, step S124 ends the processing when the summary has reached the specified amount. When there is still room, the sentence with the next highest central activation value is compared with the central activation values of the omitted elements, and whichever is higher is added to the summary.

[0140] In step S129, because the length of s_i was determined in step S124 to exceed the maximum number of characters w_s, the control unit 11 of the document processing apparatus sets the summary to s_{i-1}. In this case the summary does not fit in the window, so s_i = s_0 = "" is output, and no summary is displayed. The control unit 11 then ends this series of steps.

[0141] In step S125, the control unit 11 of the document processing apparatus compares the central activity value of the sentence with the (i+1)-th highest average central activity value against the central activity value of the highest-valued element in the list l_i created in step S123. If the sentence's value is higher, the control unit 11 takes the "YES" branch and advances to step S127. If it is not higher, the control unit 11 takes the "NO" branch and advances to step S126.

[0142] In step S126, the control unit 11 of the document processing apparatus increments the count value i of the counter by 1 and returns the process to step S123.

[0143] In step S127, the control unit 11 of the document processing apparatus adds the element e with the highest central activity value in the list l_i to s_i to generate ss_i, and deletes the element e from l_i. The control unit 11 then records the generated ss_i in, for example, the RAM 14.

[0144] In step S128, the control unit 11 of the document processing apparatus determines whether the length of ss_i exceeds the maximum number of characters w_s of the summary display unit 54 of the window 51. If the length of ss_i exceeds w_s, the control unit 11 takes the "YES" branch and ends this series of steps via step S130. If it does not, the control unit 11 takes the "NO" branch and returns the process to step S125.

[0145] In step S130, because the length of ss_i was determined in step S128 to exceed the maximum number of characters w_s, the control unit 11 of the document processing apparatus sets the summary to s_i. A summary that does not exceed the maximum number of characters w_s is thereby generated. The control unit 11 then ends this series of steps.
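The loop of steps S123 to S130 can be read as a greedy procedure under a character budget. The Python sketch below is one simplified interpretation and not the flowchart of FIG. 22 itself: it assumes each sentence is supplied as a skeleton string with a list of omitted elements, applies the rule stated for step S124 (compare the next sentence with the best omitted element and add whichever is higher), and appends text by plain concatenation instead of restoring an element to its place in the sentence. The names `Sentence`, `build_summary`, and `max_chars` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Sentence:
    skeleton: str        # essential elements joined into text
    activity: float      # average central activity value of the sentence
    omitted: list = field(default_factory=list)  # (activity, text) pairs


def build_summary(sentences, max_chars):
    """Greedy summary that never exceeds max_chars (w_s in the text)."""
    queue = sorted(sentences, key=lambda s: s.activity, reverse=True)
    pool = []      # omitted elements of sentences already in the summary
    summary = ""   # s_0 is the empty string
    while queue or pool:
        best_elem = max(pool, key=lambda e: e[0]) if pool else None
        # Take the next sentence when it outranks the best omitted element.
        take_sentence = bool(queue) and (
            best_elem is None or queue[0].activity > best_elem[0]
        )
        if take_sentence:
            sent = queue.pop(0)
            candidate = summary + sent.skeleton
        else:
            pool.remove(best_elem)
            candidate = summary + best_elem[1]
        if len(candidate) > max_chars:
            return summary  # keep the last summary that still fit
        summary = candidate
        if take_sentence:
            pool.extend(sent.omitted)
    return summary
```

When even the first skeleton is longer than `max_chars`, the function returns the empty string, which corresponds to the case in step S129 where no summary is displayed.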

[0146] The window 51 also has a keyword input unit 55 for entering keywords and a button unit 56 having a plurality of buttons. When a keyword is entered in the keyword input unit 55, the actual interest level is raised for those words displayed in the document display unit 53 that have a high degree of association between meanings (described later) with the keyword. The button unit 56 includes an "Undo" button, which reverts the result of an operation, and a "Summarize" button, which summarizes the text displayed in the document display unit 53 and displays the result in the summary display unit 54. Selecting the "Summarize" button, for example after the size of the summary display unit 54 has been changed, generates a summary of the document displayed in the document display unit 53 that matches the new size of the summary display unit 54, and the generated summary is displayed in the summary display unit 54.

[0147] The user's actual interest level in a document is calculated from several factors, described next. Note that these factors of the actual interest level are distinct from the elements that make up a document.

[0148] In calculating the actual interest level, the first factor A(D_i) is derived from the position of the user-designated element that appears farthest from the head of the document. The farther from the head this element lies, the more of the document the user is considered to have read, and the higher the actual interest level in that document is taken to be. Specifically, the ratio of the maximum appearance position of the selected elements to the size of the whole document is used as the first factor A(D_i) of the actual interest level. Here, D_i denotes the i-th document.

[0149] In the document display unit 53 of the window 51 shown in FIG. 23, a first element 57, a second element 58, and a third element 59 have been designated by the user and are highlighted. Of these, the third element 59, which is farthest from the head of the document, is used to calculate the actual interest level.

[0150] The second factor E(D_i) is the number of elements the user has selected from the document displayed in the document display unit 53 of the window 51 and the number of keywords the user has entered in the keyword input unit 55.

[0151] In the document display unit 53 of the window 51 shown in FIG. 23, the user has designated the first element 57, the second element 58, and the third element 59, and the keyword "AAA" has been entered in the keyword input unit 55. The number of these element and keyword inputs is used as the second factor E(D_i) of the actual interest level.

[0152] The third factor W(D_i) is the ratio of the size of the area of the summary display unit 54 to the size of the area of the document display unit 53 in the window 51. The summary is displayed to fit the area of the summary display unit 54, and the higher the user's actual interest level, the more likely the user is to want a detailed, and therefore longer, summary rather than a brief one. Accordingly, the larger this ratio, the higher the actual interest level can be taken to be.

[0153] In the window 51 shown in FIG. 24, the ratio of the maximum size of the summary display unit 54, which displays the summary, to the size of the document display unit 53, which displays the whole document, is used as the third factor W(D_i) of the actual interest level.

[0154] Based on the first factor A(D_i), the second factor E(D_i), and the third factor W(D_i), the user's actual interest level IR(D_i) in the document D_i is defined as

IR(D_i) = l_2 W(D_i) + m_2 A(D_i) + n_2 E(D_i).

Here the coefficients l_2, m_2, and n_2 are constants that express the contribution of each factor to the actual interest level. They can be set to l_2 = m_2 = 10 and n_2 = 1. The values of the coefficients can also be estimated by a statistical method. That is, when actual interest levels IR(D_i) are given for a plurality of sets of the coefficients l_2, m_2, and n_2, the control unit 11 can obtain the coefficients by optimization.
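The weighted sum for IR(D_i) can be illustrated with a short Python sketch. It is a minimal reading of paragraphs [0148] to [0154], under these assumptions: element positions and the document size are measured in the same unit (for example character offsets), E(D_i) is taken as the count of selected elements plus the count of entered keywords, and the function and parameter names are hypothetical.

```python
def actual_interest(selected_positions, doc_length, num_keywords,
                    summary_area, document_area,
                    l2=10.0, m2=10.0, n2=1.0):
    """IR(D_i) = l2 * W(D_i) + m2 * A(D_i) + n2 * E(D_i)."""
    # A: position of the selected element farthest from the head,
    #    as a fraction of the document size (0 if nothing is selected).
    a = max(selected_positions, default=0) / doc_length
    # E: number of selected elements plus number of entered keywords.
    e = len(selected_positions) + num_keywords
    # W: summary display area relative to the document display area.
    w = summary_area / document_area
    return l2 * w + m2 * a + n2 * e
```

For the situation of FIG. 23, three selected elements and one keyword give E = 4; with the default coefficients, the reading-depth ratio A and the area ratio W each contribute up to 10 when they reach 1.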

[0155] Next, the sorting of documents based on the predicted interest level, which is obtained from the actual interest level, is described with reference to FIG. 25. This sorting is performed while the browser of FIG. 6 is open.

[0156] In step S111, the control unit 11 of the document processing apparatus sets to 0 the count value C of a counter that counts the categories into which documents are classified. In step S112, the control unit 11 calculates inter-document relevance. That is, among the documents that were classified in step S23 of FIG. 8 but are still unread, the control unit 11 takes each unread document in the category indicated by the count value C and calculates its inter-document relevance to each document in that category that has already been given an actual interest level. As described above, the actual interest level is given through the user's operations. The inter-document relevance is calculated from the index described above. The calculation is described in detail later.

[0157] In step S113, the control unit 11 of the document processing apparatus calculates the predicted interest level. The predicted interest level of a document is calculated from its inter-document relevance to documents that have already been given an actual interest level. The predicted interest level is therefore calculated for documents that have not been given an actual interest level.

[0158] For an unread document in the category, the control unit 11 selects the other document in that category that has the largest of the inter-document relevance values calculated in step S112. The control unit 11 takes the actual interest level of the selected document as the predicted interest level of the unread document, and stores the predicted interest level obtained in this way in, for example, the RAM 14.
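The rule in paragraph [0158] amounts to copying the actual interest level of the single most relevant rated document. A minimal Python sketch follows; the function name, the dictionary of rated documents, and the `relevance` callback are hypothetical, and returning `None` when no rated document exists is an assumption, since the text does not cover that case.

```python
def predicted_interest(unread_id, rated, relevance):
    """Return the actual interest of the rated document most related to unread_id.

    rated:     dict mapping document id -> actual interest level
    relevance: function (doc_a, doc_b) -> inter-document relevance
    """
    if not rated:
        return None  # nothing to predict from (assumed behavior)
    nearest = max(rated, key=lambda doc_id: relevance(unread_id, doc_id))
    return rated[nearest]
```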

[0159] In step S118, the control unit 11 of the document processing apparatus branches depending on whether the predicted interest level has been calculated for all documents in the category. If the calculation has been completed for all documents in the category, the control unit 11 takes the "YES" branch and advances to step S114. Otherwise it takes the "NO" branch and returns the process to step S112.

[0160] In step S114, the control unit 11 of the document processing apparatus sorts the unread documents within each category based on the predicted interest levels calculated in step S113. As one sorting method, unread documents with a higher predicted interest level are given higher priority, and the titles are arranged so that higher-priority unread documents come nearer the head of the list of unread document titles. When there is no significant difference in priority, the more recently received document is ranked higher. The document titles are arranged in this order for each category in, for example, the classification display units 303, 304, and 305 of the document classification window 301.
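The ordering of step S114 can be sketched in Python as a two-level sort. The text does not say what counts as "no significant difference", so this sketch assumes that predicted interest levels falling into the same bucket of width `step` are treated as equal. The function name, the tuple layout, and `step` are hypothetical.

```python
from datetime import datetime


def sort_unread(docs, step=1.0):
    """Sort (title, predicted_interest, received) tuples for display.

    Higher predicted interest comes first. Documents whose predicted
    interest falls in the same bucket of width `step` are ordered by
    received date, newest first.
    """
    return sorted(
        docs,
        key=lambda d: (round(d[1] / step), d[2]),
        reverse=True,
    )
```

A smaller `step` makes fewer documents count as tied, so the received date matters less often.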

[0161] In step S115, the control unit 11 of the document processing apparatus determines whether all categories have been processed. If all categories have been processed, the control unit 11 takes the "YES" branch and advances to step S117. If not, it takes the "NO" branch and advances to step S116.

[0162] In step S116, the control unit 11 of the document processing apparatus increments the counter value C that counts the categories by 1, that is, sets C = C + 1, and returns the process to step S112. In step S117, because it was determined in step S115 that all categories have been processed, the control unit 11 displays the sorted documents. Specifically, as shown in FIG. 6, the icon and the title of each document are displayed. If a document has no title, a one-sentence summary is displayed instead. This series of steps then ends.

[0163] Next, the calculation of the inter-document relevance in step S112 of FIG. 25 is described in detail with reference to FIG. 26. The inter-document relevance is the degree of relevance between one document D_i and another document D_j.

[0164] In step S41, the control unit 11 of the document processing apparatus takes the set of proper nouns contained in the index of one document D_i and the set of proper nouns contained in the index of another document D_j already classified into the category designated in step S111 or S116 of FIG. 25, and sets P(D_i, D_j) to the number of members in the intersection of the two sets. The control unit 11 stores the number P(D_i, D_j) calculated in this way in, for example, the RAM 14.

[0165] In step S42, the control unit 11 of the document processing apparatus refers to the table of the degree of association between meanings shown in FIG. 15 and calculates the sum R(D_i, D_j) of the degrees of association between the meanings contained in the index of the unread document D_i and the meanings contained in the index of the other document D_j.

[0166] In step S42, for the words of the unread document D_i other than proper nouns, the control unit 11 of the document processing apparatus refers to the table of the degree of association between meanings and calculates the sum R(D_i, D_j) of the degrees of association between meanings with the other document D_j. The control unit 11 stores the calculated sum R(D_i, D_j) in, for example, the RAM 14.

[0167] In step S43, the control unit 11 of the document processing apparatus defines the inter-document relevance of the other document D_j to the document D_i as

Rel(D_i, D_j) = m_3 P(D_i, D_j) + n_3 R(D_i, D_j).

Here the coefficients m_3 and n_3 are constants that express the degree to which each value contributes to the inter-document relevance. The control unit 11 reads the number P(D_i, D_j) of common proper nouns calculated in step S41 and the sum R(D_i, D_j) of the degrees of association between meanings calculated in step S42 from, for example, the RAM 14, and applies them to the above equation to calculate the inter-document relevance Rel(D_i, D_j). The coefficients can be set to, for example, m_3 = 10 and n_3 = 1.
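Steps S41 to S43 can be sketched in Python as a set intersection plus a table lookup. The sketch assumes that each document index is a dictionary holding a set of proper nouns and a set of word senses, and that the table of FIG. 15 is available as a dictionary keyed by unordered sense pairs, with missing pairs counted as 0. These data layouts and all names are hypothetical.

```python
def inter_document_relevance(index_i, index_j, sense_relatedness,
                             m3=10.0, n3=1.0):
    """Rel(D_i, D_j) = m3 * P(D_i, D_j) + n3 * R(D_i, D_j).

    index_x:           dict with "proper_nouns" (set) and "senses" (set)
    sense_relatedness: dict mapping frozenset({s, t}) -> degree of
                       association between meanings s and t
    """
    # P: number of proper nouns common to both indexes (step S41).
    p = len(index_i["proper_nouns"] & index_j["proper_nouns"])
    # R: sum of the degrees of association over all sense pairs (step S42).
    r = sum(
        sense_relatedness.get(frozenset((s, t)), 0.0)
        for s in index_i["senses"]
        for t in index_j["senses"]
    )
    # Weighted sum of the two terms (step S43).
    return m3 * p + n3 * r
```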

[0168] The values of the coefficients m_3 and n_3 can also be estimated by a statistical method. That is, when inter-document relevance values Rel(D_i, D_j) are given for a plurality of pairs of the coefficients m_3 and n_3, the control unit 11 can obtain the coefficients by optimization.
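The text names only "a statistical method" and "optimization" for estimating the coefficients. As one possible illustration, the Python sketch below fits m_3 and n_3 by ordinary least squares from example triples (P, R, Rel) whose target relevance is assumed to be known, for instance from user judgments. The choice of least squares, the availability of such targets, and the function name are assumptions and are not stated in the source.

```python
def fit_relevance_coefficients(samples):
    """Least-squares fit of (m3, n3) in Rel = m3 * P + n3 * R.

    samples: iterable of (P, R, Rel) triples with known target Rel.
    Solves the 2x2 normal equations with Cramer's rule.
    """
    spp = spr = srr = spy = sry = 0.0
    for p, r, rel in samples:
        spp += p * p
        spr += p * r
        srr += r * r
        spy += p * rel
        sry += r * rel
    det = spp * srr - spr * spr
    if det == 0:
        raise ValueError("samples do not determine both coefficients")
    m3 = (spy * srr - spr * sry) / det
    n3 = (spp * sry - spr * spy) / det
    return m3, n3
```

The same approach extends to the three coefficients l_2, m_2, and n_2 of the actual interest level, at the cost of solving a 3x3 system.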

[0169] Next, the recording medium 32 that is recorded and reproduced by the recording/reproducing unit 31 of the document processing apparatus is described. The recording medium stores a document processing program for processing documents that have an internal structure, given by tagging, made up of a plurality of elements. A medium on which information can be recorded and reproduced, such as a floppy disk, is used as the recording medium 32.

[0170] The recording medium 32 holds an actual interest level detection process that detects the actual interest level in a document, and a priority setting process that sets a priority for the document based on the actual interest level detected in the actual interest level detection process. The recording medium 32 further holds a display process that displays the document and an input process that receives manual input concerning the document displayed in the display process, and the actual interest level detection process detects the actual interest level based on that input.

[0171] The present embodiment shows one example of a method of tagging documents, but the present invention is of course not limited to this tagging method. Likewise, in the present embodiment documents are sent from outside to the receiving unit 21 of the document processing apparatus, but the present invention is not limited to this. For example, the documents may be written in the ROM 13 of the document processing apparatus or read from the recording medium 32 by the recording/reproducing unit 31.

[0172] In the embodiment described above, a mouse is given as an example of a device for selecting a desired element from the document displayed on the display unit 30 of the document processing apparatus, but the present invention is of course not limited to this. Other devices, such as a tablet or a light pen, can be used to input elements in the document processing apparatus.

[0173] Furthermore, Japanese and English text are used as examples in the embodiment described above, but the present invention is of course not limited to these languages.

[0174]

[Effects of the Invention] As described above, the present invention processes electronic documents by detecting the actual interest level in each electronic document and setting priorities for the electronic documents based on the detected actual interest level. The present invention also displays an electronic document, accepts manual input concerning the displayed electronic document, and detects the actual interest level based on that input. By setting the priorities of electronic documents so that they reflect the user's actual interest level, the present invention serves the user's convenience.

[0175] Furthermore, the present invention takes, as the predicted interest level, the actual interest level of the most relevant document among the electronic documents for which an actual interest level has already been obtained, and sets priorities based on this predicted interest level. The present embodiment can therefore assign a priority even to a document that has not been given an actual interest level.

[0176] In addition, the present invention classifies electronic documents into a plurality of classification items and sets priorities for the electronic documents within each classification item. Setting priorities for each classification item makes the result more convenient for the user.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a document processing apparatus to which the present embodiment is applied.

FIG. 2 is a diagram showing the internal structure of a document given by tagging.

FIG. 3 is a diagram showing a window that displays the internal structure of a document given by tagging.

FIG. 4 is a flowchart showing the operation of the document processing apparatus to which the present embodiment is applied.

FIG. 5 is a diagram showing the GUI for classifying documents, in the state before the documents are classified.

FIG. 6 is a diagram showing the GUI for classifying documents.

FIG. 7 is a diagram showing a table of the classification model.

FIG. 8 is a flowchart of automatic document classification.

FIG. 9 is a flowchart of finding the features of a document and creating an index.

FIG. 10 is a flowchart showing activity diffusion.

FIG. 11 is a diagram explaining the activity diffusion process.

FIG. 12 is a flowchart of the link processing in activity diffusion.

FIG. 13 is a flowchart of calculating the degree of association between document classifications.

FIG. 14 is a flowchart of calculating the degree of association between meanings.

FIG. 15 is a diagram showing a table of the degree of association between meanings.

FIG. 16 is a flowchart of browsing documents and performing classification operations.

FIG. 17 is a flowchart showing a series of steps for raising the importance of an arbitrary part of the text.

FIG. 18 is a diagram showing the summary window.

FIG. 19 is a diagram showing a state in which a word has been selected in the summary window.

FIG. 20 is a diagram showing a state in which a selected area in the summary window has been clicked again.

FIG. 21 is a diagram showing a state in which a summary is displayed in the summary window.

FIG. 22 is a diagram showing the summary creation process in detail.

FIG. 23 is a diagram explaining the calculation of the actual interest level from the maximum appearance position of the selected elements.

FIG. 24 is a diagram explaining the calculation of the actual interest level from the ratio of the maximum size of the summary element to the whole document.

FIG. 25 is a flowchart of automatically classifying documents according to the predicted interest level.

FIG. 26 is a flowchart of calculating the inter-document relevance.

[Explanation of symbols]

10 main body, 11 control unit, 12 interface, 13 CPU, 20 input unit, 21 receiving unit, 30 display unit, 31 recording/reproducing unit

Claims

[Claims]

1. A document processing method for processing a plurality of electronic documents, comprising: an actual interest level detecting step of detecting an actual interest level for each electronic document; and a priority setting step of setting priorities for the electronic documents based on the actual interest level detected in the actual interest level detecting step.

2. The document processing method according to claim 1, further comprising: a display step of displaying the document; and an input step of receiving manual input concerning the document displayed in the display step, wherein the actual interest level detecting step detects the actual interest level based on the input received in the input step.

3. The document processing method according to claim 2, wherein the actual interest level detecting step detects the actual interest level based on a keyword input in the input step.

4. The document processing method according to claim 1, wherein the actual interest level detecting step detects the actual interest level based on the number of elements constituting the electronic document.

5. The document processing method according to claim 2, wherein the actual interest level detecting step detects the actual interest level based on the position, measured from the head of the electronic document, of the element farthest from the head among a plurality of elements constituting the electronic document that are input in the input step.

6. The document processing method according to claim 2, wherein the display step displays the electronic document and a summary of the electronic document, the input step sets the areas in which the electronic document and the summary are displayed in the display step, and the actual interest level detecting step detects the actual interest level based on the ratio of the size of the area displaying the summary to the size of the area displaying the electronic document, as set in the input step.

7. The document processing method according to claim 1, wherein the priority setting step takes, as a predicted interest level, the actual interest level of the most relevant electronic document among the electronic documents for which an actual interest level has already been obtained, and sets the priorities based on the predicted interest level.

8. The document processing method according to claim 1, wherein the electronic document has an internal structure described by attribute information.

9. The document processing method according to claim 1, further comprising a classification step of classifying the electronic documents into a plurality of classification items, wherein the priority setting step sets the priorities of the electronic documents for each classification item classified in the classification step.

10. A document processing apparatus for processing a plurality of electronic documents, comprising: actual interest level detecting means for detecting an actual interest level for each electronic document; and priority setting means for setting priorities for the electronic documents based on the actual interest level detected by the actual interest level detecting means.

11. The document processing apparatus according to claim 10, further comprising: display means for displaying the document; and input means for receiving manual input concerning the document displayed by the display means, wherein the actual interest level detecting means detects the actual interest level based on the input received by the input means.

12. A recording medium on which a document processing program for processing a plurality of electronic documents is recorded, wherein the document processing program comprises: an actual interest level detection process of detecting an actual interest level for each electronic document; and a priority setting process of setting priorities for the electronic documents based on the actual interest level detected in the actual interest level detection process.

13. The recording medium according to claim 12, wherein the document processing program further comprises: a display process of displaying the document; and an input process of receiving manual input concerning the document displayed in the display process, and wherein the actual interest level detection process detects the actual interest level based on the input received in the input process.