JP2000172696A

JP2000172696A - Document managing system

Info

Publication number: JP2000172696A
Application number: JP10344183A
Authority: JP
Inventors: Kazuaki Kidokoro; 和明城所; Nobuhisa Yoda; 信久依田; Tatsuya Haraguchi; 竜也原口
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-12-03
Filing date: 1998-12-03
Publication date: 2000-06-23

Abstract

PROBLEM TO BE SOLVED: To extract, with little processing, groups of operation histories performed based on a user's intention from a huge quantity of operation histories generated and stored at random. SOLUTION: An operation monitor part 5 monitors user operations input from an input device 1, extracts document creation/update/reference processing and document output processing such as transmission and printing, and records the name of the document operated on, the time at which the operation occurred, the identifier of the user who performed the operation, the identifier of the destination user, and the like in an operation history preserving part 6 as an operation history. A history cluster preparing part 7 then analyzes the operation histories stored in the operation history preserving part 6, divides them, for example, into small per-user history sequences arranged in time order within a predetermined time range, and records each divided group of operation histories in a cluster preserving part 8 as a cluster.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a document management system having a function of accumulating a history of operations on documents shared by a plurality of users, and more particularly to a document management system capable of extracting, with a small amount of processing, a series of operation histories performed based on a user's intention (a business sequence) from a huge quantity of operation histories that are generated and accumulated at random.

[0002]

2. Description of the Related Art

With the spread of large-capacity storage devices and networks, work using digitized documents has become commonplace in today's offices, where how efficiently the necessary information can be retrieved from a large number of electronic documents, and how effectively those documents can be used, greatly affects business efficiency.

[0003] Conventional file management methods based on a tree structure lack the functions needed to manage documents in office work. To supplement the information management function and make work more efficient, many document management application programs have been developed. Examples include high-performance full-text document search tools that use natural-language and synonym search to supplement the search function, and workflow systems that add a flow management function, that is, that use predefined document flows to improve work efficiency.

[0004] These application programs have been effective to some extent as tools that support document search and reuse, but they do not fully meet users' needs, and functions that they cannot cover are also required. For example, a full-text search application program cannot perform a search if the user cannot think of a keyword contained in the document. A workflow system requires the way documents are used to be clearly defined in advance, which makes it hard to use in everyday office work, where atypical documents are often used.

[0005] Recently, attempts have also been made to support document use by means of users' document operation histories. These attempts aim to make it possible to search for documents using "document usage history", information that conventional document management methods could not exploit. For example, the document management apparatus described in Japanese Patent Laid-Open No. 6-342451 uses recorded user document operation histories to search history information such as which user referred to a document and roughly when, so that a needed document can be found even when no search keyword comes to mind.

[0006]

PROBLEMS TO BE SOLVED BY THE INVENTION

In the method described above, a user searching for a document searches the operation history by guessing who used the document and how. From the standpoint of a user who uses documents, however, the usage history is easier to use if it retains information such as the intention behind past operations, for example "In what order did A access the documents?" or "What information was referred to when that document was created?". For this reason, systems have also been proposed that not only accumulate the operation history but also allow the user who performed the operations to organize his or her own operation history by purpose and to leave comments on the operations. However, having users leave comments places a heavy burden on them, and comments are frequently omitted.

[0007] As for methods of analyzing histories, there are methods that apply data mining techniques to discover frequently occurring sequential patterns in an event log, and methods that likewise apply data mining to discover correlations between event logs. Because these methods analyze the history data by how frequently each of an enormous number of combinations of event log entries occurs relative to the whole event log, the load of pattern detection is large, discontinuous sequences cannot be found, and correlations that occur infrequently are hard to find. They are therefore difficult to apply to a document management system, which must analyze history data in which operation histories performed by multiple users with multiple intentions are intermingled.

[0008] The present invention has been made in view of these circumstances, and its object is to provide a document management system capable of extracting dependencies among the histories of operations on documents with a small amount of processing and without burdening the user.

[0009]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the present invention focuses on the fact that, when the history of operations on documents is viewed per user, the operations within a short time range are limited to operations related to a specific task. Continuous history data is divided into per-user continuous operation sequences within a short time range (clusters). The similarity between the clusters thus created is then determined, and operation sequences with high similarity (which are considered to have been performed for the same purpose) are joined to create user operation history sequences over a longer time range (cases).

[0010] According to the present invention, a series of operation histories performed based on a user's intention (the clusters and cases that form business sequences) can be extracted from a huge quantity of operation histories without burdening the user and with far less processing than data mining.

[0011]

EMBODIMENTS OF THE INVENTION

Embodiments of the present invention are described below with reference to the drawings.

[0012] FIG. 1 is a functional block diagram of the information system according to this embodiment.

[0013] The input device 1 is a keyboard, a mouse, or an operation terminal connected via a network, used to input user operations to the system. The display device 2 is a display or the like that, based on a user operation, shows the result of the operation or a document to the user who performed the operation. The output device 3 is a printer, a facsimile, a network device or the like that outputs a document to the outside of the system based on a user operation.

[0014] The document management unit 4 is an ordinary operating system that manages storage areas and controls input/output devices and display devices. It is connected to a plurality of input/output devices via a network and is shared by a plurality of users. Based on user operations from the input device 1, the document management unit 4 creates, edits, and references documents and displays or outputs operation results to the display device 2 or the output device 3. It also manages document attributes such as file names and directory information, and information such as user groups. In addition, the document management unit 4 is extended so that user operations on documents can be monitored.

[0015] The operation monitor unit 5 monitors user operations input from the input device 1 and extracts document creation, update, and reference processing as well as document output processing such as sending a document by mail or fax or printing a document. It records the name of the document operated on, the time at which the operation occurred, the identifier of the user who performed the operation, the identifier of the destination user, and the like as an operation history in the operation history storage unit 6 described below.

[0016] The operation history storage unit 6 records the operation histories detected by the operation monitor unit 5. FIG. 2 shows an example of recorded operation history data. In FIG. 2, the history ID (a1) is a number assigned to manage an operation history. The operation type (a2) indicates the type of the recorded operation, such as "create", "update", "reference", "delete", "print", or "send mail"; the operations performed by users are stored in the order in which they occurred.

[0017] The document name (a3) is the name of the document operated on, recorded together with the operation history; in this example, the directory and file name in the operating system's file system are recorded. The operation date and time (a4) records the time at which the operation occurred. The user name (a5) is the user identifier of the user who performed the recorded operation and is obtained from the user information managed by the document management unit 4.
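The record layout of FIG. 2 can be pictured as a small data structure. The following is a minimal sketch only: the class name `OperationRecord`, the field names, the English operation labels, and all sample values (paths, dates, user identifiers) are illustrative assumptions, not taken from the figure.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class OperationRecord:
    """One row of the operation history (field names are hypothetical)."""
    history_id: int      # (a1) number assigned to manage the history entry
    operation: str       # (a2) e.g. "create", "update", "reference", "print"
    document: str        # (a3) directory and file name of the target document
    timestamp: datetime  # (a4) time at which the operation occurred
    user: str            # (a5) identifier of the user who performed it


# Hypothetical sample entries; the values are made up for illustration.
history = [
    OperationRecord(330, "print", "/doc/A.doc", datetime(1998, 12, 3, 10, 43), "suzuki"),
    OperationRecord(327, "reference", "/doc/A.doc", datetime(1998, 12, 3, 10, 30), "suzuki"),
    OperationRecord(328, "reference", "/doc/A.doc", datetime(1998, 12, 3, 10, 35), "sato"),
]

# The storage unit keeps entries in the order in which they occurred.
history.sort(key=lambda record: record.timestamp)
```

Keeping the record immutable (`frozen=True`) is a design choice of this sketch; it reflects that a logged operation is not edited after the fact.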

[0018] The history cluster creation unit 7 analyzes and divides the operation histories accumulated in the operation history storage unit 6 and converts them into clusters, which are small history sequences arranged in time order. A cluster is intended to group the history of processing that a user performed for the same purpose. The history data contains no information about why the user performed an operation. However, the present invention focuses on a characteristic of document operations: even when a user carries out several tasks in parallel, within a short time range the user tends to concentrate on operations for one specific purpose. The operation history sequence, in which the operations of multiple users are recorded, is therefore divided by the user who performed the operation, and each user's operation history is further divided into history sequences that occurred within a short time interval.

[0019] If the history were divided by time range alone, operations performed with different intentions would be likely to fall into the same cluster, for example when operations with different intentions occur within a short time, as in "While writing a report, I received an unrelated mail, so I suspended the document work and wrote a reply." The history cluster creation unit 7 therefore detects specific operations that users are empirically considered to perform often when changing the purpose of their work, such as saving a document, sending mail, or printing a document, and also divides the operation history at the points where these operations occur.

[0020] The clusters created by the history cluster creation unit 7 are recorded in the cluster storage unit 8. FIG. 3 shows an example of created clusters. In FIG. 3, the cluster ID (b1) is a number assigned to manage a group of operation histories gathered into a cluster. For example, cluster ID 121 groups the operations that Mr. Suzuki performed from 10:30 to 10:43. Operations performed by a different user are recorded as separate clusters even if they are close in time (cluster IDs 122 and 123). The operations of Mr. Suzuki in cluster ID 124 occurred close in time to those in cluster ID 121, but because Mr. Suzuki performed a document print operation, which is regarded as a break between operations, at history ID 330, they are split off from cluster 121 into a separate cluster.

[0021] The cluster search unit 9 searches and displays the clusters accumulated in the cluster storage unit 8 in response to a user request. For example, searching for clusters that include an access to document A.doc finds cluster IDs 121 and 122. Because a cluster groups multiple operation histories, including accesses to documents other than A.doc, the result shows that document C.ppt is sometimes used together with document A.doc.
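The lookup in this paragraph amounts to finding the clusters that contain a given document and then seeing which other documents appear alongside it. A minimal sketch follows, assuming each cluster is reduced to the list of documents it touches; the function names and the contents of the sample clusters (apart from A.doc and C.ppt, which the text mentions) are hypothetical.

```python
from collections import Counter

# Hypothetical cluster contents: cluster ID -> documents accessed in it.
clusters = {
    121: ["A.doc", "B.xls"],
    122: ["A.doc", "C.ppt"],
    123: ["D.txt"],
}


def find_clusters(clusters, document):
    """Return the IDs of clusters that include an access to `document`."""
    return sorted(cid for cid, docs in clusters.items() if document in docs)


def co_accessed(clusters, document):
    """Count how often other documents share a cluster with `document`."""
    counts = Counter()
    for docs in clusters.values():
        if document in docs:
            counts.update(d for d in set(docs) if d != document)
    return counts


hits = find_clusters(clusters, "A.doc")
related = co_accessed(clusters, "A.doc")
```

With this sample data, `hits` contains clusters 121 and 122, and `related` records that C.ppt shares a cluster with A.doc.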

[0022] The clusters created by the cluster creation unit 7 merely group operations that are close in time. For a task carried out over a long time, or a task interrupted by another task, the history may be broken up. For example, in the case described above ("While writing a report, I received an unrelated mail, so I suspended the document work and wrote a reply"), the history of the document work done before the mail arrived and the history of the document work resumed after the reply was written form a single unit in terms of the user's purpose.

[0023] The inter-cluster similarity determination unit 10 determines the similarity between clusters of the operation history that were separated by the cluster creation processing in this way, and the case creation unit 11 links clusters with high similarity (that is, clusters likely to have been performed for the same purpose) to create semantically connected user operation sequences (cases). The similarity between clusters is judged by whether the clusters contain similar document operations, based on the rule of thumb that "the same document is accessed many times while the same task is being carried out." The similarity is also weighted by operation type; for example, because experience shows that "the same document is updated many times during the same task," clusters that contain update operations on the same document are given a higher similarity.

[0024] FIG. 4 shows an example of cases created by determining the similarity between the clusters shown in FIG. 3 and linking similar clusters. In FIG. 4, the case ID (c1) is a number assigned to manage a group of clusters linked as a case. For example, case ID 32 combines cluster IDs 121, 122, and 124 and sorts their entries in order of occurrence. Clusters 121 and 122 are considered similar because both contain an operation that references the same document A.doc. In this example, similarity is also determined between clusters belonging to different users so that work carried out jointly by multiple users can be grouped into one case (here, the clusters of Mr. Suzuki and Mr. Sato are grouped into the same case). Mr. Suzuki's cluster 124 has no document in common with cluster 121, but it contains an access to document C.ppt, which is referenced in cluster 122, so it is judged similar and grouped into the same case. The created cases are accumulated in the case storage unit 12.
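The example shows that linking is transitive: cluster 124 joins the case through cluster 122 even though it shares nothing with cluster 121. One way to sketch this is to treat sufficiently similar cluster pairs as edges and take connected components with a union-find structure. This is an assumed technique, not one the text names; the function `build_cases`, the `threshold` parameter, and the sample scores are all hypothetical.

```python
def build_cases(cluster_ids, similarity, threshold):
    """Group clusters into cases by linking pairs whose score >= threshold.

    `similarity` maps (cluster_a, cluster_b) -> score. The threshold is an
    assumption of this sketch; the text does not state how "high" is decided.
    """
    parent = {cid: cid for cid in cluster_ids}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for cid in cluster_ids:
        groups.setdefault(find(cid), []).append(cid)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical scores: 121~122 share A.doc, 122~124 share C.ppt.
scores = {(121, 122): 5, (122, 124): 5, (121, 124): 0}
cases = build_cases([121, 122, 123, 124], scores, threshold=1)
```

Here clusters 121, 122, and 124 end up in one group, matching case 32 in the text. Whether an unlinked cluster such as 123 is stored as a case of its own is not stated in the source; the sketch simply returns it as a singleton group.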

[0025] Like the cluster search processing, the case search unit 13 searches and displays the cases accumulated in the case storage unit 12 in response to a user request. For example, searching for cases that include an access to document A.doc finds case 32. As with a cluster, the case shows that document C.ppt is sometimes used together with document A.doc; but because a case contains more operation histories than a cluster, the number of document accesses indicates the relationship between the documents with higher confidence. The pattern of operations also reveals relationships that were not evident from the clusters, such as the fact that Mr. Suzuki updates document C.ppt and Mr. Sato references it.

[0026] FIG. 5 shows the configuration of the information system according to this embodiment.

[0027] The main memory 101 stores the program being executed and the control data needed to execute it. The hard disk 102 stores the control program, management data, and operation histories. The CPU 103 controls the operation of the entire apparatus; when the system starts, it reads the program stored on the hard disk 102 into the main memory 101 and performs control according to its contents.

[0028] For interaction with the user, the keyboard 104 is used as an input device, and the results of input operations are shown on the display 105. A printer 106 is connected as a means of outputting documents. The information system is also connected to a network via the network interface 107, so that it can process operations input from the network, output processing results to the network, and send documents by mail or fax over the network.

[0029] These components are connected by a system bus and controlled by an operating system whose operation is controlled by the CPU 103. The functions of this embodiment described above are implemented as application programs that run on this operating system.

[0030] Next, the processing of this information system is described in detail.

[0031] FIG. 6 shows the main flow of this information system.

[0032] At startup, the information system initializes the entire system (step A1) and begins accepting user operations (step A2).

[0033] When a user requests an operation such as creating, referencing, updating, or outputting a document (YES in step A3), the information system saves the requested operation in the operation history storage unit 6 and then processes the user operation (steps A4 to A5).

[0034] If the request from the user is to set processing parameters (YES in step A6), the information system displays a dialog such as that shown in FIG. 7 and sets the processing parameters (step A7).

[0035] In FIG. 7, the cluster time range (d1) sets the maximum time span of one cluster; the history data is divided into clusters that do not exceed this time range. The number of cluster entries (d2) sets the maximum number of operation histories in one cluster and prevents a cluster from growing too large even when document accesses occur in bursts.

[0036] The delimiter operation (d3) specifies the types of operation to be detected as breaks between operations. In this example, document creation/update operations and document deletion operations are set as cluster delimiters.

[0037] If the user's request is a cluster search (step A8), the information system performs history analysis processing that creates clusters from the history data, if any history data has accumulated in the operation history storage unit 6 (step A9), and then searches the clusters accumulated in the cluster storage unit 8 for those matching the user's request and displays them (steps A10 to A11). FIG. 8 shows the processing flow for creating clusters from history data.

[0038] When the cluster creation processing starts, the history data accumulated in the operation history storage unit 6 in order of occurrence is divided by the user who performed each operation (step B1). Next, if a per-user history sequence contains an operation that is set as a delimiter operation, the sequence is further divided immediately after that operation (step B2).

[0039] Each of the resulting history groups is then divided so that the time from the occurrence of its first operation history to the occurrence of its last fits within the registered "cluster time range" and the number of operation history entries in the cluster does not exceed the registered "number of cluster entries" (step B3). The clusters created so far are assigned cluster IDs and recorded in the cluster storage unit 8, and all the history data accumulated in the operation history storage unit 6 is deleted (step B4).
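Steps B1 to B3 can be sketched as a single pass over each user's time-ordered history. This is an illustration under stated assumptions rather than the patented implementation: the names `Op` and `make_clusters` are hypothetical, steps B2 and B3 are folded into one greedy loop (the text does not say how step B3 chooses its split points), and the sample data and parameter values are made up.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from itertools import groupby


@dataclass(frozen=True)
class Op:
    history_id: int
    kind: str
    document: str
    time: datetime
    user: str


def make_clusters(history, max_span, max_entries, delimiters):
    """Split a mixed operation history into per-user clusters."""
    clusters = []
    by_user = sorted(history, key=lambda op: (op.user, op.time))
    # Step B1: handle each user's operations separately.
    for _, ops in groupby(by_user, key=lambda op: op.user):
        current = []
        for op in ops:
            # Step B3: close the cluster before it exceeds the time range
            # or the maximum number of entries (greedy split, an assumption).
            if current and (op.time - current[0].time > max_span
                            or len(current) >= max_entries):
                clusters.append(current)
                current = []
            current.append(op)
            # Step B2: split immediately after a delimiter operation.
            if op.kind in delimiters:
                clusters.append(current)
                current = []
        if current:
            clusters.append(current)
    clusters.sort(key=lambda cluster: cluster[0].time)
    return clusters


day = datetime(1998, 12, 3)  # hypothetical date
sample = [
    Op(327, "reference", "A.doc", day.replace(hour=10, minute=30), "suzuki"),
    Op(328, "reference", "A.doc", day.replace(hour=10, minute=35), "sato"),
    Op(330, "print", "A.doc", day.replace(hour=10, minute=43), "suzuki"),
    Op(331, "update", "C.ppt", day.replace(hour=10, minute=45), "suzuki"),
]
result = make_clusters(sample, timedelta(minutes=30), 10, {"print"})
cluster_ids = [[op.history_id for op in cluster] for cluster in result]
```

In this sample, the print operation (history ID 330) closes Suzuki's first cluster, so the update two minutes later starts a new one even though it is close in time. Step B4 (assigning cluster IDs, storing the clusters, and clearing the processed history) is omitted from the sketch.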

[0040] In this embodiment, history data that has been processed for cluster creation is moved to the cluster storage unit 8, so at search time only the history data generated since the previous cluster creation processing is subject to cluster creation, which spreads the load of the analysis processing.

[0041] FIG. 9 is an example of a screen on which a user sets search parameters when making a cluster search request. In this example, the user selectively specifies whichever of the following are needed and searches for clusters: the user who performed the operation, the name of a document included in the operations, the type of operation on the document, and the period in which the operation occurred (e1 to e4). When a parameter can take multiple values, AND/OR and wildcards can be specified.

[0042] FIG. 10 is an example of a screen that displays the results of searching for clusters that satisfy the parameters specified by the user. In this example, the screen lists the clusters that contain an operation history in which "Mr. Suzuki" performed an "update or print" and that occurred between "12/23 and 12/31".
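The query in this example (user, operation type with OR, and a date period, with optional wildcards on the document name) can be sketched as a filter over stored clusters. Everything here is an assumption made for illustration: the dictionary layout of an entry, the function `search`, the rule that a cluster "occurred" on the day of its first entry, and the sample data including the year.

```python
from datetime import date
from fnmatch import fnmatchcase


def search(clusters, user=None, document=None, kinds=None, start=None, end=None):
    """Return IDs of clusters with an entry matching all given criteria.

    Unspecified criteria (None) are ignored. `document` may contain shell-style
    wildcards; `kinds` is a set of operation types combined with OR.
    """
    def entry_matches(entry):
        return ((user is None or entry["user"] == user)
                and (document is None or fnmatchcase(entry["document"], document))
                and (kinds is None or entry["kind"] in kinds))

    found = []
    for cid, entries in clusters.items():
        first_day = min(entry["day"] for entry in entries)
        in_period = ((start is None or first_day >= start)
                     and (end is None or first_day <= end))
        if in_period and any(entry_matches(entry) for entry in entries):
            found.append(cid)
    return sorted(found)


# Hypothetical stored clusters.
stored = {
    201: [{"user": "suzuki", "document": "A.doc", "kind": "update", "day": date(1998, 12, 24)}],
    202: [{"user": "suzuki", "document": "A.doc", "kind": "reference", "day": date(1998, 12, 25)}],
    203: [{"user": "sato", "document": "C.ppt", "kind": "print", "day": date(1998, 12, 26)}],
    204: [{"user": "suzuki", "document": "C.ppt", "kind": "print", "day": date(1999, 1, 5)}],
}
matches = search(stored, user="suzuki", kinds={"update", "print"},
                 start=date(1998, 12, 23), end=date(1998, 12, 31))
```

Only cluster 201 satisfies all three conditions in this sample: 202 has the wrong operation type, 203 the wrong user, and 204 falls outside the period.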

[0043] If the request from the user is to shut down the system (YES in step A12), the information system saves the management data and history data to the hard disk 102 and executes the termination processing (step A13).

[0044] Thus, according to the information system of this embodiment, the huge quantity of operation histories accumulated in the operation history storage unit 6 can be converted, with a small amount of processing, into clusters that group the histories of processing that users performed for the same purpose, and users can search and refer to these clusters efficiently.

[0045] Next, the case in which similar clusters are further linked to create cases, in addition to the cluster creation described above, is described. In this case, processing that compares clusters to determine the similarity between them and processing that links similar clusters to create cases are added to the processing described above.

[0046] FIG. 11 shows the flow of the processing that determines the similarity between clusters.

[0047] In this case, the information system first initializes to 0 all of the similarities between the clusters stored in the cluster storage unit 8 (step C1). Next, it selects one cluster from the cluster group (step C2) and calculates the similarity between that cluster and each of the clusters that occurred within the range set as the "maximum case time range" (step C3).

[0048] The similarity is calculated mainly by comparing the documents contained in the clusters and the types of operations performed on those documents. FIG. 12 shows an example of a screen for setting the parameters used to calculate the similarity between clusters.

[0049] In FIG. 12, the case time range (f1) sets, as mentioned above, the range of occurrence times within which clusters can be linked into a case. Setting this time prevents unnecessarily long cases from being created, and restricting comparison to the clusters within the time range keeps the load of the comparison processing down.

[0050] The similarity determination parameters (f2) are used to calculate the similarity between clusters. For example, with the values set in FIG. 12, when two clusters are compared, the similarity is increased by 5 points if a document was referenced in both clusters and by 100 points if a document was updated in both clusters. Conversely, to prevent loosely related clusters from being judged highly similar, the similarity is decreased by 2 points if a document referenced in one cluster was not accessed in the other.

[0051] A cluster that contains many document access histories tends to share access histories with correspondingly many other clusters. To keep the similarity from becoming unnecessarily high for this reason, the inter-cluster similarity, calculated as the accumulated score of the histories the clusters have in common, is finally divided by the number of histories contained in the cluster, and the result is taken as the similarity between the clusters.
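
The scoring described in the two preceding paragraphs can be sketched as follows. This is only an illustration under stated assumptions: the function and constant names are hypothetical, each document is counted once per cluster pair, and the final division uses the number of histories in the first cluster `a` (the text says "the cluster" without saying which one, so the result here is not symmetric).

```python
# Example values taken from the description of FIG. 12.
REFERENCED_IN_BOTH = 5      # document referenced in both clusters
UPDATED_IN_BOTH = 100       # document updated in both clusters
REFERENCED_IN_ONE = -2      # referenced in one cluster, not accessed in the other


def cluster_similarity(a, b):
    """Score two clusters given as lists of (document, action) tuples.

    Assumption: the accumulated score is normalized by len(a).
    """
    if not a:
        raise ValueError("cluster a must contain at least one history")

    docs_a = {doc for doc, _ in a}
    docs_b = {doc for doc, _ in b}
    ref_a = {doc for doc, act in a if act == "reference"}
    ref_b = {doc for doc, act in b if act == "reference"}
    upd_a = {doc for doc, act in a if act == "update"}
    upd_b = {doc for doc, act in b if act == "update"}

    score = 0
    score += REFERENCED_IN_BOTH * len(ref_a & ref_b)
    score += UPDATED_IN_BOTH * len(upd_a & upd_b)
    # Penalize documents referenced on one side and untouched on the other.
    score += REFERENCED_IN_ONE * (len(ref_a - docs_b) + len(ref_b - docs_a))

    return score / len(a)
```

For two clusters that both update one shared document and each reference one unshared document, the accumulated score is 100 - 2 - 2 = 96, and dividing by the two histories in the first cluster gives 48.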

[0052] The remaining two parameters set in FIG. 12 are used in the subsequent case creation process. The cluster linking threshold (f3) is compared with the inter-cluster similarity to decide whether two clusters are to be linked. The setting for whether to link clusters of multiple users (f4) controls whether the clusters to be linked are limited to operations performed by the same user.

[0053] The information system then performs the similarity calculation described above for all clusters (step C4).
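
Steps C1 to C4 amount to filling a table of pairwise similarities while skipping pairs that lie too far apart in time. The sketch below is one possible reading, not the specification's implementation: `similarity_table` is a hypothetical name, occurrence times are plain numbers (for example hours), the similarity function is passed in as `sim`, and "within the range" is assumed to mean that the absolute time difference does not exceed the maximum case time range.

```python
def similarity_table(clusters, max_case_time_range, sim):
    """Build {(i, j): similarity} for clusters given as (time, body) pairs.

    Pairs outside the time range keep their initial value of 0.
    """
    n = len(clusters)

    # Step C1: initialize every inter-cluster similarity to 0.
    table = {(i, j): 0.0 for i in range(n) for j in range(n) if i != j}

    # Steps C2 and C4: visit every cluster in turn.
    for i, (time_i, body_i) in enumerate(clusters):
        # Step C3: compare only with clusters inside the time range.
        for j, (time_j, body_j) in enumerate(clusters):
            if i != j and abs(time_j - time_i) <= max_case_time_range:
                table[(i, j)] = sim(body_i, body_j)

    return table
```

Restricting the inner loop to the time window is what keeps the comparison load down: clusters far apart in time are never passed to `sim`.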

[0054] FIG. 13 shows the flow of the process for creating cases by linking clusters on the basis of their similarity.

[0055] First, from the clusters stored in the cluster storage unit 8, the information system selects the cluster that occurred earliest (the cluster whose first operation history has the oldest occurrence time; referred to as cluster A) and registers the contents of that cluster in a newly created case (referred to as case A) (step D1).

[0056] Next, from the remaining clusters in the cluster storage unit 8, the clusters that occurred within the "case time range" measured from the occurrence time of the first operation history of case A are selected in order, oldest first (the selected cluster is referred to as cluster B), and it is determined whether cluster B is to be combined with cluster A or case A (step D2). If "do not link clusters of multiple users" is specified as a parameter of the case creation process, the clusters selected as cluster B are limited to those whose operations were performed by the same user who performed the operations of cluster A.

[0057] As one determination method, the clusters are judged to be linkable if the similarity between cluster A and cluster B is greater than the "cluster linking threshold". Alternatively, the similarities between cluster B and all of the clusters already incorporated into the case are averaged to give the similarity between case A and cluster B, and the clusters are judged to be linkable if this value is greater than the "cluster linking threshold". Furthermore, when averaging the similarities between cluster B and all of the clusters already incorporated into the case, it is also useful to weight each similarity by the difference between the occurrence time of the cluster in the case and the occurrence time of cluster B. For example, the difference between the occurrence times of cluster 1 and cluster 2, divided by the "case time range" and multiplied by the similarity between clusters 1 and 2, is used as the weighted similarity of clusters 1 and 2. The clusters are judged to be linkable if this weighted similarity is greater than the "cluster linking threshold".
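
The three determination methods above differ only in how the value compared with the threshold is obtained. The following sketch lays them side by side; all function names are hypothetical. The weight is implemented literally as the paragraph words it (time difference divided by the case time range, multiplied by the similarity); whether a differently shaped weight was intended cannot be determined from the text.

```python
def is_linkable(value, cluster_linking_threshold):
    """Common final test: link only if the value exceeds the threshold."""
    return value > cluster_linking_threshold


def case_similarity(similarities):
    """Method 2: plain average of cluster B's similarity to each case member."""
    return sum(similarities) / len(similarities)


def weighted_case_similarity(pairs, case_time_range):
    """Method 3: average of time-weighted similarities.

    pairs holds (similarity, time_difference) for cluster B against each
    cluster already in the case. Each weight is time_difference divided by
    case_time_range, as the paragraph states.
    """
    weighted = [sim * (dt / case_time_range) for sim, dt in pairs]
    return sum(weighted) / len(weighted)
```

Method 1 simply passes the similarity between cluster A and cluster B to `is_linkable` directly.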

[0058] If cluster B is judged to be linkable to case A (YES in step D2), cluster B is linked to case A (step D3). Cluster B then becomes the new cluster A, and the process returns to step D2 to continue selecting the next cluster and judging whether it can be combined.

[0059] On the other hand, if no cluster linkable to case A is found (NO in step D2), the operation histories contained in case A are sorted in chronological order, a case ID is assigned, and the case is registered in the case storage unit 12. The clusters registered in case A are then deleted from the cluster storage unit 8 so that they are not included in any other case (step D4).

[0060] Steps D1 to D4 are repeated until the case determination has been completed for all clusters.
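
The loop formed by steps D1 to D4 can be sketched as a greedy chaining procedure. The code below is a simplified illustration under several assumptions that are not in the specification: `build_cases` and the dictionary keys are hypothetical, times are plain numbers, only the pairwise test (similarity between cluster A and cluster B) is used, the scan restarts from the oldest remaining cluster after each link, and the position of a case in the returned list stands in for its case ID.

```python
def build_cases(clusters, sim, threshold, case_time_range, same_user_only=False):
    """Link clusters into cases.

    clusters: list of dicts {"user": str, "ops": [(time, description), ...]}
    sim:      function(cluster_a, cluster_b) -> float
    Returns a list of cases, each a chronologically sorted list of ops.
    """
    def start(cluster):
        return min(t for t, _ in cluster["ops"])

    remaining = sorted(clusters, key=start)   # stands in for the cluster store
    cases = []

    while remaining:
        # Step D1: the oldest cluster starts a new case.
        a = remaining.pop(0)
        members = [a]
        case_start = start(a)

        linked = True
        while linked:
            linked = False
            # Step D2: candidates inside the case time range, oldest first.
            for idx, b in enumerate(remaining):
                if start(b) - case_start > case_time_range:
                    break
                if same_user_only and b["user"] != a["user"]:
                    continue
                if sim(a, b) > threshold:
                    # Step D3: link B, then treat B as the new cluster A.
                    members.append(remaining.pop(idx))
                    a = b
                    linked = True
                    break

        # Step D4: sort the case's histories by time and register the case.
        # Its clusters were already removed from `remaining`.
        cases.append(sorted(op for m in members for op in m["ops"]))

    return cases
```

Because linked clusters are popped from `remaining`, no cluster can end up in more than one case, which mirrors the deletion from the cluster storage unit in step D4.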

[0061] When the case creation process has been performed, the user's history search operations are executed against the case storage unit 12. Apart from the fact that the search target and the displayed content are cases rather than clusters, the processing is the same as the cluster search and display processing described above.

[0062] In the method described above, the case creation process is performed every time the user requests a case search. If case searches are frequent, case creation is therefore applied to only a very small number of history clusters at a time, and short, insufficiently linked cases are created. To avoid this, when a case is about to be registered, a case for which the difference between the case creation time and the occurrence time of the first operation history registered in it is smaller than the "case time range" is not registered as a case; instead, it is turned back into clusters, which are returned to the cluster storage unit 8.
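
The registration rule in this paragraph reduces to a single comparison per case. A minimal sketch follows, with the hypothetical names `partition_cases`, `register` and `deferred`, and with each case represented as a list of `(time, description)` tuples:

```python
from datetime import datetime, timedelta


def partition_cases(cases, created_at, case_time_range):
    """Split candidate cases into those to register and those to defer.

    A case is deferred (its clusters go back to the cluster store) when
    less than case_time_range has elapsed between its first operation
    history and the time the case was created.
    """
    register, deferred = [], []
    for case in cases:
        first_op_time = min(t for t, _ in case)
        if created_at - first_op_time < case_time_range:
            deferred.append(case)
        else:
            register.append(case)
    return register, deferred
```

Deferring recent cases gives later clusters a chance to be linked to them on a subsequent run, once the full case time range has passed.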

[0063] As described above, the information system of this embodiment can convert the clusters accumulated in the cluster storage unit 8 into cases, which are larger aggregates formed by linking mutually similar clusters. The user can then search and refer to these cases efficiently.

[0064]

[Effects of the invention] As described in detail above, the present invention exploits the characteristics of the operations performed on document files so that meaningful history sets can be created with a small analysis load and without burdening the user with the accumulation and analysis of histories.

[0065] Furthermore, displaying the created history sets to the user makes it possible to detect relationships among documents, users, and operation contents that cannot be obtained by searching the histories directly.

[0066] Moreover, determining similarity on a cluster basis allows the similarity to be determined with a small load and the operation histories to be grouped.

[Brief description of the drawings]

FIG. 1 is a functional block diagram of an information system according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating operation history data recorded in the operation history storage unit of the embodiment.

FIG. 3 is a diagram illustrating a cluster created by the history cluster creation unit of the embodiment.

FIG. 4 is a diagram illustrating a case created by the case creation unit of the embodiment.

FIG. 5 is a diagram showing the configuration of the information system of the embodiment.

FIG. 6 is the main flow of the information system of the embodiment.

FIG. 7 is a diagram illustrating a setting screen for the processing parameters used for cluster creation in the embodiment.

FIG. 8 is the processing flow for creating clusters from history data in the embodiment.

FIG. 9 is a diagram illustrating a screen for setting search parameters when making a cluster search request in the embodiment.

FIG. 10 is a diagram illustrating a screen that displays the search results for clusters satisfying the specified parameters in the embodiment.

FIG. 11 is the processing flow for determining the similarity between clusters in the embodiment.

FIG. 12 is a diagram illustrating a screen for setting the parameters used to calculate the similarity between clusters in the embodiment.

FIG. 13 is the processing flow for creating a case by linking clusters on the basis of similarity in the embodiment.

[Explanation of symbols]

1: input device; 2: display device; 3: output device; 4: document management unit; 5: device monitor unit; 6: operation history storage unit; 7: history cluster creation unit; 8: cluster storage unit; 9: cluster search unit; 10: inter-cluster similarity determination unit; 11: case creation unit; 12: case storage unit; 13: case search unit

Continued from the front page: (72) Inventor: Tatsuya Haraguchi, c/o Toshiba Corporation Yanagimachi Plant, 70 Yanagimachi, Saiwai-ku, Kawasaki-shi, Kanagawa. F-terms (reference): 5B075 ND03 ND20 NK04 NR10 NR12 PP02 PP03 PR03 PR04 PR06 UU06; 5B082 DD04 FA11

Claims

1. A document management system having a function of accumulating operation histories for document files, the system comprising cluster creation means for classifying the accumulated operation histories according to a predetermined rule and creating clusters, each cluster being an aggregate having operation histories as its elements.

2. The document management system according to claim 1, wherein the cluster creation means creates clusters by dividing the accumulated operation histories by operator and then dividing each operator's operation histories by a predetermined time range.

3. The document management system according to claim 1, wherein the cluster creation means creates clusters by dividing the accumulated operation histories by operator and then dividing each operator's operation histories into groups of a predetermined number.

4. The document management system according to claim 1, wherein the cluster creation means creates clusters by dividing the accumulated operation histories by operator and then dividing each operator's operation histories using a specific operation as a delimiter.

5. The document management system according to claim 1, wherein the cluster creation means further comprises cluster setting means for setting the rule for classifying the operation histories.

6. The document management system according to claim 1, 2, 3, 4 or 5, further comprising: cluster search means for detecting, from among the plurality of clusters created by the cluster creation means, clusters that satisfy a specified search condition; and cluster display means for displaying the clusters detected by the cluster search means.

7. The document management system according to claim 1, further comprising similarity calculation means for calculating the similarity between clusters created by the cluster creation means.

8. The document management system according to claim 7, wherein the similarity calculation means calculates the similarity between clusters from attribute information of the operation histories constituting the clusters, the attribute information including the operation target, the operator, the operation date and time, and the operation content.

9. The document management system according to claim 7, wherein the similarity calculation means uses, as the similarity between clusters, the value obtained by subtracting the number of mismatches, that is, documents appearing in only one of the two clusters, from the number of matches, that is, documents appearing in both clusters.

10. The document management system according to claim 7, wherein the similarity calculation means holds a weighting value for each combination of operations performed on the same document from the two clusters and a weighting value for an operation performed on a document from only one of the clusters, and uses, as the similarity between clusters, the value obtained by subtracting the number of mismatches of documents appearing in only one of the clusters, weighted by the corresponding weighting value, from the number of matches of documents appearing in both clusters, weighted by the corresponding weighting value.

11. The document management system according to claim 9 or 10, wherein the similarity calculation means averages the calculated value over the number of operation histories contained in the clusters and uses the result as the similarity between clusters.

12. The document management system according to claim 9 or 10, wherein the similarity calculation means uses, as the similarity between clusters, the value obtained by weighting the calculated value according to the temporal distance between the occurrence times of the clusters.

13. The document management system according to claim 10, further comprising weighting value setting means for setting the weighting values used by the similarity calculation means to calculate the similarity.

14. The document management system according to claim 7, further comprising case creation means for creating a case, which is a series of operation history groups, by combining clusters created by the cluster creation means using the inter-cluster similarity calculated by the similarity calculation means.

15. The document management system according to claim 14, wherein the case creation means includes means for combining the closest clusters whose similarity exceeds a predetermined threshold.

16. The document management system according to claim 15, wherein the case creation means includes means for combining a single cluster with an already combined cluster group when the average of the similarities between that single cluster and each of the clusters included in the cluster group exceeds a predetermined threshold.

17. The document management system according to claim 16, wherein, when calculating the average of the similarities between a single cluster and each of the clusters included in the already combined cluster group, the case creation means weights the similarity between clusters according to the distance between the occurrence times of the clusters.

18. The document management system according to claim 16, wherein the case creation means takes as targets of the similarity determination only those clusters that occurred within a predetermined time range from the single cluster.

19. The document management system according to claim 15, 16, 17 or 18, further comprising case setting means for setting the threshold, the weighting value, or the time range used by the case creation means to combine clusters.

20. The document management system according to claim 14, 15, 16, 17, 18 or 19, further comprising: case search means for detecting, from among the plurality of cases created by the case creation means, cases that satisfy a specified search condition; and case display means for displaying the cases detected by the case search means.