JP2009211603A

JP2009211603A - Document search system

Info

Publication number: JP2009211603A
Application number: JP2008056023A
Authority: JP
Inventors: Shinsuke Nakazawa; 真介中澤
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2008-03-06
Filing date: 2008-03-06
Publication date: 2009-09-17

Abstract

PROBLEM TO BE SOLVED: To allow related information showing that a plurality of documents are related to each other to be shared among a plurality of users while maintaining the secrecy of the respective documents.
SOLUTION: A conversion processing part 22 generates secrecy related information for each related document group. The secrecy related information includes a plurality of secrecy identification data (sets of secrecy identification values) corresponding to the plurality of documents. Each set of secrecy identification values comprises a plurality of hash values generated from the corresponding related document. A plurality of pieces of secrecy related information are registered in a database 32. A search is performed with a hash value or a set of hash values of a document of interest as the search key, to identify the hash value or set of hash values of the related document related to the document of interest. This is returned to the requester client as the search result, and the related document is identified at the requester client.
COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a document search system.

In electronic document search processing, for example, a document that forms the basis of the search (hereinafter, the document of interest) is designated, and one or more documents related to the document of interest (hereinafter, related documents) are selected from among a large number of documents. For example, in Patent Document 1, a word or phrase (keyword) in the document of interest is used as a search key, and documents containing the same word or phrase are identified as related documents. Various other search techniques have also been proposed.

On the other hand, if "related information" (information on the relationship among multiple documents) that identifies a plurality of mutually related documents is generated in advance or when needed, related documents can be searched for smoothly by using that related information. For example, when related information indicating that document A and document B are related to each other has been registered in advance, document B, the related document, can be identified immediately from document A, the document of interest, on the basis of that related information. The same applies in the reverse direction. Such related information may be created globally across many documents managed by a plurality of users, or locally for each individual user. For example, while each user performs document processing in a dedicated workspace (document placement space), one or more pieces of related information may be generated for each workspace, either by user input or automatically. Related information generated on the basis of such a local workspace is usually managed and controlled by the individual user, or is information that other users cannot readily make active use of.
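By way of illustration only, the lookup described above can be sketched in Python as follows. The names `RelatedIndex`, `register`, and `related_to` are hypothetical and do not appear in the source; the sketch handles plain related information, before any secrecy measure is applied.

```python
from collections import defaultdict

class RelatedIndex:
    """Hypothetical in-memory store of plain (unconcealed) related information."""

    def __init__(self):
        self._related = defaultdict(set)

    def register(self, group):
        # Every document in a group is related to every other document in it.
        for doc in group:
            self._related[doc].update(d for d in group if d != doc)

    def related_to(self, doc_of_interest):
        return sorted(self._related.get(doc_of_interest, set()))

index = RelatedIndex()
index.register(["A", "B"])
print(index.related_to("A"))  # ['B']
print(index.related_to("B"))  # ['A']
```

Because the store is keyed by document identity, anyone who can read it learns which documents exist, which is the exposure the later sections set out to avoid.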

Patent Document 1: JP-A-11-53392
Patent Document 2: JP 2001-344245 A

The related information described above is a kind of information asset, and in many cases it is valuable not only to a specific user but to other users as well. It is therefore desirable to promote the use of related information, that is, to share it.

However, indiscriminately sharing or disclosing related information that has been under each user's control may cause serious problems from the viewpoint of information security. For example, when related information is structured as a description from which the contents or locations of a plurality of documents can be identified, and any of those documents is a confidential document or one of comparable sensitivity, carelessly disclosing the related information may cause a document that should be kept secret from others, or information about it, to leak against the will of the document's administrator. Even for documents that do not require that level of confidentiality, one must in general be cautious about disclosing the contents or location of a document without confirming the intention of its administrator (usually its creator).

Admittedly, the security problems described above might be avoided if whether disclosure is permitted, or the conditions for disclosure, were set in fine detail for each piece of related information or each document individually. However, such settings are very burdensome for users, and a mechanism of that kind ultimately cannot promote the sharing of related information. Patent Document 2 describes a technique in which, when a user issues a search request, the search is executed after that user's access rights are checked; however, access rights must be set individually in advance.

An object of the present invention is to promote the use of related information in a document search system while ensuring the security of documents.

The present invention relates to a document search system including: generation means for generating, for each related document group composed of a plurality of related documents, secret related information consisting of a plurality of secret identification data for identifying the documents constituting the related document group while keeping them secret, on the basis of those documents; a database in which a plurality of pieces of secret related information are registered so that each piece of secret related information is shared among a plurality of users; search means for, when a request to search for a related document related to a document of interest is issued by a requesting user, searching the database using secret identification data for the document of interest and thereby identifying secret identification data for the related document as a search result; and providing means for providing the requesting user with information on the related document identified by the secret identification data of the search result, in accordance with the requesting user's access authority to that related document.

According to the present invention, the use of related information can be promoted in a document search system while ensuring the security of documents.

(1) Outline of Embodiment
First, an outline of a preferred embodiment of the present invention will be described. The document search system according to this embodiment includes generation means for generating secret related information, a database in which a plurality of pieces of secret related information are registered, search means for identifying secret identification data for a related document by searching the database, and providing means for providing information on the related document corresponding to the identified secret identification data in accordance with the access authority of the requesting user.

With the above configuration, the generation means generates, for each related document group, a plurality of secret identification data on the basis of the plurality of documents, and the secret related information is composed of them. Each piece of secret related information is data that identifies the underlying documents while keeping them secret. The generated secret identification data is registered in a database that can be used by a plurality of users. When a related document related to a document of interest needs to be searched for, the search means searches the database using the secret identification data for the document of interest. Preferably, secret identification data matching that of the document of interest is first found in the database, and the secret identification data (of the related document) associated with it is then identified. A related document cannot be identified directly from the secret identification data returned as the search result, but it can easily be identified by, for example, comparing that secret identification data against the secret identification data of each document to which the requesting user has access authority. For this purpose, secret identification data may be created and stored in advance for each document under each user's control and used for the comparison as soon as it is needed, or it may be generated from the underlying document at the point when it is needed. When the user who issued the search request (the requesting user) has access authority to the related document identified from the search result (for example, when the related document is one that the user manages or holds), the providing means provides that user with information on the related document, and the use of the related document is promoted through that information. When the user does not have access authority, on the other hand, it is desirable to take measures that maintain the confidentiality of the document completely (or, in some cases, partially or conditionally).
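The flow described above (generation, registration, search, and comparison against the requesting user's own documents) can be sketched as follows. This is a minimal sketch under stated assumptions: SHA-256 stands in for the unspecified conversion, one hash is used per document, and `SecretRelationDB`, `secret_id`, and the file names are hypothetical.

```python
import hashlib

def secret_id(document_bytes):
    # SHA-256 is an assumed choice; the source only calls for secret identification data.
    return hashlib.sha256(document_bytes).hexdigest()

class SecretRelationDB:
    """Hypothetical shared database holding only hashes, never documents."""

    def __init__(self):
        self._groups = []  # each entry: frozenset of secret IDs for one related group

    def register(self, documents):
        self._groups.append(frozenset(secret_id(d) for d in documents))

    def search(self, key):
        # Step 1: find groups containing the key; step 2: return the other IDs.
        hits = set()
        for group in self._groups:
            if key in group:
                hits |= group - {key}
        return hits

# User Y registers a related group from a local workspace.
db = SecretRelationDB()
report, memo = b"quarterly report", b"meeting memo"
db.register([report, memo])

# User X holds copies of both documents and searches from the report.
local_docs = {"report.txt": report, "memo.txt": memo, "other.txt": b"unrelated"}
local_ids = {secret_id(body): name for name, body in local_docs.items()}
result = db.search(secret_id(report))
found = sorted(local_ids[h] for h in result if h in local_ids)
print(found)  # ['memo.txt']
```

The database never sees a file name or document body; the requesting side resolves a returned hash only if it already holds a document that produces the same hash.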

As described above, according to the document search system of this embodiment, the related information held or managed by each user is shared as special related information in a format that can conceal the documents, so that information assets that have so far remained buried can be shared among users, improving convenience and work efficiency for each user. Moreover, the confidentiality of each document itself can be ensured.

Incidentally, the relationship among a plurality of documents can be determined using any of various methods that have been proposed or will be proposed in the future. When each piece of secret related information is registered in the database, it is desirable to register it in association with a user. With this configuration, filtering according to access authority can easily be performed when search results are provided, while the whole body of secret related information collected from a plurality of users is used as the search range. Such a configuration also makes it easy to narrow the search range on a per-user basis. Here, "user" is a concept that includes a user group. However, the database need not necessarily associate a user with each piece of secret related information. A device on which document work is performed (for example, a client device) and a user may be associated one-to-one, one-to-many, or many-to-one. Attributes other than the user may additionally be associated with each piece of secret related information. Various methods can be used to determine access authority; for example, the determination may simply be whether the document is one held by the user, or access authority may be determined by referring to a table in which access authority is defined. When access authority is confirmed, it is desirable to provide, for example, detailed information on the related document as the information on that document. Displaying the name of the related document lets the user intuitively recognize its category and contents, and displaying its location lets the user access it quickly. Of course, other information may be displayed instead of or together with these; for example, information indicating the degree or type of match in the search may be displayed. Various search conditions may also be attached as a premise of the search.
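The two ways of determining access authority mentioned above (checking whether the user holds the document, or consulting a table) might be sketched as follows. The function names, the table layout, and the placeholder IDs `h1` to `h3` are assumptions made for illustration.

```python
def holds_document(user_local_ids, secret_id):
    # Simplest determination: the user already holds a document with this ID.
    return secret_id in user_local_ids

def allowed_by_table(access_table, user, secret_id):
    # Alternative: consult a table that defines access authority per ID.
    return user in access_table.get(secret_id, set())

def filter_results(results, user, user_local_ids, access_table):
    # Keep only the search results the user is entitled to learn about.
    return [
        sid for sid in results
        if holds_document(user_local_ids, sid)
        or allowed_by_table(access_table, user, sid)
    ]

results = ["h1", "h2", "h3"]
access_table = {"h2": {"X", "Y"}, "h3": {"Z"}}
print(filter_results(results, "X", {"h1"}, access_table))  # ['h1', 'h2']
```

Either check alone would satisfy the text; combining them here simply shows that they can coexist.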

Each of the above means is preferably realized substantially as a software function. Software for such a document search system is provided as a document search program and, specifically, is installed in an information processing system via a storage medium or a network. When the document search system is not a single device but consists of a server (search device) and a plurality of clients (document work devices), the generation means may reside in each client or in the server. The database is basically provided on the server side, but any of the document work devices may substantially function as the server. The search means may reside in the server or in each client. The same applies to the providing means.

Preferably, the above configuration is provided in a system in which a workspace (document placement space) is managed for each user (or for each client). In that case, the secret related information generated for each user's local workspace is shared. That is, each document continues to be managed within the framework of an individual workspace as before, while related information can be shared among a plurality of users beyond that framework. Such workspaces may be managed by the individual clients or collectively by the server. When a plurality of users are performing document work in parallel, work efficiency improves markedly if each user can identify related documents quickly, easily, and broadly. Individual documents may be stored in each client, in the server, or on another device. The above configuration is applicable in any of these cases.

Preferably, each secret identification data is composed of at least one code generated by applying code conversion processing to the underlying document, and the code is irreversible in that neither the location nor the contents of the document can be identified from it. The code conversion processing is performed on the underlying document and may correspond to digest creation, encryption processing, or the like. Conversion processing using a known hash function is particularly preferable. If hash values are used as the secret related information, matches between documents can be determined accurately while the security of the underlying documents is fully guaranteed. Moreover, the small amount of information is advantageous for building the database. Furthermore, if a hash value of the entity part of a document is used, a match between entities can be determined even if the file name has been changed or an annotation has been added. Matches between documents can be determined at various levels or in various categories, as described below.
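The point that a hash of the entity part survives a file rename or an added annotation can be illustrated with the following sketch. The `Document` class and the use of SHA-256 are assumptions; the source requires only a known hash function.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class Document:
    name: str
    body: bytes                                      # entity part (content)
    annotations: list = field(default_factory=list)  # attribute part

def entity_hash(doc):
    # Hash only the entity part, so name and annotations do not affect it.
    return hashlib.sha256(doc.body).hexdigest()

original = Document("spec_v1.txt", b"system specification text")
renamed = Document("final_spec.txt", b"system specification text", ["reviewed"])
different = Document("spec_v1.txt", b"another text")

print(entity_hash(original) == entity_hash(renamed))    # True
print(entity_hash(original) == entity_hash(different))  # False
```

The digest has a fixed length (64 hexadecimal characters for SHA-256) regardless of document size, which is the compactness the text refers to.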

Preferably, the generation means has a function of generating a first code that identifies both the entity part and the attribute part of the underlying document, and a function of generating a second code that identifies one of the entity part and the attribute part of the underlying document. In this configuration, each secret identification data constituting the secret related information includes the first code and the second code. With this configuration, search processing can be performed while distinguishing at least between the entity part, which corresponds to the content, and the attribute part attached to it, such as annotations and context, matching on both or on either one. If the document is an electronic document consisting of text or the like, the match determination may be performed page by page. Such a configuration also makes it possible to evaluate similarity from the number or proportion of pages found to match. In any case, a configuration that handles both overall matches and partial matches can meet diverse needs in searching for related documents and improves the convenience of the system.
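One possible reading of the first and second codes, together with page-level similarity, is sketched below. The dictionary keys, the match labels, the null-byte separator, and the similarity definition (fraction of matching pages) are all assumptions chosen for illustration, as is SHA-256.

```python
import hashlib

def h(data):
    return hashlib.sha256(data).hexdigest()

def make_codes(entity, attributes, pages):
    return {
        "first": h(entity + b"\x00" + attributes),  # entity and attributes together
        "entity": h(entity),                         # second code: entity only
        "attributes": h(attributes),                 # second code: attributes only
        "pages": [h(p) for p in pages],              # optional page-level codes
    }

def match_type(a, b):
    if a["first"] == b["first"]:
        return "full"
    if a["entity"] == b["entity"]:
        return "entity-only"
    if a["attributes"] == b["attributes"]:
        return "attributes-only"
    return "none"

def page_similarity(a, b):
    # Fraction of a's pages whose hash also appears in b.
    if not a["pages"]:
        return 0.0
    other = set(b["pages"])
    return sum(p in other for p in a["pages"]) / len(a["pages"])

pages_a = [b"page one", b"page two", b"page three", b"page four"]
pages_b = [b"page one", b"page two", b"page three", b"revised page"]
doc_a = make_codes(b"".join(pages_a), b"note: draft", pages_a)
doc_a2 = make_codes(b"".join(pages_a), b"note: final", pages_a)
doc_b = make_codes(b"".join(pages_b), b"note: draft", pages_b)

print(match_type(doc_a, doc_a2))      # entity-only
print(page_similarity(doc_a, doc_b))  # 0.75
```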

Preferably, the providing means provides the requesting user with detailed information on the related document when the requesting user has access authority to the related document. Examples of the detailed information include the name and location (URL) of the document. Other information such as the creation date and time and the owner may also be displayed. The display contents and display form may be made customizable. The display priority of search results may be varied according to the degree of match or similarity determined during the search. For example, results with a higher degree of match or similarity may be displayed higher in the list, and results whose degree of match or similarity is at or below a certain level may be omitted. It is desirable that the user be able to change the display conditions as appropriate.
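The display ordering described above could be realized along the following lines. The function name `rank_results`, the score field, the default threshold, and the example URLs are hypothetical.

```python
def rank_results(results, min_score=0.5):
    """Sort by match score, highest first, dropping results at or below min_score."""
    kept = [r for r in results if r["score"] > min_score]
    return sorted(kept, key=lambda r: r["score"], reverse=True)

results = [
    {"name": "memo.txt", "url": "file:///docs/memo.txt", "score": 0.75},
    {"name": "draft.txt", "url": "file:///docs/draft.txt", "score": 0.40},
    {"name": "report.txt", "url": "file:///docs/report.txt", "score": 1.00},
]
for r in rank_results(results):
    print(r["name"], r["url"], r["score"])
```

Exposing `min_score` as a parameter corresponds to letting the user change the display conditions.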

Preferably, the providing means provides the requesting user with no information whatsoever on the related document when the requesting user does not have access authority to the related document. With this configuration, not even the secret related information is displayed, so the related document can be kept completely secret. Alternatively, only the secret related information may be provided to the user, or only the fact that a related document exists may be provided to the user.

Preferably, when the requesting user does not have access authority to the related document, the providing means provides the requesting user with information used for making an inquiry to the user who has management authority over the related document. This configuration encourages inquiries to the user who manages or holds the related document and thereby increases opportunities to use it. It also allows a safe and highly convenient system to be built by combining automatic processing with human judgment. The information used for the inquiry may simply let the requesting user choose whether to make an inquiry while the administrator remains undisclosed, or it may show the administrator's name, initials, or the like so that the requesting user can decide whether and how to inquire.
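The alternatives for a requesting user without access authority (provide nothing, reveal only that a related document exists, or offer an inquiry path) might be modeled as follows. The `Policy` enumeration, `build_response`, and the response fields are hypothetical.

```python
from enum import Enum

class Policy(Enum):
    NOTHING = "nothing"
    EXISTENCE_ONLY = "existence"
    INQUIRY = "inquiry"

def build_response(hit, has_access, policy=Policy.NOTHING, reveal_admin=False):
    if has_access:
        return {"name": hit["name"], "url": hit["url"]}
    if policy is Policy.NOTHING:
        return None
    if policy is Policy.EXISTENCE_ONLY:
        return {"exists": True}
    # Policy.INQUIRY: offer a way to ask the administrator, optionally naming them.
    response = {"exists": True, "can_inquire": True}
    if reveal_admin:
        response["admin"] = hit["admin"]
    return response

hit = {"name": "plan.txt", "url": "file:///y/plan.txt", "admin": "Y"}
print(build_response(hit, has_access=True))
print(build_response(hit, has_access=False))
print(build_response(hit, has_access=False, policy=Policy.INQUIRY))
```

Defaulting to `Policy.NOTHING` reflects the most conservative option in the text; the other options trade some secrecy for more opportunities to use the document.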

In this specification, a document is usually a unit of electronic data processed on an information processing apparatus. The concept includes text, still images, moving images, audio information, and others. In many cases, as described above, a document is composed of a document entity such as text and document attributes such as annotations and management information, which together constitute one file. However, a file may also be constituted for each individual page making up the document entity.

(2) Detailed Description of Embodiment
Next, the embodiment will be described in detail with reference to the drawings.

FIG. 1 is a block diagram of the document search system of this embodiment. In the example shown in FIG. 1, the document search system is composed of a server 10 serving as a document search device and a plurality of clients 12, 14, and 16, serving as document work devices or document processing devices, connected to the server 10 via a network. In FIG. 1, three clients 12, 14, and 16 are connected to the server 10, but the number of clients is arbitrary. In the example shown in FIG. 1, the client 12 is used by user X, the client 14 by user Y, and the client 16 by user Z. Of course, one client may be used by a plurality of users, or one user may use a plurality of clients. A user may also be a concept that includes a user group. In general, individual documents are managed on a per-user basis, and access by other users to documents held or managed by a given user is restricted. Documents shared by a plurality of users may nevertheless exist. In any case, in the illustrated example, access to individual documents is restricted on a per-user (per-user-group) basis.

The clients 12, 14, and 16 will now be described. As described above, these clients are devices for processing documents, and a workspace (document placement space), described later, is managed for each client or, more specifically, for each user. A workspace is a virtual management space for handling or working on documents. In the example shown in FIG. 1, the clients 12, 14, and 16 have the same configuration, so the configuration of the client 12 is described below as representative.

The client 12 has a plurality of software functions, each of which is shown as a block in FIG. 1. Specifically, the document management unit 18 is a module that manages the workspace; the workspace will be described later with reference to FIG. 2 and other figures. The entity of each document may reside anywhere in the system: it may be held in a client, or it may exist on the server side or on another device. The related information generation unit 20 is a module that generates, for each related document group, related information indicating that the plurality of documents constituting the group are related. Related information can be generated using various methods. On the workspace described later, for example, a related document group is defined on the basis of the distance between documents. In that case, for example, a plurality of documents located within a certain distance of one another are regarded as mutually related. Alternatively, the degree of match or similarity among a plurality of documents may be determined on the basis of keywords contained in each document or a digest of each document, and a related document group may be defined on the basis of the result. In this embodiment, as described below, the related information itself is not shared; secret related information is shared in its place. Related information and secret related information are alike in that both identify a plurality of mutually related documents, but secret related information differs from ordinary related information in that, as its name indicates, it is related information in a special format intended to keep the documents themselves secret.
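The distance-based definition of a related document group might be sketched as follows. The two-dimensional coordinates, the Euclidean metric, and the chaining rule (documents linked through a sequence of near neighbours fall into one group) are assumptions; the source states only that documents within a certain distance are regarded as related.

```python
import math

def related_groups(positions, max_distance):
    """Group documents whose workspace positions are chained within max_distance."""
    names = list(positions)
    unvisited = set(names)
    groups = []
    for name in names:
        if name not in unvisited:
            continue
        unvisited.discard(name)
        group, stack = [name], [name]
        while stack:
            current = stack.pop()
            near = [n for n in unvisited
                    if math.dist(positions[current], positions[n]) <= max_distance]
            for n in near:
                unvisited.discard(n)
                group.append(n)
                stack.append(n)
        if len(group) > 1:  # a lone document forms no related group
            groups.append(sorted(group))
    return groups

positions = {"A": (0, 0), "B": (1, 0), "C": (10, 10), "D": (10, 11), "E": (50, 50)}
print(related_groups(positions, max_distance=2.0))  # [['A', 'B'], ['C', 'D']]
```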

The conversion processing unit 22 generates such secret related information. In this embodiment, the conversion processing unit 22 is a module that obtains hash values by applying conversion processing based on a hash function to the underlying document. In this embodiment, multifaceted conversion is performed on a single document; specifically, a plurality of hash values are obtained, such as a hash value for the entire document, a hash value for the body of the document (its content), and a hash value for the attributes (context) of the document. A secret identification value set (secret identification data set), described later, is constituted with those hash values as its elements. The conversion processing unit 22 performs the conversion processing with reference to the related information generated by the related information generation unit 20, but the hash values are generated from the documents themselves. It is of course also conceivable to apply the conversion to the related information.

The registration processing unit 26 is a module that executes processing for registering, in the server 10, secret related information in the form of a secret identification value set consisting of a plurality of secret identification values generated from a plurality of mutually related documents. Specifically, each piece of secret related information generated in each client is registered, as indicated by reference numeral 100, in a secret related information database (DB) 32 described later. The timing of generating the related information and the secret related information, and the timing of the registration processing, can be set arbitrarily; a change in the arrangement on the workspace may be detected automatically and used as a trigger to execute the sequence of generation, conversion, and registration processing automatically. In general, it is desirable that secret related information be generated automatically whenever new related information is generated and then registered in the secret related information DB.
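The trigger-driven sequence of conversion and registration could look like the following sketch. The class `RegistrationPipeline`, its callback interface, and the rule of skipping groups that were already registered are assumptions; a plain list stands in for the secret related information DB 32.

```python
import hashlib

class RegistrationPipeline:
    """Hypothetical client-side pipeline: detect new groups, hash, register."""

    def __init__(self, register):
        self._register = register  # callable that sends one entry to the server
        self._seen = set()

    def on_workspace_changed(self, groups):
        # groups: iterable of tuples of document bodies (bytes) that are related.
        registered = 0
        for group in groups:
            entry = frozenset(hashlib.sha256(body).hexdigest() for body in group)
            if entry not in self._seen:  # only newly generated related information
                self._seen.add(entry)
                self._register(entry)
                registered += 1
        return registered

server_db = []
pipeline = RegistrationPipeline(server_db.append)
first_pass = pipeline.on_workspace_changed([(b"a", b"b")])
second_pass = pipeline.on_workspace_changed([(b"a", b"b"), (b"c", b"d")])
print(first_pass, second_pass, len(server_db))  # 1 1 2
```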

The request issuing unit 28 is a module that issues a request to search for related documents across the secret related information registered on the server 10 for the whole system, that is, across a plurality of users. Search conditions, such as a search range and a match determination condition, can be attached to the request; these are described later. The search request issued by the request issuing unit 28 is indicated by reference numeral 102. It is received and processed by the search processing unit 34 in the server 10.

The result processing unit 30 is a module that provides the user with a list of related documents based on the search result information output from the search processing unit 34. The list contains information identifying one or more documents related to the document of interest on which the search was based. As described later, however, in this embodiment only related documents to which the user has access authority (specifically, documents the user holds or manages) are reflected in the list, in order to preserve the confidentiality of each document. For related documents managed by other users, it is desirable to take measures that encourage their use; this is also described later. The flow of information from the search processing unit 34 to the result processing unit 30 is indicated by reference numeral 104. When the search result must be filtered to preserve the confidentiality of related documents, the filtering is executed either in the search processing unit 34 or in the result processing unit 30.

The related information generated by the related information generation unit 20 and the secret related information generated by the conversion processing unit 22 are registered in the management table 24. Specifically, in this embodiment, a secret identification value set consisting of the one or more hash values obtained for each document is registered in the management table 24. With this configuration, when a hash value is returned as a search result, the related document from which that hash value was generated can be identified quickly. Alternatively, the hash values need not be stored once generated; they may be recalculated when needed and then compared. The management table 24 thus manages information on each document, including the file name and location (URL) of each document. The information held in the management table 24 may be managed on the server 10. From a security standpoint, however, the information that associates each document with its hash values is preferably managed locally in each client. If such information is managed on the server 10, access to it should be strictly restricted.
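The local lookup from a returned hash value to the document it was generated from might look like the following sketch. The class and field names (`ManagementTable`, `DocumentRecord`, `register`, `resolve`) are hypothetical; the description only states that file names, locations, and hash values are held per document.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DocumentRecord:
    """Per-document information kept locally on the client (hypothetical layout)."""
    file_name: str
    location: str  # for example a URL or a local path


class ManagementTable:
    """Client-side table mapping each hash value back to its source document."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, Tuple[DocumentRecord, str]] = {}

    def register(self, record: DocumentRecord, id_set: Dict[str, str]) -> None:
        # Index every hash in the identification set, remembering its category.
        for category, value in id_set.items():
            self._by_hash[value] = (record, category)

    def resolve(self, hash_value: str) -> Optional[Tuple[DocumentRecord, str]]:
        # Returns None when the hash does not belong to a locally held document.
        return self._by_hash.get(hash_value)
```

Keeping this index on the client, as the text recommends, means the server never needs to know which file a given hash value stands for.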

As described above, the client 12 includes the document management unit 18, the related information generation unit 20, the conversion processing unit 22, and so on. Alternatively, the client 12 may be configured as, for example, a thin client, with its intelligence functions implemented substantially on the server 10. As a further alternative, only the functions involved in searching for related documents may be installed on the server 10.

As described above, the server 10 includes the secret related information DB 32 and the search processing unit 34. Its other functions are not shown. The secret related information sent from each client is registered in the secret related information DB 32. The individual pieces of secret related information may or may not be managed per user. FIG. 4, described later, illustrates the internal structure of the secret related information DB 32. When a client, that is, a user, issues a search request, the search processing unit 34 uses the secret identification value set of the document of interest contained in the request (that is, one or more hash values) as the search key and searches the secret related information DB 32, thereby identifying the secret identification value sets of the documents related to the document of interest. Usually, a plurality of secret identification value sets corresponding to a plurality of related documents are identified as the search result. If the secret identification value set of the document of interest itself is among them, it may be excluded. The specific processing of the search processing unit 34 is described in detail with reference to FIG. 4 and the subsequent figures.

FIG. 2 shows an example of a workspace displayed on the client's display screen. The illustrated workspace 36 is managed by user X. The workspace 36 corresponds to a working space for documents, and objects serving as symbols of the individual documents are displayed on it. Each object is a thumbnail image (a low-resolution reduced image), an icon, or the like. Strictly speaking, each object displayed on the workspace 36 should be distinguished from the document it represents, but for convenience the following description does not distinguish them.

Document A, document B, document C, and document D exist on the workspace 36 shown in FIG. 2 and are regarded as mutually related documents on the basis of an inter-document distance criterion. In other words, they form a group of mutually related documents, as indicated by reference numeral 38. A related document group may instead be defined by a criterion other than inter-document distance, such as the degree of match or similarity between keywords or digests, and several criteria may be combined. Once related document groups are defined in this way, related information is generated for each group.

FIG. 3 shows the relationship between related information and secret related information. The related information 40 shown in FIG. 3 is described first. In the illustrated example, on the premise that the four documents A to D are related, the related information 40 is a combination of a plurality of identifiers 40a, specifically the identifiers of document A, document B, document C, and document D. Each identifier may be, for example, a URL. Conventionally, if such related information 40 itself is registered in a search database, other users can determine the location of each document, which causes problems for document security and confidentiality. In this embodiment, therefore, the secret related information 41 is registered in the database instead, as described above.

As illustrated, the secret related information 41 corresponds to the related information 40 and consists of four secret identification value sets 42, one for each of the four documents. Each secret identification value set 42 is secret identification data made up of a plurality of hash values. Specifically, it is a combination of three hash values: a hash value 42a of the entire underlying document, a hash value 42b of the content portion of the document, and a hash value 42c of the context (attributes) of the document. A document such as document A may, of course, be identified by just one of these hash values. Preparing hash values from several viewpoints, however, has the advantage that the search can be tailored to the need at hand. Note that no context hash value is generated in the secret identification value sets for document B and document D, because no context information was attached to those documents. Any combination of hash values can thus be applied to each document.

A hash value is a code or symbol string obtained by executing a hash function. It is irreversible: the file name and location of the original document cannot be determined from it. This embodiment uses that property to protect document security. Sharing the original related information itself among users is a security problem, but sharing the secret related information 41, which consists of hash values, allows information on the relationships between documents, that is, an information asset, to be shared while the confidentiality of the individual documents is maintained.

Each client generates a plurality of pieces of related information, and secret related information is generated for each of them and registered in the database. FIG. 3 shows this flow as a conceptual diagram. Other conversion processes may be used as long as they have an effect equivalent to hash conversion; for example, various encryption processes may be used. Hash values, however, fully preserve the confidentiality of the documents, allow exact matching in searches, and are small, which makes them easy to handle. Preparing several kinds of hash values through multifaceted or multistage conversion, as in this embodiment, has the further advantage that documents with the same substance can be identified as related documents even when their file names differ.

FIG. 4 is a conceptual diagram of the structure of the secret related information database shown in FIG. 1. FIG. 4 also illustrates how search results are handled.

In this embodiment, the secret related information DB 32 holds a registration table for each user: 44X, 44Y, and 44Z. In other words, the information is managed per user. Alternatively, the secret related information may be managed centrally across users without such attribution. In the table 44X generated for user X, a plurality of pieces of secret related information 41 are registered. In the example of FIG. 4, an identification number 41a is attached to each piece of secret related information 41. As described above, each piece of secret related information 41 consists of a plurality of mutually related secret identification value sets, and each set consists of one or more hash values.

An example of the search process is as follows. When a client issues a search request, the hash value contained in the request is recognized. This is a hash value of the document of interest. A search request usually contains several hash values of the document of interest, but here it is assumed to contain one. The search processing unit in the server searches the secret related information DB using that hash value as the search key. A match with a particular hash value is then found, as indicated by reference numeral 46, for example. Matches with hash values in other pieces of secret related information may also be found. Such a match amounts to identifying the documents related to the document of interest. At this stage, however, the related documents themselves are not identified directly. What is identified is the secret identification data associated with each related document (that is, one or more hash values serving as secret identifiers). In the example of FIG. 4, three hash value sets corresponding to three related documents (see reference numeral 41) are identified and become the search result or search result candidates.
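The lookup just described can be sketched as a scan over per-user registration tables. The in-memory layout below (a dict from user to a list of relations, each relation being a list of category-to-hash dicts) and the name `search_related` are assumptions for illustration; the description does not specify how the database is organized internally.

```python
from typing import Dict, List, Tuple

IdSet = Dict[str, str]   # category -> hash value
Relation = List[IdSet]   # one piece of secret related information


def search_related(db: Dict[str, List[Relation]], key: str) -> List[Tuple[str, IdSet]]:
    """Return (registering user, identification set) for every related document."""
    hits: List[Tuple[str, IdSet]] = []
    for user, relations in db.items():
        for relation in relations:
            # A relation is relevant only if one of its sets contains the key.
            if not any(key in id_set.values() for id_set in relation):
                continue
            for id_set in relation:
                # Skip the set of the document of interest itself.
                if key not in id_set.values():
                    hits.append((user, id_set))
    return hits


# Hypothetical data: user X registered one relation covering documents A to D.
db = {
    "X": [[
        {"whole": "a1", "content": "a2", "context": "a3"},
        {"whole": "b1", "content": "b2"},
        {"whole": "c1", "content": "c2", "context": "c3"},
        {"whole": "d1", "content": "d2"},
    ]],
    "Z": [[{"whole": "e1"}, {"whole": "f1"}]],
}
hits = search_related(db, "a2")
```

Here `hits` holds the three sets for documents B, C, and D. Only hash values are returned, so the requester still has to resolve them to actual documents locally.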

In the illustrated example, the information 41 identifying the related documents is handled by either the process indicated by reference numeral 106A or the process indicated by reference numeral 106B; one of the two is selected. In the process 106A, as indicated by reference numeral 50, the information 41 is first filtered on the server to produce the list that forms the search result. This filtering narrows the result down to the related documents to which the user who issued the search request has access authority. In other words, the filtering acts as a mask that prevents documents held or managed by other users from being identified directly as related documents. Because the database 32 in this embodiment manages each piece of secret related information per user, that management scheme can be used for the filtering: a hash value or hash value set in the information 41 is included in the search result only if it is associated with the requesting user. In the step indicated by reference numeral 52, the list generated on the server is provided to the requesting client and displayed there.

In the process indicated by reference numeral 106B, by contrast, the information extracted on the server is provided to the requesting client as the search result without modification, as indicated by reference numeral 54. The client then filters the information received from the server, as indicated by reference numeral 56. This produces the list that can be presented to the user, which is displayed on the screen. In the process 106B, all of the hash values are therefore delivered to the user side. As described above, however, a document cannot be identified from a hash value alone, so document confidentiality is preserved under the process 106B as well. When even the hash values should not be handed over, the process 106A is preferable. The process 106A is described in detail later with reference to FIG. 8.
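Client-side filtering as in the process 106B could be sketched as follows. The function name `filter_on_client` and the representation of locally held hashes as a plain set are hypothetical; the idea shown is only that the client keeps the received identification sets it can match against its own documents.

```python
from typing import Dict, Iterable, List, Set


def filter_on_client(received: Iterable[Dict[str, str]],
                     local_hashes: Set[str]) -> List[Dict[str, str]]:
    """Keep the received identification sets that match a locally held document.

    received:     identification sets (category -> hash) sent by the server
    local_hashes: every hash value the client holds for its own documents
    """
    visible = []
    for id_set in received:
        # One matching hash in any category is enough to keep the set.
        if any(value in local_hashes for value in id_set.values()):
            visible.append(id_set)
    return visible
```

Sets that match nothing locally are simply dropped in this sketch; they remain opaque hash values, so nothing about the other users' documents is revealed on the client.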

Next, a specific example of a related-document search is described. The preconditions are as follows.
[DB contents]
・Secret related information from user X (FIG. 3) has been registered
[Situation on user Y's side]
・Document A: present on user Y's workspace
・Document B: present in a separate space far from user Y's workspace
(the relationship between document A and document B is therefore not recognized)
・Document B': document B with an annotation added;
present on user Y's workspace, slightly away from document A
・Document C: present in a separate space close to user Y's workspace
・Document D: not present
Under these preconditions, when user Y issues a search request for documents related to the document of interest A, the search results shown in FIG. 5 or FIG. 6 are displayed as a list.

In the example shown in FIG. 5, document B' and document C are displayed as the search result, that is, as documents related to document A. For each related document, the document name 58, the document location 60, and the match type 62 are displayed, so the details of each related document can be grasped. Because the match type is displayed as well, the user can see at once under what condition the match was determined and can take that into account when using the related document. The match type 62 indicates at which level (category) the hash value used as the search key matched. In the example of FIG. 4, as indicated by reference numeral 46, a hash value match at the content level was found for document B'. When a hash value match is found for the entire file, "whole match" is displayed, as shown for document C in FIG. 5. When a hash value match is found at the content level, "content match" is displayed. With a content match, a related document can be identified from the match between the document bodies regardless of whether annotations have been added or what they contain, which makes searching for related documents more convenient.
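How a match type might be derived from two identification sets is sketched below. The ordering (whole first, then content, then context) and the label "context match" are assumptions; the description names only the whole-match and content-match labels.

```python
from typing import Dict, Optional

# Categories checked from strongest to weakest (the ordering is an assumption).
MATCH_LABELS = (
    ("whole", "whole match"),
    ("content", "content match"),
    ("context", "context match"),
)


def match_type(local_set: Dict[str, str], hit_set: Dict[str, str]) -> Optional[str]:
    """Return the label of the strongest category whose hash values agree."""
    for category, label in MATCH_LABELS:
        # A category counts only if both sets carry it and the values are equal.
        if category in local_set and local_set[category] == hit_set.get(category):
            return label
    return None
```

With this rule, an annotated copy such as document B' differs from document B in its whole-file hash but still reports a content match, while an identical copy such as document C reports a whole match.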

The list shown in FIG. 6 has basically the same structure as the list shown in FIG. 5, but for each document it additionally carries information 46 on the user who holds that document. This information is preferably provided by the server. That is, if the server manages user information in association with each secret identification value set, information on the user holding each document can be provided when the list is displayed as the search result.

The list shown in FIG. 6 also includes an item corresponding to a document managed and held by a user other than user Y, who issued the search request (see reference numeral 66). For that item, the existence of the document can be recognized, but the document name, location, and match type are concealed. Only the information on the user who holds it is provided, as indicated by reference numeral 64a. With this configuration, although the document itself is unknown, the requesting user can contact the user who holds it and obtain the document from that holder. If the holder declines, the confidentiality of the document is maintained. Combining automatic processing with user judgment in this way satisfies the two conflicting needs of confidentiality and information availability. The inquiry to the holding user should include the secret identification data (that is, one or more hash values) of the related document in question, so that the user who receives the inquiry can identify the related document reliably. To make such inquiries easier, clicking the item 66 may automatically issue a predetermined inquiry to the holding user. The inquiry may be sent directly from the inquiring user to the user who holds the document, or it may be relayed through the server. Instead of the information shown in the item 66, the holding user's name may also be concealed and only an inquiry button displayed.

To make related documents easier to identify when the search result is displayed, a presentation such as that shown in FIG. 7 may be used. FIG. 7 shows a workspace 68 managed by user Y, in which the preconditions described above hold. That is, document B' 72, a related document, is located slightly away from the document of interest A 70. The relationship between document A and document B' is not recognized on user Y's device, but the related information generated on user X's workspace shown in FIG. 2, specifically user X's secret related information, allows user Y to determine that document B' is related to document A 70. In that case, a pop-up or balloon may indicate that a particular document is a related document, as shown by reference numeral 74 in FIG. 7. The balloon 74 identifies document A as the document of interest and also contains the match type information described above. In addition, two buttons 76 for paging through the related documents are displayed. By clicking these buttons, the user can step through the related documents in list order, and the balloon moves to each document as it is selected. Viewing this sequence, the user can pick out the needed related document easily and quickly, which is very convenient.

In the embodiment described above, searches can be performed at three levels or categories: the entire file, the content, and the context. In addition to or instead of these, search management may be performed per page, that is, each page is given its own hash value. The degree of relevance may then be evaluated according to the number or proportion of pages whose hash values match.
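Page-level relevance could be scored as in the sketch below. The description says only that relevance may be evaluated from the number or amount of matching pages; the specific score used here (matching pages divided by the page count of the shorter document), SHA-256, and the function names are assumptions.

```python
import hashlib
from typing import List, Sequence


def page_hashes(pages: Sequence[bytes]) -> List[str]:
    """Hash each page separately so pages can be compared without their text."""
    return [hashlib.sha256(page).hexdigest() for page in pages]


def relevance(hashes_a: Sequence[str], hashes_b: Sequence[str]) -> float:
    """Fraction of distinct page hashes shared, relative to the shorter document."""
    a, b = set(hashes_a), set(hashes_b)
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))
```

For example, a two-page document that shares one page with a three-page document scores 0.5 under this measure. Only the per-page hash lists need to be exchanged to compute it.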

Next, an example of the operation of the configuration shown in FIG. 1 is described with reference to FIG. 8, which shows the search process as a flowchart. In S101, a document of interest is designated on a client, and in S102 the user specifies search conditions on that client. The search conditions may include a category and a search range. In S103, the client (the requesting client) issues a search request. In this embodiment, the search request contains one or more hash values identifying the document of interest.

S104 corresponds to the process indicated by reference numeral 106A in FIG. 4. The process indicated by reference numeral 106B in FIG. 4, or another process, may of course be executed instead.

Specifically, in S105 the server starts the search processing corresponding to the search request. The pieces of secret related information are examined in order, and when an identical hash value is found in S106, processing proceeds to S107 and the subsequent steps.

In S107, one hash value set corresponding to a document other than the document of interest (a related document) is taken from the secret related information containing the matched hash value. In S108, it is determined whether the requester has access authority. For example, access authority is found to exist when the requesting user is among the one or more users associated in the database with the extracted hash value set. If access authority is confirmed, processing moves from S108 to S109, where the hash value set is reflected in the list that forms the search result. Various methods can be used to determine access authority.

If it is determined in S108 that the requesting user does not have access authority, the secrecy mode is checked in S110. When the mode is fully private, the hash value set extracted above is discarded in S111 without being reflected in the list. When S110 finds that conditional disclosure is set, the hash value set extracted in S107 is reflected in the list in S112 under certain conditions, that is, with disclosure restricted. For example, the related document is represented as an item containing several blank fields, as shown in FIG. 6.

In S113, it is determined whether the currently recognized related document group still contains other related documents (more precisely, other hash value sets) that have not yet been examined. If any remain, the steps from S107 onward are repeated. If none remain, the process proceeds from S113 to S114, where it is determined whether the search should continue. If it should (that is, if part of the database has not yet been searched), the steps from S106 onward are repeated. When the search has reached the end of the database or the end of the search range, the process proceeds from S114 to S115, where the list obtained after the user-based filtering described above is provided to the requesting user, that is, to the requesting client.

In S116, the list is displayed on the screen of the requesting client. At that time, the requesting client refers to its management table, identifies the document corresponding to each hash value (or each hash value set) sent as the search result, and displays it as a related document. Specifically, information such as the file name and location of the related document is displayed. When a hash value managed by another user is received, processing for prompting an inquiry is executed, as already described. If the management table does not store hash values for the documents the client itself holds, the hash values may be recalculated before the matching is performed.
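The loop over S106 to S115 can be sketched as follows. This is a minimal sketch under stated assumptions rather than the flow of FIG. 8 itself: the database is modeled as a list of groups, each group as a list of plain dictionaries, and the keys `hashes`, `owner`, `allowed`, and `mode`, as well as the hash kind `full`, are hypothetical names. The sketch also assumes that the entry of the document of interest can be recognized by its `full` hash value.

```python
def search_related(database, query, requester):
    """Scan every related document group and build the filtered result list.

    database : list of groups; each group is a list of entry dicts
    query    : dict mapping a hash kind (e.g. "full", "content") to a value
    requester: identifier of the requesting user
    """
    results = []
    for group in database:
        # S106: does any entry in this group share a hash value with the query?
        hit = any(
            entry["hashes"].get(kind) == value
            for entry in group
            for kind, value in query.items()
        )
        if not hit:
            continue
        # S107 / S113: visit every other hash value set in the matched group.
        for entry in group:
            if entry["hashes"].get("full") == query.get("full"):
                continue  # assumed to be the document of interest itself
            if requester in entry["allowed"]:            # S108 -> S109
                results.append({"hashes": dict(entry["hashes"]),
                                "restricted": False})
            elif entry["mode"] == "conditional":         # S110 -> S112
                results.append({"hashes": None,
                                "owner": entry["owner"],
                                "restricted": True})
            # Any other mode is treated as complete non-disclosure (S111).
    return results                                       # S115
```

The returned list corresponds to what the server would send in S115; resolving each hash value against the client's management table (S116) is left out of this sketch.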

In the flowchart of FIG. 8, the filtering process is executed on the server side. However, as described with reference to FIG. 4, the filtering process and other processing may instead be performed on the client side. When the search results are displayed as a list, they may be ranked according to priority. For example, the relative order of the related documents may be determined based on indices such as category and degree of match, and only a predetermined number of top-ranked related documents may be listed. The user may also be allowed to select the display conditions, for example so that only related documents whose content matches are listed. The database may be updated when the situation of a user changes, for example when the document arrangement or document content changes, or it may be updated periodically.
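The ranking and display-condition options mentioned above can be illustrated with a short sketch. The scoring rule here is an assumption made only for illustration: the degree of match is taken to be the number of hash kinds whose values equal those of the query, and `rank_results`, `top_n`, and `content_match_only` are hypothetical names.

```python
def rank_results(results, query, top_n=3, content_match_only=False):
    """Order result items by degree of match and keep the top `top_n`."""

    def hashes_of(item):
        # Restricted items may carry no hash values at all.
        return item.get("hashes") or {}

    def score(item):
        # Assumed degree of match: count of hash kinds equal to the query's.
        return sum(
            1 for kind, value in query.items()
            if hashes_of(item).get(kind) == value
        )

    candidates = [
        item for item in results
        if not content_match_only
        or hashes_of(item).get("content") == query.get("content")
    ]
    # sorted() is stable, so items with equal scores keep their original order.
    return sorted(candidates, key=score, reverse=True)[:top_n]
```

Setting `content_match_only=True` corresponds to the display condition under which only related documents with matching content are listed.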

As described above, in the system of this embodiment, secret-related information is registered in the database for each related document group and is shared among a plurality of users. Because information indicating how documents are related can be shared while the confidentiality of the documents themselves is preserved, information assets that were buried with individual users can be used widely and safely. In the above embodiment, a document is converted into hash values from several viewpoints, and each hash value can be used as a search target. The system can therefore meet diverse search needs and greatly improve user convenience.
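The idea of converting one document into several hash values from different viewpoints can be sketched as below. This is an assumption-laden illustration: SHA-256 from Python's `hashlib` is swapped in as the one-way conversion, the attributes are serialized as sorted JSON, and the function name `make_hash_set` and the keys `full`, `content`, and `attr` are hypothetical.

```python
import hashlib
import json


def make_hash_set(content: bytes, attributes: dict) -> dict:
    """Build a hash value set for one document (illustrative only).

    "full"    covers both the content and the attributes
    "content" covers the content only
    "attr"    covers the attributes only
    """
    # Sorted keys make the attribute serialization deterministic.
    attr_bytes = json.dumps(attributes, sort_keys=True).encode("utf-8")
    return {
        "full": hashlib.sha256(content + b"\x00" + attr_bytes).hexdigest(),
        "content": hashlib.sha256(content).hexdigest(),
        "attr": hashlib.sha256(attr_bytes).hexdigest(),
    }
```

Under these assumptions, two copies of the same file stored under different names would share the `content` value but differ in `full` and `attr`, which is what allows a search for content matches without revealing the content itself.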

FIG. 1 is a block diagram showing an embodiment of the present invention.
FIG. 2 is a diagram showing an example of a workspace managed by user X.
FIG. 3 is a diagram for explaining the structure and generation process of the secret-related information.
FIG. 4 is a diagram for explaining the structure of the database and the handling of search results.
FIG. 5 is a diagram showing an example of a search result.
FIG. 6 is a diagram showing another example of a search result.
FIG. 7 is a diagram showing a workspace managed by user Y.
FIG. 8 is a flowchart for explaining the search process.

Explanation of symbols

10 server; 12, 14, 16 clients; 18 document management unit (workspace management unit); 20 related information generation unit; 22 conversion processing unit; 24 management table; 26 registration processing unit; 28 request issuing unit; 30 result processing unit; 32 secret-related information database; 34 search processing unit; 40 related information; 41 secret-related information; 42 secret identification value set (secret identification data).

Claims

A document search system comprising:
generating means for generating, for each related document group composed of a plurality of mutually related documents, secret-related information composed of a plurality of pieces of secret identification data that identify the respective documents while keeping them secret, based on the plurality of documents constituting the related document group;
a database in which a plurality of pieces of secret-related information are registered so that each piece of secret-related information is shared among a plurality of users;
search means for, when a requesting user issues a request to search for a related document related to a document of interest, searching the database using the secret identification data of the document of interest, thereby specifying the secret identification data of the related document as a search result; and
providing means for providing information on the related document to the requesting user according to the requesting user's access authority to the related document identified by the secret identification data obtained as the search result.

The document search system according to claim 1, wherein
Each piece of secret identification data constituting the secret-related information is composed of at least one code generated by applying a code conversion process to an underlying document,
The code is irreversible, such that neither the location nor the content of the document can be identified from the code.
A document retrieval system characterized by that.

The document search system according to claim 2, wherein
The generating means includes
A function of generating a first code for identifying both an entity part and an attribute part of the underlying document;
A function for generating a second code for identifying one of an entity part and an attribute part of the underlying document;
Comprising
Each secret identification data constituting the secret related information includes the first code and the second code,
A document retrieval system characterized by that.

The document search system according to any one of claims 1 to 3, wherein
The providing means provides detailed information about the related document to the requesting user when the requesting user has access authority to the related document.
A document retrieval system characterized by that.

The document search system according to claim 4, wherein
The providing means does not provide any information related to the related document to the requesting user when the requesting user does not have access authority to the related document.
A document retrieval system characterized by that.

The document search system according to claim 4, wherein
The providing means provides the requesting user with information to be used for making an inquiry to the user who has management authority over the related document when the requesting user does not have access authority to the related document.
A document retrieval system characterized by that.

A document search system comprising:
generating means, provided in a server device or in each client device used by a user, for generating, for each related document group composed of a plurality of mutually related documents, secret-related information composed of a plurality of pieces of secret identification data that identify the respective documents while keeping them secret, based on the plurality of documents constituting the related document group;
a database, provided in the server device, in which a plurality of pieces of secret-related information are registered so that each piece of secret-related information is shared among a plurality of users;
search means, provided in the server device or in each client device, for, when a requesting user issues a request to search for a related document related to a document of interest, searching the database using the secret identification data of the document of interest, thereby specifying the secret identification data of the related document as a search result; and
providing means, provided in the server device or in each client device, for providing information on the related document to the requesting user according to the requesting user's access authority to the related document identified by the secret identification data obtained as the search result.

A document search program executed in an information processing apparatus, the program including:
a function of generating, for each related document group composed of a plurality of mutually related documents, secret-related information composed of a plurality of pieces of secret identification data that identify the respective documents while keeping them secret, based on the plurality of documents constituting the related document group; and
a function of, when a requesting user issues a request to search for a related document related to a document of interest, searching, using the secret identification data of the document of interest, a database in which a plurality of pieces of secret-related information are registered so that each piece is shared among a plurality of users, thereby specifying the secret identification data of the related document as a search result,
wherein information on the related document is provided to the requesting user according to the requesting user's access authority to the related document identified by the secret identification data obtained as the search result.