JP2013210862A - Retrieval method for difference information between versions of document

Info

Publication number: JP2013210862A
Application number: JP2012080906A
Authority: JP
Inventors: Koya Okabe; 康矢岡部
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2012-03-30
Filing date: 2012-03-30
Publication date: 2013-10-10

Abstract

PROBLEM TO BE SOLVED: To more accurately retrieve the version in which a specific part of a document was changed. SOLUTION: A retrieval method includes retrieval request accepting means, which accepts a retrieval request together with a retrieval word targeting the portions that differ between versions of a document; difference information extracting means, which extracts the differing portions by examining the differences between all versions of the document subjected to retrieval; and retrieval means, which searches the differing portions extracted by the difference information extracting means using the retrieval word accepted by the retrieval request accepting means. As a result of the retrieval by the retrieval means, the version number of the version in which a change relating to the accepted retrieval word occurred, or associated attribute information, is acquired.

Description

The present invention relates to a method for searching difference information between versions of a document in a document management system.

Document management systems generally support version management of documents. That is, when a document is newly registered or created, its version becomes V1.0; each update increases the version to V2.0, V3.0, and so on, and every version is retained rather than overwritten.

In such systems, checking the contents of a past version requires selecting one version from the document's version history and opening it. Investigating a document's change history, for example to answer “In which version was this description changed?”, is therefore very laborious. To address this, there is a technique that searches the document with a search word thought to appear in it and visually displays which versions contain the keyword and which do not (see Patent Document 1).

Patent Document 1: JP 2003-167911 A

Because the technique described above simply searches the entire document, it is likely to return unintended results. For example, when the search word occurs many times in the document, a user who wants the change history of page 12 may get hits on other pages instead.

To solve the above problem, this proposal is characterized by comprising:
search request accepting means (S604) for accepting a search request, together with a search word, targeting the differing portions between versions of a document;
difference information extracting means (S503) for examining the differences between all versions of the document to be searched and extracting the differing portions;
search means (S605) for searching the differing portions extracted by the difference information extracting means (S503) using the search word accepted by the search request accepting means (S604); and
version information acquiring means (S607) for acquiring, as a result of the search by the search means (S605), the version number of the version in which a change relating to the search word accepted by the search request accepting means (S604) occurred, together with related attribute information.

The present invention makes it possible to search more accurately for the version in which a specific part of a document was changed.

FIG. 1: System configuration diagram
FIG. 2: Software configuration diagram
FIG. 3: Example of difference information
FIG. 4: Example of the search UI
FIG. 5: Flowchart of the difference information extraction process
FIG. 6: Flowchart of the search process
FIG. 7: Flowchart of the difference information extraction process and search process in the second embodiment
FIG. 8: Flowchart of the difference information extraction process in the third embodiment

The best mode for carrying out the present invention is described below with reference to the drawings.

[Example 1]
A first embodiment according to the present invention will be described with reference to FIGS. 1 to 6.

FIG. 1 is a schematic diagram of the system. It includes at least a document management server 101 and a client 102.

The server 101 stores documents and, in response to requests from the client 102, executes document search, acquisition, storage, and similar processing.

The client 102 issues requests such as document searches to the server 101 in response to user requests. It also receives the results of those requests and displays them in the UI.

The server 101 and the client 102 are connected via a network such as Ethernet (registered trademark). Each of these PCs is composed of hardware components such as a CPU, RAM, ROM, HDD, and network interface card.

Although this embodiment is illustrated with a two-machine configuration, the server 101 may be configured so that its database function and search function run on separate PCs. There may also be a plurality of clients 102.

FIG. 2 is a diagram showing an example of the software configuration of the server 101. These programs run on the CPU of the server 101.

The client communication unit 201 receives input from the user and transmits information from the server 101 to the client 102.

The difference extraction unit 202 accesses the DB unit 205 via the DB access unit 204 and extracts the differences between versions of each document (hereinafter referred to as difference information 300). It also stores the difference information 300 in the DB unit 205. Differences are extracted by a function equivalent to the well-known diff program. In general, therefore, the text from one line feed code to the next is treated as one unit and compared between versions, and any differences are extracted. Information about the location of each difference (page, line, and so on) is also recorded. The target file is not limited to a text file; binary files can also be handled using known techniques.
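As an illustration of the line-by-line comparison described above, the following is a minimal sketch using Python's standard `difflib` module in place of the diff program. The function name `changed_blocks` and the tuple layout are assumptions made for this example and do not appear in the publication; page information is omitted.

```python
import difflib


def changed_blocks(old_text: str, new_text: str):
    """Compare two versions line by line and return the changed blocks.

    Each block is (kind, first_line_number, lines), where kind is
    "removed" (lines only in the old version) or "added" (lines only in
    the new version). Line numbers are 1-based. Hypothetical helper.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    blocks = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged lines are not part of the difference
        if tag in ("delete", "replace"):
            blocks.append(("removed", i1 + 1, old_lines[i1:i2]))
        if tag in ("insert", "replace"):
            blocks.append(("added", j1 + 1, new_lines[j1:j2]))
    return blocks


old = "Title\nBody text\nEnd"
new = "Title\nBody text\nThis document is confidential.\nEnd"
for block in changed_blocks(old, new):
    print(block)
```

A changed line shows up as a "removed" block followed by an "added" block, so both the old and the new wording remain available for a later search.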

The search unit 203 accesses the DB unit 205 via the DB access unit 204 and executes a search on the difference information 300. It passes the results to the client communication unit 201. The primary information in a search result is the version, which may be accompanied by information such as the operation type, content, and updater.

On receiving an access request for the DB unit 205, the DB access unit 204 accesses the DB unit 205 and acquires or stores information.

The DB unit 205 stores various information such as documents and the difference information 300.

FIG. 3 is a diagram showing an example of the difference information 300. This example summarizes the difference information for a single document.

Column 301 indicates the version.

Column 302 indicates the operation performed on the document in each version. Operations include, for example, “addition”, “deletion”, and “modification”; a “modification” is recorded as two entries, “before modification” and “after modification”, so that a search hits whichever of the two texts is searched for. Operations are recorded separately for each location in the document. For example, if text was added at two locations when the document was upgraded to V2.0, two rows are created and both are recorded as “addition”.

Column 303 indicates the page on which the operation in column 302 was performed.

Column 304 indicates the line within the page given in column 303.

Column 305 holds the document content that was the target of the operation in column 302. For example, if the character string “This document is confidential.” was added when upgrading to V2.0, “This document is confidential.” is recorded in column 305.
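The five columns of the difference information 300 can be pictured as one record per changed line. The sketch below is only an illustration: the class `DiffRecord`, the function `build_records`, and the fixed `LINES_PER_PAGE` value used to derive page and line numbers are all assumptions, since the publication does not say how pages are determined. It also creates one record per line, whereas the text records one row per changed location.

```python
import difflib
from dataclasses import dataclass

LINES_PER_PAGE = 40  # assumption for this sketch only


@dataclass
class DiffRecord:
    version: str    # column 301
    operation: str  # column 302
    page: int       # column 303
    line: int       # column 304
    content: str    # column 305


def locate(index: int):
    """Map a 0-based line index to a 1-based (page, line) pair."""
    return index // LINES_PER_PAGE + 1, index % LINES_PER_PAGE + 1


def build_records(version: str, old_lines, new_lines):
    """Build records for the update that produced `version`."""
    records = []
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        # A replacement is split into "before" and "after" entries so that
        # a search for either wording finds this version.
        old_op = "deleted" if tag == "delete" else "before modification"
        new_op = "added" if tag == "insert" else "after modification"
        for i in range(i1, i2):
            records.append(DiffRecord(version, old_op, *locate(i), old_lines[i]))
        for j in range(j1, j2):
            records.append(DiffRecord(version, new_op, *locate(j), new_lines[j]))
    return records


for record in build_records(
    "V2.0",
    ["Title", "End"],
    ["Title", "This document is confidential.", "End"],
):
    print(record)
```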

FIG. 4 is a diagram showing an example of the UI for searching the difference information. This UI is displayed in a browser or the like on the client 102.

Field 401 accepts the document to be searched. The user may enter it directly, or it may be filled in automatically, for example in a configuration where this UI is launched from a menu such as a right-click menu while a document is selected.

Field 402 accepts the search keyword. The user may enter it directly, or it may be filled in automatically, for example in a configuration where the user, while viewing the document, selects the passage to be searched with the mouse and launches this UI from a menu such as a right-click menu. In that case, the page field 403 is filled in at the same time.

Field 403 accepts the page on which the keyword is to be searched. If no page is specified, all pages may be searched.

Button 404 accepts a search execution request.

Although not shown, the operation type, such as “addition”, “deletion”, or “modification”, may be specifiable as a search parameter. The versions to be searched may also be specifiable as a range, such as “from the latest back to V3.0”, or by date and time.
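The search parameters described for FIG. 4, together with the optional operation-type and version-range parameters, could be carried in a single request object. The following is a rough sketch under several assumptions: the names `SearchRequest`, `parse_version`, and `matches` are hypothetical, difference records are plain dictionaries, and version labels are assumed to have the form "V<major>.<minor>". Date-based ranges are not covered.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


def parse_version(label: str) -> tuple:
    """Turn a label such as 'V3.0' into a comparable tuple (3, 0)."""
    major, minor = label.lstrip("V").split(".")
    return int(major), int(minor)


@dataclass(frozen=True)
class SearchRequest:
    document: str                                # field 401
    keyword: str                                 # field 402
    page: Optional[int] = None                   # field 403; None means all pages
    operations: Optional[FrozenSet[str]] = None  # optional operation-type filter
    oldest_version: Optional[str] = None         # "V3.0" means latest back to V3.0

    def matches(self, record: dict) -> bool:
        """Return True if one difference record satisfies every parameter."""
        if self.keyword not in record["content"]:
            return False
        if self.page is not None and record["page"] != self.page:
            return False
        if self.operations is not None and record["operation"] not in self.operations:
            return False
        if self.oldest_version is not None:
            if parse_version(record["version"]) < parse_version(self.oldest_version):
                return False
        return True
```

Leaving `page`, `operations`, and `oldest_version` unset reproduces the default behaviour in the text, where every page and every version is searched.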

FIG. 5 is a flowchart showing the flow of the process by which the difference extraction unit 202 extracts difference information between versions of a document. This process may be executed when the user requests a search, when a document is updated, or on a regular schedule.

The difference extraction unit 202 acquires one version of the designated document (for example, the latest version) (S501). Next, it acquires the version following the one acquired in S501 (for example, the second oldest version) (S502). The difference extraction unit 202 then extracts the differences between the two acquired versions (S503) and saves the result as difference information 300 in the DB unit 205 via the DB access unit 204 (S504).

Next, the difference extraction unit 202 checks whether any version remains whose differences have not yet been extracted (S505); if so, it acquires the next version (S506) and repeats the process.
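The loop of FIG. 5 can be sketched as a walk over adjacent version pairs. In this illustration the versions are held in memory as `(label, text)` pairs ordered from oldest to newest, a plain list stands in for the DB unit 205, and `difflib.ndiff` stands in for the diff function; the function name and the traversal direction are assumptions of this example, not details taken from the publication.

```python
import difflib


def extract_all_differences(versions):
    """Extract differences between every pair of adjacent versions.

    `versions` is a list of (label, text) ordered from oldest to newest.
    Returns a list of (version_label, operation, content) tuples, where
    version_label is the newer version of each compared pair.
    """
    store = []  # stands in for the DB unit 205
    for (_, old_text), (new_label, new_text) in zip(versions, versions[1:]):
        old_lines = old_text.splitlines()
        new_lines = new_text.splitlines()
        for line in difflib.ndiff(old_lines, new_lines):
            if line.startswith("+ "):
                store.append((new_label, "added", line[2:]))
            elif line.startswith("- "):
                store.append((new_label, "deleted", line[2:]))
            # lines starting with "  " (unchanged) or "? " (hints) are skipped
    return store


history = [
    ("V1.0", "a\nb"),
    ("V2.0", "a\nb\nc"),
    ("V3.0", "a\nc"),
]
for row in extract_all_differences(history):
    print(row)
```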

FIG. 6 is a flowchart showing the flow of the search process performed on the difference information 300 by the search unit 203.

The search unit 203 accepts the designation of the document to be searched from the client communication unit 201 (S601). The search unit 203 then checks whether the difference information 300 exists in the DB unit 205 (S602). If it does not exist, the extraction process for the difference information 300 shown in FIG. 5 is executed (S603). If it does exist, the search unit 203 accepts the designation of the search keyword from the client communication unit 201 (S604). The search unit 203 then searches the difference information 300 stored in the DB unit 205 using the search keyword accepted in S604 (S605). The search unit 203 checks whether the search hit (S606); if it did, it acquires information such as the version number, operation type, content, and updater from the DB unit 205 (S607). Finally, the search unit 203 passes the acquired information to the client communication unit 201 (S608). If the search did not hit in S606, it reports to the client communication unit 201 that there was no hit (S609).

If there are multiple hits, the search unit 203 acquires the information for all of them and passes it to the client communication unit 201.
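The search step of FIG. 6 (S605 to S609) amounts to filtering the stored difference records by keyword and returning the version and its accompanying attributes for every hit. The sketch below assumes the records are dictionaries with the keys shown; the function name `search_differences`, the key names, and the `updated_by` field are hypothetical.

```python
def search_differences(records, keyword, page=None):
    """Search stored difference records for a keyword.

    Returns (hit, results). `hit` is False when nothing matched, which
    corresponds to reporting "no hit" to the client. When several records
    match, all of them are returned.
    """
    results = [
        {
            "version": r["version"],
            "operation": r["operation"],
            "content": r["content"],
            "updated_by": r.get("updated_by"),
        }
        for r in records
        if keyword in r["content"] and (page is None or r["page"] == page)
    ]
    return bool(results), results


records = [
    {"version": "V2.0", "operation": "added", "page": 12, "line": 3,
     "content": "This document is confidential.", "updated_by": "user-a"},
    {"version": "V3.0", "operation": "deleted", "page": 4, "line": 10,
     "content": "Draft only.", "updated_by": "user-b"},
]
hit, results = search_differences(records, "confidential", page=12)
print(hit, [r["version"] for r in results])
```

Because only the differences are searched, a keyword that occurs unchanged on other pages of the document does not produce a hit.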

[Example 2]
A second embodiment according to the present invention will be described with reference to FIG. 7.

FIG. 7 is a flowchart showing the flow of the extraction of the difference information 300 and the search on it, performed by the difference extraction unit 202 and the search unit 203.

In the first embodiment, the search was performed after examining the difference information between adjacent versions of the document across all versions. In the second embodiment, difference information is extracted between a newer version and an older version (for example, the latest and the oldest); if the search hits, the version range is divided in two and the process is applied recursively. This makes the search more efficient. However, the search result is narrowed to a single hit.

The search unit 203 accepts the designation of the document to be searched from the client communication unit 201 (S701). It then accepts the designation of the search keyword from the client communication unit 201 (S702).

Next, the difference extraction unit 202 acquires the latest version of the target document (S703); for convenience, this is called version A. It then acquires the oldest version of the target document (S704); for convenience, this is called version B. The difference extraction unit 202 then extracts the difference information 300 between version A and version B (S705). The result may be stored in the DB unit 205 as needed.

Next, the search unit 203 searches the difference information 300 obtained in S705 using the search keyword specified in S702 (S706). The search unit 203 checks whether the search hit (S707); if it did, it checks whether any version exists between version A and version B (S709). If the search did not hit in S707, it reports to the client communication unit 201 that there was no hit (S708).

If such a version exists in S709, the difference extraction unit 202 acquires version A again (S710). It then acquires a version midway between version A and version B (S711); for convenience, this is called version C. The difference extraction unit 202 extracts the difference information 300 between version A and version C (S712). Next, the search unit 203 searches the difference information 300 obtained in S712 using the search keyword specified in S702 (S713). The search unit 203 checks whether the search hit (S714); if it did, version C is made the new version B (S715) and the processing from S709 to S716 is performed recursively. If the search did not hit in S714, version C is made the new version A (S716) and the processing from S709 to S716 is performed recursively.

If no such version exists in S709, version A at the time of the hit in S707 or S715 is the version being sought, so the search unit 203 acquires information such as the version number, operation type, content, and updater from the DB unit 205 (S717). Finally, the search unit 203 passes the acquired information to the client communication unit 201 (S718).

In this embodiment, because version A is acquired in S710, the newest version is returned when there are multiple hits; however, the system may instead return the oldest version. In that case, version B is acquired in S710, version C is made version A in S715, and version C is made version B in S716.
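The halving procedure of FIG. 7 can be written as an iterative bisection over the version list. This is a minimal sketch rather than the publication's implementation: versions are assumed to be in-memory `(label, text)` pairs ordered from oldest to newest, `difflib.ndiff` stands in for the diff function, and the names `diff_hits` and `find_changed_version` are hypothetical. It follows the variant that keeps version A (the newer end) fixed while a hit continues.

```python
import difflib


def diff_hits(older_text: str, newer_text: str, keyword: str) -> bool:
    """Return True if the keyword appears in a line added or removed between the two texts."""
    lines = difflib.ndiff(older_text.splitlines(), newer_text.splitlines())
    return any(
        line[:2] in ("+ ", "- ") and keyword in line[2:]
        for line in lines
    )


def find_changed_version(versions, keyword):
    """Locate one version in which text containing `keyword` changed.

    `versions` is a list of (label, text) ordered from oldest to newest.
    Returns the label of the found version, or None when the difference
    between the oldest and latest versions does not contain the keyword.
    """
    b, a = 0, len(versions) - 1  # B = oldest (S704), A = latest (S703)
    if not diff_hits(versions[b][1], versions[a][1], keyword):  # S705 to S707
        return None  # S708: no hit
    while a - b > 1:  # S709: a version still lies between A and B
        c = (a + b) // 2  # S711: intermediate version C
        if diff_hits(versions[c][1], versions[a][1], keyword):  # S712 to S714
            b = c  # S715: C becomes the new B
        else:
            a = c  # S716: C becomes the new A
    return versions[a][0]  # S717: version A is the result


history = [
    ("V1.0", "alpha\nbeta"),
    ("V2.0", "alpha\nbeta\nThis document is confidential."),
    ("V3.0", "alpha\nbeta\nThis document is confidential.\ngamma"),
    ("V4.0", "alpha\nbeta\nThis document is confidential.\ngamma\ndelta"),
]
print(find_changed_version(history, "confidential"))
```

Each round compares only two versions, so the number of diff operations grows roughly with the logarithm of the number of versions instead of linearly, at the cost of returning a single version.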

Although not shown, the document size may be taken into account instead of simply splitting the versions at the midpoint. That is, version C is chosen so that the split straddles a boundary where the size changes significantly.

From a similar standpoint, when version management consists of major versions (for example, V1.0, V2.0, ...) and minor versions (for example, V1.1, V1.2, ...), the major versions may be compared with each other first and the minor versions compared last.

[Example 3]
A third embodiment according to the present invention will be described with reference to FIG. 8.

FIG. 8 is a flowchart showing the flow of the extraction of the difference information 300 by the difference extraction unit 202. In the first embodiment, difference information between versions of a document was always extracted for all versions; in the third embodiment, when the difference information 300 between certain versions is too large, that version is excluded from the search. This reduces noise in the search results.

FIG. 8 is identical to FIG. 5 except for the addition of step S801, so only S801 is described here.

The difference extraction unit 202 determines whether the amount of difference information 300 extracted in S503 exceeds a threshold predetermined in the system. The amount may be, for example, the number of locations counted as differences or the number of characters in the content of each difference. The threshold may be made configurable by an administrator.
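The check in S801 can be sketched as a simple size test applied before the difference information is stored. The limits, the function names, and the choice to sum the characters over all differences are assumptions of this example; the publication leaves the exact measure and threshold open.

```python
MAX_LOCATIONS = 50      # assumed limit on the number of differing locations
MAX_CHARACTERS = 2000   # assumed limit on the total characters across differences


def exceeds_threshold(diff_records, max_locations=MAX_LOCATIONS,
                      max_characters=MAX_CHARACTERS) -> bool:
    """Return True when the difference information is too large (S801)."""
    total_characters = sum(len(r["content"]) for r in diff_records)
    return len(diff_records) > max_locations or total_characters > max_characters


def store_if_small(store: dict, version: str, diff_records, **limits) -> bool:
    """Store the differences for `version` unless they exceed the threshold.

    Returns True if stored, False if the version was left out of the
    searchable difference information.
    """
    if exceeds_threshold(diff_records, **limits):
        return False
    store[version] = diff_records
    return True


store = {}
small = [{"content": "This document is confidential."}]
large = [{"content": "x" * 100} for _ in range(30)]
print(store_if_small(store, "V2.0", small), store_if_small(store, "V3.0", large))
```

Passing `max_locations` or `max_characters` as keyword arguments corresponds to letting an administrator override the system defaults.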

Although not shown, the filtering need not be done during difference information extraction; instead, versions with too much difference information may be filtered out at search time.

201 Client communication unit
202 Difference extraction unit
203 Search unit
204 DB access unit
205 DB unit

Claims

A document management system capable of managing versions, comprising:
search request accepting means (S604) for accepting a search request, together with a search word, targeting the differing portions between versions of a document;
difference information extracting means (S503) for examining the differences between all versions of the document to be searched and extracting the differing portions;
search means (S605) for searching the differing portions extracted by the difference information extracting means (S503) using the search word accepted by the search request accepting means (S604); and
version information acquiring means (S607) for acquiring, as a result of the search by the search means (S605), the version number of the version in which a change relating to the search word accepted by the search request accepting means (S604) occurred, together with related attribute information.

A document management system capable of managing versions, comprising:
search request accepting means (S702) for accepting a search request, together with a search word, targeting the differing portions between versions of a document;
latest version acquiring means (S703) for acquiring the latest version of the document;
oldest version acquiring means (S704) for acquiring the oldest version of the document;
first difference information extracting means (S705) for examining the difference between the version acquired by the latest version acquiring means (S703) and the version acquired by the oldest version acquiring means (S704) and extracting the differing portions;
first search means (S706) for searching the differing portions extracted by the first difference information extracting means (S705) using the search word accepted by the search request accepting means (S702);
intermediate version acquiring means (S711) for acquiring, when the search by the first search means (S706) hits, a version intermediate between the version acquired by the latest version acquiring means (S703) and the version acquired by the oldest version acquiring means (S704);
second difference information extracting means (S712) for examining the difference between the version acquired by the latest version acquiring means (S703) and the version acquired by the intermediate version acquiring means (S711) and extracting the differing portions;
second search means (S713) for searching the differing portions extracted by the second difference information extracting means (S712) using the search word accepted by the search request accepting means (S702);
means for, when the search by the second search means (S713) hits, regarding the version acquired by the intermediate version acquiring means (S711) as the oldest version (S715) and recursively executing the processing from the intermediate version acquiring means (S711) through the second search means (S713);
means for, when the search by the second search means (S713) does not hit, regarding the version acquired by the intermediate version acquiring means (S711) as the latest version (S716) and recursively executing the processing from the intermediate version acquiring means (S711) through the second search means (S713); and
version information acquiring means (S717) for acquiring, when the version acquired by the intermediate version acquiring means (S711) is equal to either the version acquired by the latest version acquiring means or the version acquired by the oldest version acquiring means, as a result of the search by the first search means (S706) or the second search means (S713), the version number of the version in which a change relating to the search word accepted by the search request accepting means (S702) occurred, together with related attribute information.

The document management system according to claim 1 or 2, wherein the difference information extracting means (S503, S705, S712) does not store the difference result when the number of extracted differing portions and the number of differing characters in each portion exceed a predetermined threshold (S801).