JP2008158918A - Document management keyword extraction system

Info

Publication number: JP2008158918A
Application number: JP2006348912A
Authority: JP
Inventors: Kazuya Niizaka; 和也新坂; Kentaro Onishi; 健太郎大西; Takanari Hashimoto; 隆也橋本
Original assignee: Hitachi Electronics Services Co Ltd
Current assignee: Hitachi Electronics Services Co Ltd
Priority date: 2006-12-26
Filing date: 2006-12-26
Publication date: 2008-07-10

Abstract

PROBLEM TO BE SOLVED: To provide a document management keyword extraction system that, when documents are managed by linking a management purpose to keywords contained in the documents, sets keywords matching that management purpose efficiently and correctly.

SOLUTION: The system comprises a client computer 2, a server 1, and a database 3. A plurality of files to be managed for the same management purpose are stored in a folder (e.g., A), and character strings of n consecutive characters are extracted for each appearance location in the files, where n can be set to any natural number. This extraction is performed on all files (b, c, and d) stored in folder A, and any character string whose content and appearance location are common to all of the files is extracted and recorded as a keyword matching the management purpose of folder A.

COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a document management keyword extraction system, and in particular to one that extracts keywords matching a management purpose in greater detail from the documents to be managed under that purpose, so that the documents can be managed efficiently and correctly using those keywords.

In some cases, the documents to be managed according to a certain management purpose are selected automatically from the enormous number of electronic documents present in an information system (see, for example, Patent Document 1).
Patent Document 1: JP-A-6-187373

In this automatic selection process, when a mechanism is used that sets a keyword and selects the documents containing it, the keyword may not be set appropriately because the choice depends on individual subjectivity, and the administrator may not know which keywords to set. As a result, the necessary documents sometimes cannot be selected reliably.

An object of the present invention is to provide a document management keyword extraction system that, when documents are managed by linking a management purpose to keywords contained in the documents, sets keywords matching that management purpose efficiently and correctly.

The present invention is a document management keyword extraction system comprising a client computer, a server, and a database. A plurality of files managed for the same management purpose are stored in a folder. For every file stored in the folder, character strings of n consecutive characters (n being a natural number) are extracted for each appearance location in the file. Character strings whose content and appearance location are common to all of the files are detected, and each detected character string, together with its appearance location, is extracted and recorded as a keyword matching the management purpose of the folder.

The present invention is also a document management keyword extraction system that performs a character string search, at the appearance locations of the keywords, on files other than the plurality of files stored in the folder; detects any file in which an extracted keyword and its appearance location match; issues a warning that the matching file should be managed under the management purpose of the folder; and moves or copies that file to the folder.

The present invention is further a document management keyword extraction system that, when a file newly added to the folder does not contain an extracted keyword at the corresponding appearance location, writes the extracted keyword at that location and issues a warning.

Furthermore, the present invention is a document management keyword extraction system that issues a warning when the newly added file is a read-only file and cannot be written to.

Until now, keywords have in many cases been set on the basis of individual experience. According to the document management keyword extraction system of the present invention, character strings that can serve as keywords are combined with their appearance locations across a plurality of files under the same management purpose. This allows more detailed keyword extraction, so that keywords matching the management purpose can be set efficiently and correctly.

The best mode for carrying out the present invention will now be described.
An embodiment of the document management keyword extraction system of the present invention is described with reference to the drawings. FIG. 1 is a system configuration diagram of the document management keyword extraction system of the embodiment.

As shown in FIG. 1, the document management keyword extraction system of this embodiment includes a plurality of client computers 2 and a server 1 on the side of an administrator or user 4. Each client computer 2 can save files to the server 1. The server 1 extracts the keywords common to all files in a folder and records the extracted keywords in the database 3. The server 1 issues a warning to the administrator or user 4 when another file that should be managed exists in the system, or when a file in the folder lacks the appropriate keyword.
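The recording step can be pictured with a small sketch. The patent does not describe the layout of database 3, so the table name `folder_keyword`, its columns, and the helper functions below are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3
from typing import Iterable, List, Tuple

# (location, line number, column, keyword string); an assumed representation.
Keyword = Tuple[str, int, int, str]


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the keyword store and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS folder_keyword (
               folder   TEXT    NOT NULL,
               location TEXT    NOT NULL,
               line_no  INTEGER NOT NULL,
               col_no   INTEGER NOT NULL,
               keyword  TEXT    NOT NULL,
               PRIMARY KEY (folder, location, line_no, col_no, keyword)
           )"""
    )
    return conn


def record_keywords(conn: sqlite3.Connection, folder: str,
                    keywords: Iterable[Keyword]) -> None:
    """Replace the keywords recorded for one folder in a single transaction."""
    with conn:
        conn.execute("DELETE FROM folder_keyword WHERE folder = ?", (folder,))
        conn.executemany(
            "INSERT INTO folder_keyword VALUES (?, ?, ?, ?, ?)",
            [(folder, *kw) for kw in keywords],
        )


def load_keywords(conn: sqlite3.Connection, folder: str) -> List[Keyword]:
    """Return the keywords recorded for a folder, in a stable order."""
    return conn.execute(
        "SELECT location, line_no, col_no, keyword FROM folder_keyword "
        "WHERE folder = ? ORDER BY location, line_no, col_no",
        (folder,),
    ).fetchall()


conn = open_store()
record_keywords(conn, "A", [("header", 1, 1, "CONFIDENTIAL")])
stored = load_keywords(conn, "A")
print(stored)
```

Replacing a folder's rows wholesale keeps the stored set in step with the folder each time extraction is re-run; an incremental update would be an equally plausible design.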

The system of this embodiment executes a document management keyword extraction method. Keyword setting has until now often depended on individual experience. With this method, for a plurality of files under the same management purpose, a character string that can serve as a keyword is combined with its appearance location and the pair is treated as the keyword, which allows more detailed keyword extraction.

An example of the processing procedure in the document management keyword extraction system of this embodiment is as follows. First, (1) a plurality of files managed for the same management purpose are stored in the same folder. FIG. 2(A) shows an example in which files b, c, and d are stored in folder A. Files b, c, and d are the files to be managed for the same management purpose. Next, (2) character strings of n consecutive characters are extracted for each appearance location in each file. FIG. 2(B) shows an example of the keywords extracted from file b and their appearance locations.

Next, (3) procedure (2) above is performed on all files stored in folder A. FIG. 2(C) shows an example of the keywords and appearance locations in file c. FIG. 2(D) shows an example of the keywords and appearance locations in file d.

Next, (4) the character strings common to all files in folder A, together with their appearance locations, are extracted. FIG. 2(E) shows an example of the character strings (keywords) common to all of files b, c, and d in folder A and their appearance locations.

In this embodiment, in a mechanism that manages documents by linking a management purpose to keywords contained in the document files, keywords matching the purpose can be set efficiently, in greater detail, and correctly. Specifically, any natural number of document files that fit a given management purpose are placed in one folder, the combinations of keyword and appearance location common to all of those document files are found, and those combinations are registered as keywords matching the management purpose. As illustrated in FIGS. 2(A) to 2(E), an appearance location is the header, metadata, footer, template, or body text (identified by line number and by character position within the line).

An example of the procedure for extracting the character strings common to all of files b, c, and d in folder A, together with their appearance locations, is described here. FIG. 3 is an explanatory diagram of the extraction procedure in the document management keyword extraction system of the embodiment. The extraction procedure has steps S1 to S4. In step S1, as illustrated in process (1) of FIG. 2(A), a management purpose is set and the document files b, c, and d that fit it are saved in one folder A. In step S2, as illustrated in process (2) of FIG. 2(B), character strings (keywords) 50 of n consecutive characters and their appearance locations 51 are extracted from file b in folder A.

In step S3, the same processing is applied to the remaining files c and d in folder A, as shown in FIGS. 2(C) and 2(D). That is, character strings (keywords) 52 of n consecutive characters and their appearance locations 53 are extracted from file c as shown in FIG. 2(C), and character strings (keywords) 54 of n consecutive characters and their appearance locations 55 are extracted from file d as shown in FIG. 2(D). In step S4, the character strings (keywords) 60 common to all of files b, c, and d in folder A and their appearance locations 61 are extracted, and the extraction process ends.
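Steps S1 to S4 can be sketched in a few lines. This is an illustration only: the patent does not fix a file format, so each file is modeled here as a dictionary from a location name to the text found there, and the function names `extract_candidates` and `common_keywords` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed model: a document maps a location name ("header", "metadata",
# "footer", "body", ...) to the text stored at that location.
Document = Dict[str, str]
# A candidate keyword: (location, line number, column, n-character string),
# with line and column counted from 1.
Candidate = Tuple[str, int, int, str]


def extract_candidates(doc: Document, n: int) -> Set[Candidate]:
    """Step S2: every run of n consecutive characters, with where it occurs."""
    found: Set[Candidate] = set()
    for location, text in doc.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for col in range(len(line) - n + 1):
                found.add((location, line_no, col + 1, line[col:col + n]))
    return found


def common_keywords(docs: List[Document], n: int) -> Set[Candidate]:
    """Steps S3 and S4: keep the candidates shared by every file in the folder."""
    if not docs:
        return set()
    common = extract_candidates(docs[0], n)
    for doc in docs[1:]:
        common &= extract_candidates(doc, n)
    return common


# Step S1: three files placed in folder A for the same management purpose.
folder_a = [
    {"header": "CONFIDENTIAL", "body": "Report for March"},
    {"header": "CONFIDENTIAL", "body": "Minutes of meeting"},
    {"header": "CONFIDENTIAL", "body": "Budget draft"},
]
keywords = common_keywords(folder_a, n=12)
print(keywords)  # {('header', 1, 1, 'CONFIDENTIAL')}
```

Because the location is part of each tuple, a string that appears in all three files but at different positions is not treated as common, which is the point of pairing the string with its appearance location.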

Next, returning to FIG. 2(F), (5) if a file e outside folder A contains a character string extracted in process (4) at the corresponding appearance location, the administrator is notified.

Next, (6) when a new file is created in or moved to folder A, the system checks whether the keyword is present at the corresponding appearance location and writes it there if it is absent. FIG. 2(G) shows an example of this check: the keyword is written when it is missing from the corresponding appearance location, and the administrator is notified when it cannot be written.

The procedure for confirming the existence of another file e in this embodiment is described next. FIG. 4 is an explanatory diagram of the procedure for confirming the existence of another file e in the document management keyword extraction system of the embodiment. The procedure has steps S10 to S16. In step S10, as process (5) of FIG. 2(F), the system checks whether a file e outside folder A contains the character string (keyword) extracted in process (4) of FIG. 2(E) at the corresponding appearance location.

In step S11, if the character string (keyword) is not present at the corresponding appearance location, the confirmation procedure ends. If it is present, the administrator is notified in step S12. In step S13, the system determines whether it is configured to move or copy file e to folder A; if it is not, the confirmation procedure ends. If it is, file e is moved or copied in step S14. In step S15, if the move or copy was possible, the confirmation procedure ends; if it was not possible, the administrator is notified in step S16 and the confirmation procedure ends.
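A minimal sketch of steps S10 to S16 follows. Several points are assumptions rather than details from the patent: the document model (location name mapped to text), the `notify` callback standing in for the report to the administrator, the `action` setting with values "move", "copy", or "none", and the rule that a file matches only when every recorded keyword is present at its location.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple

Keyword = Tuple[str, int, int, str]  # (location, line, column, string), 1-based


def has_keyword(doc: Dict[str, str], kw: Keyword) -> bool:
    """True if the keyword string sits exactly at its recorded location."""
    location, line_no, col, text = kw
    lines = doc.get(location, "").splitlines()
    if line_no > len(lines):
        return False
    start = col - 1
    return lines[line_no - 1][start:start + len(text)] == text


def check_outside_file(path: Path, doc: Dict[str, str],
                       keywords: Iterable[Keyword], folder: Path,
                       action: str, notify: Callable[[str], None]) -> bool:
    """Return True if the file matched the folder's keywords."""
    keywords = list(keywords)
    # S10/S11: stop when a keyword is missing from its location.
    if not keywords or not all(has_keyword(doc, kw) for kw in keywords):
        return False
    # S12: report the match.
    notify(f"{path} looks like it belongs under {folder}")
    # S13: only act when the system is configured to move or copy.
    if action not in ("move", "copy"):
        return True
    # S14 to S16: attempt the transfer and report a failure.
    try:
        target = str(folder / path.name)
        if action == "move":
            shutil.move(str(path), target)
        else:
            shutil.copy2(str(path), target)
    except OSError as exc:
        notify(f"could not {action} {path}: {exc}")
    return True


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    folder_a = root / "A"
    folder_a.mkdir()
    file_e = root / "e.txt"
    file_e.write_text("CONFIDENTIAL\nQuarterly figures\n", encoding="utf-8")
    doc_e = {"header": "CONFIDENTIAL", "body": "Quarterly figures"}
    messages = []
    matched = check_outside_file(file_e, doc_e,
                                 {("header", 1, 1, "CONFIDENTIAL")},
                                 folder_a, "copy", messages.append)
    copied = (folder_a / "e.txt").exists()
```

Passing the notification channel in as a callback keeps the sketch independent of how the administrator is actually alerted, which the patent leaves open.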

Next, an example of the procedure for writing to a new file f in the embodiment is described. FIG. 5 is an explanatory diagram of the procedure for writing to a new file in the document management keyword extraction system of the embodiment. The procedure of FIG. 5 has steps S20 to S24. In step S20, as process (6) of FIG. 2(G), the system checks whether the new file f created in or moved to folder A contains the character string (keyword) of process (4) of FIG. 2(E) at the corresponding appearance location.

In step S21, if the character string (keyword) is present at the corresponding appearance location, the writing procedure ends. If it is not present, step S22 determines whether file f can be written to. If file f is a read-only file and cannot be written to, the administrator is notified in step S23 and the writing procedure ends. If file f can be written to, the character string (keyword) is written at the corresponding appearance location in step S24, and the writing procedure ends.
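Steps S20 to S24 can be illustrated as follows, under stated assumptions: file f is a plain UTF-8 text file, a location is a (line, column) pair within it, and "writing the keyword" means overwriting the characters at that position, padding with spaces or empty lines where needed. The patent does not specify how the keyword is written, so this overwrite behavior, the function name `ensure_keywords`, and the injectable `is_writable` check are all hypothetical.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

LineKeyword = Tuple[int, int, str]  # (line, column, string), both 1-based


def ensure_keywords(path: Path, keywords: Iterable[LineKeyword],
                    notify: Callable[[str], None],
                    is_writable: Callable[[Path], bool] =
                    lambda p: os.access(p, os.W_OK)) -> bool:
    """Return True if the file ends up holding every keyword at its location."""
    lines: List[str] = path.read_text(encoding="utf-8").splitlines()
    # S20: find the keywords that are missing from their locations.
    missing = []
    for line_no, col, text in keywords:
        line = lines[line_no - 1] if line_no <= len(lines) else ""
        if line[col - 1:col - 1 + len(text)] != text:
            missing.append((line_no, col, text))
    if not missing:              # S21: nothing to do.
        return True
    if not is_writable(path):    # S22: read-only file.
        notify(f"{path} is read-only; keyword not written")  # S23
        return False
    for line_no, col, text in missing:  # S24: write each keyword in place.
        while len(lines) < line_no:
            lines.append("")
        line = lines[line_no - 1].ljust(col - 1)
        lines[line_no - 1] = line[:col - 1] + text + line[col - 1 + len(text):]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return True


with tempfile.TemporaryDirectory() as tmp:
    file_f = Path(tmp) / "f.txt"
    file_f.write_text("Draft\nbody text\n", encoding="utf-8")
    warnings = []
    ok = ensure_keywords(file_f, [(1, 1, "CONFIDENTIAL")], warnings.append)
    first_line = file_f.read_text(encoding="utf-8").splitlines()[0]
    # Simulate a read-only file without changing permissions.
    locked = ensure_keywords(file_f, [(2, 1, "INTERNAL")], warnings.append,
                             is_writable=lambda p: False)
```

The writability check is a parameter because `os.access` reports what the current user may do, so a privileged process can see a read-only file as writable.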

In the embodiment, document management keyword extraction is preferably carried out as follows.
[1] As a premise, a database of character strings that cannot be keywords is created. These are demonstratives listed in dictionaries and the like, and meaningless character strings composed of hiragana or katakana. The user can remove entries from this list of non-keyword strings as appropriate, and can also add character strings that should not be treated as keywords.

[2] When process (2) of FIG. 2(B) is carried out, combining it with a technique for extracting natural-language text from character strings narrows down the candidate keywords, so that keyword extraction becomes more reliable.

[3] Processes (1) to (6) of FIG. 2 are carried out. When process (2) of FIG. 2(B) and process (3) of FIG. 2(C) are carried out, the extracted strings are checked against the database of non-keyword strings created in [1]. This narrows down the character strings that can become keywords to some extent.

[4] When process (2) of FIG. 2(B) is carried out, starting with a large value of n and decreasing it step by step makes it easier to narrow down the keywords. The initial value of n is set in advance when the invention is put into operation; in practice, starting from a value of around 8 to 10 is considered desirable.

[5] When process (3) of FIG. 2(C) is carried out, a keyword common to all files becomes hard to find if folder A contains too many files. Conversely, if there are too few files, there may be too many candidate keywords. In practice, therefore, it is preferable to begin, as illustrated in FIG. 2, with three or four files that strictly fit the management purpose and with appearance locations restricted to a few places such as metadata, headers, and footers, and then to increase the number of files and widen the candidate locations gradually.

[6] Processes (1) to (4) of FIG. 2 are preferably re-run whenever files are added to or removed from folder A, so that an appropriate set of keyword candidates and locations is maintained.

[7] When process (5) of FIG. 2(F) is carried out, the search takes a long time if the system contains many files. The first run therefore searches all files, but subsequent runs search only the files that have been changed or added since the first run, which shortens the search time.
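Items [1], [3], [4], and [7] lend themselves to a short sketch. It is illustrative only: documents are modeled as a mapping from location name to a single line of text, the stop list is a plain set of strings, the search stops at the first value of n that yields a common candidate (the text only says that n is decreased from a large value), and change detection compares modification times with the time of the previous scan. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Candidate = Tuple[str, int, str]  # (location, 1-based column, string)


def candidates(doc: Dict[str, str], n: int, stop: Set[str]) -> Set[Candidate]:
    """Items [1] and [3]: n-character strings per location, minus stop-list entries."""
    found: Set[Candidate] = set()
    for location, text in doc.items():
        for col in range(len(text) - n + 1):
            s = text[col:col + n]
            if s not in stop:
                found.add((location, col + 1, s))
    return found


def keywords_descending(docs: List[Dict[str, str]], start_n: int,
                        stop: Set[str]) -> Tuple[int, Set[Candidate]]:
    """Item [4]: try n = start_n, start_n - 1, ..., 1 and keep the first hit."""
    for n in range(start_n, 0, -1):
        common: Optional[Set[Candidate]] = None
        for doc in docs:
            found = candidates(doc, n, stop)
            common = found if common is None else common & found
        if common:
            return n, common
    return 0, set()


def files_to_scan(mtimes: Dict[str, float],
                  last_scan: Optional[float]) -> List[str]:
    """Item [7]: scan everything the first time, then only newer files."""
    if last_scan is None:
        return sorted(mtimes)
    return sorted(p for p, t in mtimes.items() if t > last_scan)


docs = [{"header": "ABC-001"}, {"header": "ABC-002"}, {"header": "ABC-003"}]
n_used, found = keywords_descending(docs, start_n=8, stop={"ABC-00"})
print(n_used, sorted(found))  # 5 [('header', 1, 'ABC-0'), ('header', 2, 'BC-00')]

mtimes = {"a.txt": 100.0, "b.txt": 250.0}
first_run = files_to_scan(mtimes, None)
later_run = files_to_scan(mtimes, 200.0)
```

In the example, n = 8 and n = 7 yield nothing in common, the only common string at n = 6 is on the stop list, and the search settles on n = 5.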

In process (6) of FIG. 2(G), if a new document file f is created by copying a previously created document file, the metadata of the earlier file is carried over, and in some cases a document bearing another customer's name may be submitted to a customer. This happens because users are not normally aware of metadata. With the present invention, the specified metadata can be forcibly written when the new document file f is stored in the folder, so this problem can be avoided.

In the document management keyword extraction system of the embodiment, a plurality of files managed for the same management purpose are stored in a folder (for example, A), and character strings of n consecutive characters are extracted for each appearance location in each file. Here n can be set to any natural number. The extraction is performed on all files stored in folder A (three files, b, c, and d, in the illustrated example), and any character string whose content and appearance location are common to all of those files is extracted and recorded as a keyword matching the management purpose of folder A. Until now, keywords have in many cases been set on the basis of individual experience. By combining character strings that can serve as keywords with their appearance locations across a plurality of files under the same management purpose, more detailed keyword extraction becomes possible, and keywords matching the management purpose can be set efficiently and correctly.

In the document management keyword extraction system of the embodiment, a character string search at the keyword appearance locations is also performed on files other than those stored in folder A (file e in the illustrated example). If a file matches an extracted keyword and its appearance location, a warning is issued that file e should also be managed under the management purpose of folder A and, where possible, the file is moved or copied to folder A, according to settings made in advance. In this way, keywords matching the management purpose can be applied efficiently and correctly even to files outside the folder.

In the document management keyword extraction system of the embodiment, if a file newly added to folder A (file f in the illustrated example) does not contain an extracted keyword at the corresponding appearance location, the keyword is written at that location and a warning is issued. If file f is a read-only file and cannot be written to, a warning is issued. In this way, keywords matching the management purpose can be set efficiently and correctly even for files newly added to the folder.

Although an embodiment has been described above, the present invention is not limited to that embodiment, and various modifications can be adopted without departing from the scope of the claims.

FIG. 1 is a system configuration diagram of the document management keyword extraction system of the embodiment.
FIG. 2 is an explanatory diagram of the document management keyword extraction system of the embodiment.
FIG. 3 is an explanatory diagram of the extraction procedure in the document management keyword extraction system of the embodiment.
FIG. 4 is an explanatory diagram of the confirmation of the existence of another file in the document management keyword extraction system of the embodiment.
FIG. 5 is an explanatory diagram of the procedure for writing to a new file in the document management keyword extraction system of the embodiment.

Explanation of symbols

1 Server on the administrator side
2 Client computer
3 Database
4 Administrator or user
A Folder
b, c, d Files in folder A
e File outside folder A
f New file

Claims

1. A document management keyword extraction system comprising a client computer, a server, and a database, wherein a plurality of files managed for the same management purpose are stored in a folder; character strings of n consecutive characters (n being a natural number) are extracted for each appearance location in each of the files stored in the folder; character strings whose content and appearance location are common to all of the plurality of files are detected; and each detected character string and its appearance location are extracted and recorded as a keyword matching the management purpose of the folder.

2. The document management keyword extraction system according to claim 1, wherein a character string search is performed, at the appearance location of the keyword, on files other than the plurality of files stored in the folder; a file in which the extracted keyword and the appearance location match is detected; a warning is issued that the matching file should be managed under the management purpose of the folder; and the file to be managed is moved or copied to the folder.

3. The document management keyword extraction system according to claim 2, wherein, when a file newly added to the folder does not contain the extracted keyword at the corresponding appearance location, the extracted keyword is written at the corresponding appearance location and a warning is issued.

4. The document management keyword extraction system according to claim 3, wherein a warning is issued when the newly added file is a read-only file and cannot be written.