JP2011039976A

JP2011039976A - Document storage device and document storage program

Info

Publication number: JP2011039976A
Application number: JP2009189180A
Authority: JP
Inventors: Yukio Uematsu; 幸生植松; Yoshihiko Kazuhara; 良彦数原; Ryoji Kataoka; 良治片岡; Takashi Inoue; 孝史井上
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2009-08-18
Filing date: 2009-08-18
Publication date: 2011-02-24
Anticipated expiration: 2029-08-18
Also published as: JP5281516B2

Abstract

PROBLEM TO BE SOLVED: To hold information on input documents efficiently by reducing the columns used for document importance in a database.

SOLUTION: In a document storage device 1, an index reference part 2 assigns, to the document identifier of each document in an input document group, an internal identifier based on the importance of that document, and stores the identifiers of the documents in a database 7 in an arrangement based on the values of the assigned internal identifiers. The index reference part 2 includes: an internal ID giving means 4 for assigning, to the identifier of each document in the input document group, an internal identifier based on the importance of that document; and a document storing means 5 for storing the identifiers of the documents in the database 7 in an arrangement based on the values of the assigned internal identifiers. The index reference part 2 further includes an importance degree assigning means 3 for rearranging the identifiers of the documents in the input document group on the basis of their importance. The internal ID giving means 4 also calculates the importance value from the value of the internal identifier.

COPYRIGHT: (C)2011, JPO&INPIT

Description

The present invention belongs to the field of databases that store the importance of documents, and relates in particular to a technique used mainly for document retrieval.

The importance of a document is a value used to rank the document when it is retrieved; normally one value can be set per document. A conventional technique for storing the importance assigned to a document is ISAM (Indexed Sequential Access Method) (Non-Patent Document 1). In ISAM, an internal identifier (hereinafter, internal ID) is looked up from the document serving as the primary key, and the value of that internal ID is then used as a key to obtain the desired value associated with it.

A document storage device 10 according to the prior art is described with reference to FIG. 6. The document storage device 10 shown in FIG. 6 includes an index reference unit 11. The index reference unit 11 has at least an internal ID assigning means 12 and a document storage means 13. The internal ID assigning means 12 assigns an internal ID (for example, “1”) to the identifier of a document (hereinafter, document ID) input via the document input unit 14. The document storage means 13 has at least the function of storing the importance of the input document corresponding to the document ID “Document A” in the importance column for that document in the database 15.

In the database 15 illustrated in FIG. 7, the leading primary-key column stores “Document n”, the identifier assigned to an input document. The database 15 also has an “internal ID” column. The internal ID is an identifier allocated to the document ID of each document in the input document group in the order in which the documents were input. For example, the document ID “Document B” is allocated the internal ID “3”. The desired values can then be obtained through the internal ID. In the example in the figure, the internal ID “3” is used to access the data in the third column, yielding the importance value “0.6” and the update date and time “123491588”.

The procedure for creating the database 15 is described with reference to FIGS. 7 and 8.

S001: The value “1” is set in the index reference unit 11 as the initial internal ID i.

S002: The index reference unit 11 receives the ID and importance value of a document input from the document input unit 14. Specifically, for example, “Document A” is input to the index reference unit 11 as the document ID of the input document, and “0.8” as its importance value.

S003: The internal ID assigning means 12 assigns the initial internal ID to the ID of the input document. Specifically, for example, it enters the value “1” in the initial internal ID column corresponding to the ID “Document A” in the database 15.

S004: The document storage means 13 stores the importance in the database 15 using the assigned initial internal ID. Specifically, for example, it stores “0.8” as the importance value in a column appended to the end of the database 15 in FIG. 7.

S005: The index reference unit 11 checks whether a next input document exists. If, for example, a document with the document ID “Document C” exists as shown in FIG. 7, the process proceeds to S006.

S006: The index reference unit 11 adds “1” to the internal ID i and repeats the process from S002. For example, in S004 of the repeated pass, the importance (for example, 0.9) is stored in the database 15 using the incremented internal ID. When S005 finds no next document, the process ends.
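
The prior-art procedure S001 to S006 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not an ISAM implementation: the function name `build_prior_art_table`, the two-dictionary layout, and the input order of the sample documents are hypothetical, chosen only so that the values quoted for FIG. 7 (Document A with 0.8, Document B with internal ID 3 and 0.6) come out as described.

```python
def build_prior_art_table(documents):
    """Assign internal IDs in input order and keep importance in its own column.

    documents: list of (document_id, importance) pairs in input order.
    """
    internal_id_of = {}   # primary key (document ID) -> internal ID
    importance_col = {}   # internal ID -> importance (the separate column)
    internal_id = 1                                # S001
    for document_id, importance in documents:      # S002, repeated via S005
        internal_id_of[document_id] = internal_id  # S003
        importance_col[internal_id] = importance   # S004
        internal_id += 1                           # S006
    return internal_id_of, importance_col


# Hypothetical input order that reproduces the quoted values.
docs = [("Document A", 0.8), ("Document C", 0.9), ("Document B", 0.6)]
internal_id_of, importance_col = build_prior_art_table(docs)

# Reading an importance takes two steps: document ID -> internal ID -> column.
importance_of_b = importance_col[internal_id_of["Document B"]]
```

The second lookup into the importance column is the extra access, and the extra storage, that the invention sets out to remove.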

Norio Sakai, “Introduction to Databases: Let's Understand the Basic Structure of Databases”, [online], December 1997, int21 Corporation, [retrieved July 8, 2009], URL: http://www.int21.co.jp/pcdn/vb/noriolib/vbmag/9712/rdbms/

With a relational database such as the prior art described above, there is a problem that, when the importance assigned to a document is consolidated into a single value, it cannot be stored and referenced efficiently. Because the importance of Web documents and the like can be consolidated into one value, a data structure such as ISAM suffers from poor space efficiency and slow access.

To solve this problem, the present invention gives the internal identifier assigned to each document the additional meaning of importance, which enables fast lookup without accessing a separate column of the database. It also reduces the columns needed to store importance, improving the space efficiency of the database.

One aspect of the document storage device of the present invention is a document storage device that stores identifiers of input documents in a database, comprising index reference means that assigns, to the document identifier of each document in an input document group, an internal identifier based on the importance of that document, and stores the identifier of each document in the database in an arrangement based on the values of the assigned internal identifiers.

The present invention can also take the form of a document storage program that causes a computer to function as the means constituting the above document storage device.

According to the above invention, the columns for document importance in the database are reduced, and information on input documents can be held efficiently.

FIG. 1 is a block diagram of the document storage device according to an embodiment of the invention.
FIG. 2 is an example of a database created by the document storage device according to Embodiment 1.
FIG. 3 is a flowchart explaining the database creation procedure of the document storage device according to Embodiment 1.
FIG. 4 is an example of a database created by the document storage device according to Embodiment 2.
FIG. 5 is a flowchart explaining the database creation procedure of the document storage device according to Embodiment 2.
FIG. 6 is a block diagram of a document storage device according to the prior art.
FIG. 7 is an example of a database created by the document storage device according to the prior art.
FIG. 8 is a flowchart explaining the database creation procedure of the document storage device according to the prior art.

When assigning an internal ID to a document ID, the present invention also gives the internal ID the meaning of importance. This allows importance to be referenced without using an index such as ISAM and reduces the storage needed to hold the data.

The document storage device 1 according to Embodiment 1 of the present invention includes an index reference unit 2, as shown in FIG. 1. The index reference unit 2 receives documents from the document input unit 6 and can access the database 7. The database 7 stores, as its primary key, at least the document ID (document identifier) of each document input via the document input unit 6. The input documents and the database 7 are kept in storage means (not shown) exemplified by a hard disk device or a server device.

The index reference unit 2 assigns, to the document ID of each document in the document group input through the document input unit 6, an internal ID (internal identifier) based on the importance of that document, and stores the document ID of each document in the database 7 in an arrangement based on the values of the assigned internal IDs.

Specifically, as shown in FIG. 1, the index reference unit 2 has the functions of an importance assigning means 3, an internal ID assigning means 4, and a document storage means 5. The importance assigning means 3 sorts the document IDs of the documents input via the document input unit 6 according to the importance of each document. The internal ID assigning means 4 assigns, to the document ID of each document, an internal ID based on the importance of that document. The internal ID assigning means 4 also has the function of calculating the importance value from the value of the internal ID. The document storage means 5 stores the ID of each document in the database 7 in an arrangement based on the values of the assigned internal IDs.

An example of the database 7 created by the document storage device 1 of this embodiment is described with reference to FIG. 2.

The document storage device 1 according to the present invention differs from the prior-art document storage device 10 in that the database holds no importance column for the input documents; instead, the importance of each document is mapped onto its internal ID.

In the prior art, internal IDs were assigned in the order in which documents were input. In the document storage device 1 according to the present invention, internal IDs are allocated in advance in order of document importance, and the document IDs are sorted accordingly before the information for the corresponding documents is stored. In the example shown in FIG. 2, the document ID of the document with the highest importance, “1.0”, is allocated the internal ID “1”.

The procedure (S101 to S107) by which the document storage device 1 of this embodiment creates the database 7 of FIG. 2 is described with reference to FIG. 3.

The only difference from the prior-art procedure (S001 to S006) is that in S102 the importance assigning means 3 sorts the document IDs of the input documents in order of importance. As a result, internal IDs are assigned in descending order of importance, and the relative magnitude of importance is preserved without reserving an extra column for importance values. Note that the method of this embodiment does not yield the importance value itself.

S101: The value “1” is set in the index reference unit 2 as the initial internal ID i.

S102: The importance assigning means 3 sorts the document IDs of the documents input through the document input unit 6 in order of the importance of each document.

S103: The index reference unit 2 enters the document ID of a sorted document and the importance of that document into the database 7. For example, “Document D” is entered in the primary-key column of the database 7 as the document ID of the input document, and “1.0” is entered in the corresponding column of the database 7 as the importance value of that document.

S104: The internal ID assigning means 4 assigns the initial internal ID i set above to the document ID in the database 7. For example, as shown in FIG. 2, “1” is assigned as the initial internal ID i in the column corresponding to the column of the document ID “Document D” in the database 7.

S105: The document storage means 5 stores the importance “1.0” in the database 7 using the assigned initial internal ID i. The update date and time at that point is also recorded in a column of the database 7.

S106: The index reference unit 2 checks whether a next document supplied from the document input unit 6 exists. If one exists, for example a document whose identifier is “Document C” as shown in FIG. 7, the process proceeds to S107.

S107: The index reference unit 2 adds “1” to the internal ID i and repeats the process from S103. For example, in S105 of the repeated pass, the importance value is stored in the database 7 using the incremented internal ID i. For instance, “2” is used as the internal ID i assigned to the document ID “Document B”, and “0.9” is stored in the database 7 as the importance value. When S106 determines that there is no next document, the process ends.
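
The core of Embodiment 1, sorting by importance in S102 and then numbering in that order, can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the names `build_rank_table` and `is_ranked_no_lower` are hypothetical, the entry for Document A is invented sample data, and the sketch deliberately keeps only the internal ID, since the stated aim is to do without a separate importance column.

```python
def build_rank_table(documents):
    """Assign internal IDs in descending order of importance.

    documents: list of (document_id, importance) pairs in any order.
    """
    # S102: sort by importance, highest first (ties keep their input order).
    ordered = sorted(documents, key=lambda pair: pair[1], reverse=True)
    internal_id_of = {}
    internal_id = 1                                # S101
    for document_id, _importance in ordered:       # S103, repeated via S106
        internal_id_of[document_id] = internal_id  # S104
        internal_id += 1                           # S107
    return internal_id_of


def is_ranked_no_lower(table, first, second):
    """True when `first` was numbered at or ahead of `second`,
    which implies its importance is not lower."""
    return table[first] <= table[second]


# Document D (1.0) and Document B (0.9) follow the text; Document A is made up.
docs = [("Document A", 0.8), ("Document B", 0.9), ("Document D", 1.0)]
internal_id_of = build_rank_table(docs)
```

Comparing two internal IDs answers "which document ranks higher" without reading any importance value, but the value itself cannot be recovered, which is the limitation Embodiment 2 addresses.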

As described above, the document storage device 1 of Embodiment 1 takes importance into account when allocating internal IDs, so the relative magnitude of a document's importance can be obtained from its internal ID. In addition, the importance column is eliminated, so information can be held efficiently. In this way, the importance assigned to document data can be held efficiently.

Next, the database creation procedure of the document storage device 1 according to Embodiment 2 of the present invention is described. An example of the database 8 created in Embodiment 2 is shown in FIG. 4.

As in Embodiment 1, Embodiment 2 differs from the prior art in that internal IDs are allocated according to importance. In this example, internal IDs 1 to 10 are allocated to importance 1.0 and internal IDs 11 to 20 to importance 0.9. In Embodiment 2, moreover, the importance can be obtained from the result of dividing the internal ID by a predetermined value. This makes it possible to obtain the absolute importance value of a document from the database that stores its document ID.

The procedure (S201 to S209) for creating the database according to Embodiment 2 is described with reference to FIGS. 4 and 5.

S201: The importance assigning means 3 determines the number of significant digits N from the document group input into the index reference unit 2 via the document input unit 6 and the importance of each document. For example, if importance ranges from 0 to 1 in steps of 0.1, N is 10.

S202: The importance assigning means 3 calculates the maximum frequency M of a single importance value in the input document group. For example, if the importance value shared by the most documents in the input set is 0.1, the number M of documents with that value is counted. M may be set to any value not less than this maximum frequency.

S203: The importance assigning means 3 sorts the identifiers of the documents in the input document group in order of importance. If the order of the URLs has no meaning, sorting is unnecessary.

S204: In the index reference unit 2, the initial values of the array that holds the internal ID counter for each importance level are set. For example, each element of the array is initialized to “1”.

S205: The index reference unit 2 examines the document ID of the first input document and the importance value of that document.

S206: The internal ID assigning means 4 computes an internal ID from the importance value obtained in S205 using formula (1) below, and sets it in a column of the database 8.

Internal ID = (1 − importance) × N × M + i[(1 − importance) × N]   … (1)

S207: The document storage means 5 stores the document ID in the column of the database 8 corresponding to the column that holds the internal ID obtained by formula (1) in S206.

S208: The index reference unit 2 adds 1 to the value of i[(1 − importance) × N] in formula (1).

S209: The index reference unit 2 checks whether any other documents have been supplied from the document input unit 6, and if so, repeats the process from S205.
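
Steps S202 to S209 and formula (1) can be sketched as follows. This is a hedged Python illustration, not the patented implementation: the function name `assign_banded_ids`, the parameter names, and the entry “Document F” are assumptions made for the example, N is passed in as `n_levels` rather than derived as in S201, and importance values are assumed to lie between 0 and 1 on a grid of 1/N.

```python
from collections import Counter


def assign_banded_ids(documents, n_levels, m=None):
    """Assign internal IDs with formula (1): band * M + i[band].

    documents: list of (document_id, importance) pairs.
    n_levels:  N in the text (10 for importance in steps of 0.1).
    m:         M in the text; defaults to the largest per-importance count.
    """
    # S202: largest number of documents sharing one importance value.
    max_frequency = max(Counter(imp for _, imp in documents).values())
    if m is None:
        m = max_frequency
    elif m < max_frequency:
        raise ValueError("M must be at least the largest per-importance count")
    # S203: highest importance first (ties keep their input order).
    ordered = sorted(documents, key=lambda pair: pair[1], reverse=True)
    # S204: one counter i[k] per band k, each initialised to 1.
    counters = [1] * (n_levels + 1)
    internal_id_of = {}
    for document_id, importance in ordered:        # S205, repeated via S209
        # (1 - importance) x N, rounded to absorb floating-point error.
        band = round((1 - importance) * n_levels)
        internal_id_of[document_id] = band * m + counters[band]  # S206, S207
        counters[band] += 1                        # S208
    return internal_id_of, m


# N = 10 and M = 10 follow the FIG. 4 description; "Document F" is made up.
docs = [("Document E", 0.9), ("Document D", 1.0), ("Document F", 0.9)]
docs = [docs[1], docs[2], docs[0]]  # input order: D, F, E
ids, m = assign_banded_ids(docs, n_levels=10, m=10)
```

With these inputs, importance 1.0 occupies the band starting at internal ID 1 and importance 0.9 the band starting at 11, matching the ranges quoted for FIG. 4.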

Next, the method of calculating the importance of a particular document from the database 8 illustrated in FIG. 4, created in steps S201 to S209, is described.

The importance value of a document is calculated by the internal ID assigning means 4. In the database 8 of FIG. 4, N = 10 and M = 10. For example, to obtain the importance of the document whose document ID is Document E, note that the internal ID of Document E is “12”; dividing this value by M (= 10) gives 12 ÷ 10 = 1 remainder 2. The quotient “1” corresponds to the factor “(1 − importance) × N” in the first term on the right-hand side of formula (1). The remainder “2” is the value of the second term on the right-hand side, “i[(1 − importance) × N]”, that is, the initial array value “1” plus “1”. Therefore, solving the equation “(1 − importance) × N = 1” obtained from the first term gives importance = 1 − 1 ÷ N = 1 − 1 ÷ 10 = 0.9. In this way, the importance value “0.9” of the document is obtained from the internal ID value “12”.
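
The decoding just described can be sketched as follows. This is an illustrative Python sketch, with the function name `importance_from_internal_id` and its parameter names assumed for the example. One deliberate departure from the text is labeled in the code: the sketch divides `internal_id - 1` rather than `internal_id` by M, an adjustment assumed here so that a completely full band (for example internal ID 10 when M = 10) still decodes to its own importance. For the worked example with internal ID 12 the result is the same.

```python
def importance_from_internal_id(internal_id, n_levels, m):
    """Recover (importance, position within band) from an internal ID.

    n_levels is N and m is M from formula (1).
    """
    # Assumption: shift by one before dividing so that IDs that are exact
    # multiples of M stay in their own band.
    band, offset = divmod(internal_id - 1, m)
    position = offset + 1                      # the counter value i[band] used
    importance = (n_levels - band) / n_levels  # solves (1 - importance) x N = band
    return importance, position


# Document E in the FIG. 4 example: internal ID 12 with N = 10, M = 10.
importance, position = importance_from_internal_id(12, n_levels=10, m=10)
```

Because the importance comes from integer arithmetic on the internal ID alone, no separate importance column has to be read.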

Therefore, in addition to the effects of the document storage device 1 according to Embodiment 1, the document storage device 1 according to Embodiment 2 makes it possible to obtain the absolute importance value of a document from its internal ID.

The functional means 2 to 8 of the document storage device 1 according to Embodiments 1 and 2 described above can be configured from computer hardware resources such as a CPU, memory (RAM), a hard disk device, and a communication device. That is, the functional means 2 to 8 can be realized through cooperation between the CPU and a program. The functional means 7 and 8 may be stored in recording means exemplified by a hard disk device or a server device.

Furthermore, the present invention can be realized as a document storage program that causes a computer to implement the functional means 2 to 6 of the above embodiments, or as a computer-readable recording medium on which the program is recorded, with the CPU (MPU) of the computer reading and executing the program. In that case, the program read from the recording medium itself realizes the functions of the above embodiments, and the recording medium storing the program, for example a CD-ROM, DVD-ROM, CD-R, MO, or HDD, constitutes the present invention.

DESCRIPTION OF SYMBOLS
1 ... Document storage device
2 ... Index reference unit (index reference means)
3 ... Importance assigning means
4 ... Internal ID assigning means (internal identifier assigning means)
5 ... Document storage means
7, 8 ... Database

Claims

1. A document storage device for storing identifiers of input documents in a database, comprising index reference means for assigning, to the document identifier of each document in an input document group, an internal identifier based on the importance of that document, and for storing the identifier of each document in the database in an arrangement based on the values of the assigned internal identifiers.

2. The document storage device according to claim 1, wherein the index reference means includes:
internal identifier assigning means for assigning, to the identifier of each document in the input document group, an internal identifier based on the importance of that document; and
document storage means for storing the identifier of each document in the database in an arrangement based on the values of the assigned internal identifiers.

3. The document storage device according to claim 2, wherein the index reference means further includes importance assigning means for rearranging the identifiers of the documents of the input document group based on the importance of each document.

4. The document storage device according to claim 2, wherein the internal identifier assigning means calculates the importance value based on the value of the internal identifier.

5. A document storage program for causing a computer to function as the means constituting the document storage device according to claim 1.