JP2013077174A

JP2013077174A - Document management device and program

Info

Publication number: JP2013077174A
Application number: JP2011216731A
Authority: JP
Inventors: Yuki Matsuoka; 有希松岡; Rei Yano; 令矢野
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2011-09-30
Filing date: 2011-09-30
Publication date: 2013-04-25

Abstract

PROBLEM TO BE SOLVED: To provide a document management device, and a program, capable of presenting the items to be written in a document when a user newly prepares the document. SOLUTION: Document storage means stores multiple documents in which various types of items are written. Retrieval means retrieves, from the document storage means, documents matching retrieval criteria specified by a user. Acquisition means acquires documents similar to the retrieved documents from the document storage means. Identification means identifies items commonly written in the retrieved documents and the acquired documents. Presentation means presents the identified items to the user.

Description

Embodiments of the present invention relate to a document management apparatus and a program that present items to be described in a document.

In documents such as software specifications or patents, the items to be described often overlap (that is, are common) among documents in similar fields. For example, in the case of software specifications, a Web application that manages employee information and a Web application that manages pay statements both require specifications (items) related to screens. Patent documents likewise often describe common items.

Japanese Patent Laid-Open No. 09-305390; Japanese Patent Laid-Open No. 08-147152

When a user creates such a document from scratch, particularly a document in a field the user has not handled before, the user may not know which items should be described. In that case, creating the document takes time, and items that should be described are often omitted.

The problem to be solved by the present invention is therefore to provide a document management apparatus and a program capable of presenting the items to be described in a document when a user creates a new document.

The document management apparatus according to the embodiment includes document storage means, search means, acquisition means, specifying means, and presenting means.

The document storage means stores a plurality of documents in which various items are described.

The search means searches the document storage means for documents that match a search condition specified by a user.

The acquisition means acquires, from the document storage means, documents similar to the retrieved document.

The specifying means specifies the items described in common in the retrieved document and the acquired documents.

The presenting means presents the specified items to the user.

FIG. 1 is a block diagram showing the hardware configuration of the document management apparatus according to the embodiment.
FIG. 2 is a block diagram mainly showing the functional configuration of the document management apparatus 30 shown in FIG. 1.
FIG. 3 is a diagram for conceptually explaining the metamodel stored in the document data repository 22 shown in FIG. 2.
FIG. 4 is a diagram showing an example of the data structure of a metamodel file in XML format.
FIG. 5 is a block diagram showing the functional configuration of the document registration unit 32 shown in FIG. 2.
FIG. 6 is a block diagram showing the functional configuration of the document search unit 33 shown in FIG. 2.
FIG. 7 is a block diagram showing the functional configuration of the item acquisition unit 34 shown in FIG. 2.
FIG. 8 is a flowchart showing the procedure of the document registration process executed in the document management apparatus 30 according to the present embodiment.
FIG. 9 is a diagram showing an example of a document describing "functional requirements" that is input by the input unit 31.
FIG. 10 is a diagram showing an example of a document describing "non-functional requirements" that is input by the input unit 31.
FIG. 11 is a diagram for specifically explaining a document converted into XML format.
FIG. 12 is a diagram for specifically explaining a document converted into XML format.
FIG. 13 is a diagram showing an example of the data structure of metadata registered in the document data repository 22.
FIG. 14 is a flowchart showing the procedure of the metamodel update process.
FIG. 15 is a diagram showing an example of the data structure of the metamodel file after the metamodel update process.
FIG. 16 is a flowchart showing the procedure of the item presentation process executed in the document management apparatus 30 according to the present embodiment.
FIG. 17 is a diagram for conceptually explaining a metamodel in which each item included in the metadata shown in FIG. 13 is defined as a class.
FIG. 18 is a diagram showing an example of the metadata of a selected document Ds.
FIG. 19 is a diagram showing an example of the metadata of a target document Do.
FIG. 20 is a diagram showing an example of the items presented to the user.

Embodiments are described below with reference to the drawings.

FIG. 1 is a block diagram showing the hardware configuration of the document management apparatus according to the present embodiment. As shown in FIG. 1, a computer 10 is connected to an external storage device 20 such as a hard disk drive (HDD). The external storage device 20 stores a program 21 executed by the computer 10. The computer 10 and the external storage device 20 constitute a document management apparatus 30.

FIG. 2 is a block diagram mainly showing the functional configuration of the document management apparatus 30 shown in FIG. 1. As shown in FIG. 2, the document management apparatus 30 includes an input unit 31, a document registration unit 32, a document search unit 33, an item acquisition unit 34, and an output unit 35. In the present embodiment, the units 31 to 35 are realized by the computer 10 shown in FIG. 1 executing the program 21 stored in the external storage device 20. The program 21 can be stored in advance in a computer-readable storage medium and distributed. The program 21 may also be downloaded to the computer 10 via, for example, a network.

The document management apparatus 30 also includes a document data repository 22. In the present embodiment, the document data repository 22 is stored in, for example, the external storage device 20.

The document data repository 22 stores a plurality of documents in which various items are described. Each document stored in the document data repository 22 contains words.

For each document it stores, the document data repository 22 also stores metadata describing detailed information about that document (hereinafter, the document's metadata). The metadata includes, for example, the items described in the document and (information indicating) the words contained in the document.

The document data repository 22 further stores a model (hereinafter, the metamodel) that defines the items described in each stored document and the relationships between those items.

The input unit 31 inputs, for example, a document created by a user (document creator). As described above, a document input by the input unit 31 describes various items and contains words.

The input unit 31 also inputs a search condition (for example, a search keyword) specified by a user who is about to create a document.

The document registration unit 32 updates the metamodel stored in the document data repository 22 based on the items described in the document input by the input unit 31. The document registration unit 32 registers the input document in the document data repository 22. It also registers the metadata of the input document in the document data repository 22.

Based on the search condition input by the input unit 31, the document search unit 33 searches the document data repository 22 for documents that match the search condition.

Based on the metadata of each document stored in the document data repository 22 and on the metamodel, the item acquisition unit 34 acquires from the document data repository 22 documents similar to the document retrieved by the document search unit 33 (hereinafter, similar documents).

The item acquisition unit 34 specifies the items described in common in the document retrieved by the document search unit 33 and in the documents similar to it.

The output unit 35 outputs the items specified by the item acquisition unit 34. The items to be described in the document are thereby presented to the user.

The metamodel stored in the document data repository 22 shown in FIG. 2 is now described conceptually with reference to FIG. 3. The document data repository 22 stores one metamodel.

FIG. 3 shows an example of a metamodel for (the requirements of) a software specification. As described above, the metamodel defines the items described in each document stored in the document data repository 22 and the relationships between those items. In the metamodel, an item described in a document is called a class.

A software specification can be created by selecting among the classes defined in the metamodel. For example, for a specification of a Web application, the "screen" class (not shown) is selected and the specification for the "screen" is described. For a specification of a framework, on the other hand, the "screen" class is unnecessary, so no specification for the "screen" is described. Items that should be described but are not among the classes defined in the metamodel can be added.

The metamodel shown in FIG. 3 defines a "functional requirements" class and a "non-functional requirements" class.

A "common function" subclass and a "function" subclass are defined as classes below the "functional requirements" class (hereinafter, subclasses). A "function ID" sub-subclass and a "function name" sub-subclass are defined as classes below the "function" subclass (hereinafter, sub-subclasses). This indicates that, for example, a document (software specification) describing "functional requirements" describes the items "common function" and "function", and describes the items "function ID" and "function name" as attributes of the "function".

Likewise, a "business characteristics" subclass and a "restrictions" subclass are defined as subclasses below the "non-functional requirements" class. A "restriction ID" sub-subclass and a "restriction name" sub-subclass are defined as sub-subclasses below the "restrictions" subclass. This indicates that, for example, a document (software specification) describing "non-functional requirements" describes the items "business characteristics" and "restrictions", and describes the items "restriction ID" and "restriction name" as attributes of the "restrictions".

In other words, in the metamodel stored in the document data repository 22, the items described in one document and the relationships between those items are defined for each first-level node of the metamodel (here, for each of the "functional requirements" class and the "non-functional requirements" class).
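The class hierarchy described for FIG. 3 can be pictured as a nested mapping. The following is only a minimal sketch: the English class names are translations of the item names in the text, and the dictionary layout and the `items_for` helper are illustrative assumptions, not the storage format used by the apparatus.

```python
# Metamodel of FIG. 3 as a nested mapping: class -> subclass -> sub-subclasses.
# The layout is an illustrative assumption; only the class names come from the text.
METAMODEL = {
    "functional requirements": {
        "common function": [],
        "function": ["function ID", "function name"],
    },
    "non-functional requirements": {
        "business characteristics": [],
        "restrictions": ["restriction ID", "restriction name"],
    },
}


def items_for(first_level_class: str) -> list[str]:
    """Return every item that one document of the given first-level class may describe."""
    items = [first_level_class]
    for subclass, subsubclasses in METAMODEL[first_level_class].items():
        items.append(subclass)
        items.extend(subsubclasses)
    return items


print(items_for("functional requirements"))
```

Each first-level key corresponds to one kind of document, which mirrors the statement that items are defined per first-level node.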

Although FIG. 3 describes the metamodel conceptually, the metamodel is stored in the document data repository 22 as a file in XML (Extensible Markup Language) format (hereinafter, the metamodel file), for example as shown in FIG. 4.

The metamodel file shown in FIG. 4 contains "class", "subclass", and "subsubclass" elements, which define the classes. Each "class", "subclass", and "subsubclass" element has a "name" attribute whose value is the class name (that is, the item name) of the class the element defines.

The "class", "subclass", and "subsubclass" elements in the metamodel file shown in FIG. 4 form the hierarchy of the classes defined in the metamodel. A "subclass" element defines a subclass below the class defined by its "class" element. A "subsubclass" element defines a sub-subclass below the subclass defined by its "subclass" element. Each class defined in the metamodel is also given an identifier (id) as an attribute value. Specifically, each "class", "subclass", and "subsubclass" element has an "id" attribute whose value is an identifier of the class the element defines.
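To make the file layout concrete, the sketch below parses a small metamodel file with Python's standard `xml.etree.ElementTree`. FIG. 4 is not reproduced in this text, so the root element name `metamodel` and the `id` values are assumptions; only the `class`, `subclass`, and `subsubclass` element names and the `name` and `id` attributes come from the description.

```python
import xml.etree.ElementTree as ET

# Hypothetical metamodel file; the root element name and the id values are
# assumptions, since FIG. 4 is not reproduced in the text.
METAMODEL_XML = """
<metamodel>
  <class id="c1" name="functional requirements">
    <subclass id="c1-1" name="common function"/>
    <subclass id="c1-2" name="function">
      <subsubclass id="c1-2-1" name="function ID"/>
      <subsubclass id="c1-2-2" name="function name"/>
    </subclass>
  </class>
</metamodel>
"""


def list_classes(xml_text: str) -> list[tuple[int, str, str]]:
    """Return (depth, id, name) for every class definition, in document order."""
    depth_of = {"class": 1, "subclass": 2, "subsubclass": 3}
    root = ET.fromstring(xml_text)
    return [
        (depth_of[el.tag], el.get("id"), el.get("name"))
        for el in root.iter()
        if el.tag in depth_of
    ]


# Print the hierarchy with one indentation step per level.
for depth, class_id, name in list_classes(METAMODEL_XML):
    print("  " * (depth - 1) + f"{name} (id={class_id})")
```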

FIG. 5 is a block diagram showing the functional configuration of the document registration unit 32 shown in FIG. 2. As shown in FIG. 5, the document registration unit 32 includes a document acquisition unit 321, a metamodel acquisition unit 322, a metamodel update unit 323, a metadata generation unit 324, and a registration unit 325.

The document acquisition unit 321 acquires the document input by the input unit 31. The document acquisition unit 321 converts the acquired document into, for example, XML format.

The metamodel acquisition unit 322 acquires the metamodel (file) from the document data repository 22. As described above, the metamodel defines the items described in each document stored in the document data repository 22 and the relationships between those items.

The metamodel update unit 323 compares the items described in the document converted into XML format by the document acquisition unit 321 with the classes defined in the metamodel, and thereby extracts the items (classes) that are described in the document but not defined in the metamodel. The metamodel update unit 323 updates the metamodel by adding (appending) the extracted classes to the metamodel file.

The metadata generation unit 324 generates the metadata of the document based on the document converted into XML format by the document acquisition unit 321. The generated metadata includes the items described in the converted document, the relationships between those items, the words contained in the document, and (information indicating) the number of occurrences of each word in the document.

The registration unit 325 registers, in the document data repository 22, the document converted into XML format by the document acquisition unit 321 and the metadata generated by the metadata generation unit 324.

FIG. 6 is a block diagram showing the functional configuration of the document search unit 33 shown in FIG. 2. As shown in FIG. 6, the document search unit 33 includes a search condition acquisition unit 331 and a search result acquisition unit 332.

The search condition acquisition unit 331 acquires the search condition input by the input unit 31. The search condition includes, for example, a search keyword specified by the user.

Based on the search condition (search keyword) acquired by the search condition acquisition unit 331, the search result acquisition unit 332 searches the document data repository 22 for documents that match the search condition. The search result acquisition unit 332 thereby acquires a search result containing the documents that match the search condition. The search result acquired by the search result acquisition unit 332 is presented to the user via the output unit 35.

FIG. 7 is a block diagram showing the functional configuration of the item acquisition unit 34 shown in FIG. 2. As shown in FIG. 7, the item acquisition unit 34 includes a selected document acquisition unit 341, a metadata acquisition unit 342, a similar document acquisition unit 343, and an item calculation unit 344.

The selected document acquisition unit 341 acquires the document selected by the user from (the documents included in) the search result presented to the user.

The metadata acquisition unit 342 acquires, from the document data repository 22, the metadata of all documents stored in the document data repository 22.

Based on the metadata acquired by the metadata acquisition unit 342, the similar document acquisition unit 343 calculates the similarity between the document acquired by the selected document acquisition unit 341 and each document stored in the document data repository 22. Based on the calculated similarity, the similar document acquisition unit 343 acquires the documents similar to the document acquired by the selected document acquisition unit 341.
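This passage does not state how the similarity is computed. Purely as an illustration, the sketch below uses cosine similarity over the word occurrence counts held in the metadata; the measure, the threshold value, and the document identifiers are all assumptions rather than the method of the embodiment.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_documents(selected: Counter, repository: dict[str, Counter],
                      threshold: float = 0.5) -> list[str]:
    """Return ids of stored documents whose similarity to `selected` meets the threshold."""
    return [doc_id for doc_id, words in repository.items()
            if cosine_similarity(selected, words) >= threshold]


# Hypothetical word counts, as they might appear in document metadata.
selected = Counter({"employee": 3, "screen": 2})
repository = {
    "payroll-spec": Counter({"employee": 2, "screen": 1, "salary": 1}),
    "framework-spec": Counter({"plugin": 4, "api": 2}),
}
print(similar_documents(selected, repository))
```

Any measure that can be derived from the stored metadata could be substituted for `cosine_similarity` without changing the surrounding flow.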

The item calculation unit 344 acquires the metamodel from the document data repository 22. For each class (item) defined in the acquired metamodel, the item calculation unit 344 calculates, as a degree of duplication, the number of documents that describe the item among the document acquired by the selected document acquisition unit 341 and the documents acquired by the similar document acquisition unit 343. Based on the calculated degree of duplication, the item calculation unit 344 specifies the items described in common in the document acquired by the selected document acquisition unit 341 and the documents acquired by the similar document acquisition unit 343. The items specified by the item calculation unit 344 are presented to the user via the output unit 35 as items to be described in the document the user is about to create.
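The degree of duplication described above is a per-item document count. A minimal sketch follows, assuming each document is reduced to the set of item names it describes and that an item counts as "common" once it reaches a caller-supplied minimum count; the `min_count` rule and the sample data are assumptions, since this passage does not give the selection criterion.

```python
from collections import Counter


def duplication_counts(documents: list[set[str]]) -> Counter:
    """For each item, count how many of the given documents describe it."""
    counts = Counter()
    for items in documents:
        counts.update(items)  # a set contributes at most 1 per item
    return counts


def common_items(metamodel_items: list[str], documents: list[set[str]],
                 min_count: int) -> list[str]:
    """Metamodel items described in at least `min_count` documents (assumed rule)."""
    counts = duplication_counts(documents)
    return [item for item in metamodel_items if counts[item] >= min_count]


# Hypothetical data: the selected document followed by two similar documents.
metamodel_items = ["common function", "function", "screen", "exception handling"]
documents = [
    {"common function", "function", "screen"},
    {"function", "screen"},
    {"function", "exception handling"},
]
print(common_items(metamodel_items, documents, min_count=2))
```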

The operation of the document management apparatus 30 according to the present embodiment is described below. The document management apparatus 30 executes a process of registering a document created by a user in the document data repository 22 (hereinafter, the document registration process) and a process of presenting the items to be described in a document when a user creates a new document (hereinafter, the item presentation process).

The procedure of the document registration process executed in the document management apparatus 30 according to the present embodiment is described first with reference to the flowchart of FIG. 8. It is assumed here that the document data repository 22 stores a metamodel file such as the one shown in FIG. 4.

First, the input unit 31 inputs a document created by a user, for example in response to an operation by that user (step S1). The document input by the input unit 31 describes various items and contains words. The input unit 31 may input more than one document.

The document input by the input unit 31 may be of any type, for example a software specification, a patent specification, or a regulations document, but it is assumed to be a document based on the metamodel stored in the document data repository 22.

The documents input by the input unit 31 are now described specifically with reference to FIGS. 9 and 10.

FIG. 9 shows an example of a document describing "functional requirements" in a software specification. In the example shown in FIG. 9, the document describing "functional requirements" has the sheets "common function", "function", "business logic", "exception handling", and "screen", and the "function" sheet describes "function ID", "function name", "description", and "detailed description". The "function ID", "function name", "description", and "detailed description" described in the "function" sheet are attributes of the "function".

In the document shown in FIG. 9, "functional requirements", "common function", "function", "business logic", "exception handling", "screen", "function ID", "function name", "description", and "detailed description" are each an item described in the document.

FIG. 10, on the other hand, shows an example of a document describing "non-functional requirements" in a software specification. In the example shown in FIG. 10, the document describing "non-functional requirements" has the sheets "business characteristics", "restrictions", "memory constraints", "design constraints", and "operating location", and the "restrictions" sheet describes "restriction ID", "restriction name", "description", and "origin". The "restriction ID", "restriction name", "description", and "origin" described in the "restrictions" sheet are attributes of the "restrictions".

In the document shown in FIG. 10, "non-functional requirements", "business characteristics", "restrictions", "memory constraints", "design constraints", "operating location", "restriction ID", "restriction name", "description", and "origin" are each an item described in the document.

Returning to FIG. 8, the document acquisition unit 321 included in the document registration unit 32 acquires the document input by the input unit 31. The document acquired by the document acquisition unit 321 is converted into XML format (step S2). When the document acquisition unit 321 acquires a plurality of documents, all of them are converted into a single XML document (file).

The document converted into XML format is now described specifically with reference to FIGS. 11 and 12. FIGS. 11 and 12 show an example in which the documents shown in FIGS. 9 and 10 have been converted into an XML file.

In the example shown in FIGS. 11 and 12, the content of the "functional requirements" element (that is, the content from <機能要件> to </機能要件>) corresponds to the document describing "functional requirements" shown in FIG. 9. The "functional requirements" element contains an element for each item described in that document. The content of each such element is the text described under the corresponding item.

Similarly, in the example shown in FIGS. 11 and 12, the content of the "non-functional requirements" element (the content from <非機能要件> to </非機能要件>) corresponds to the document describing "non-functional requirements" shown in FIG. 10. The "non-functional requirements" element contains an element for each item described in that document. The content of each such element is the text described under the corresponding item.
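As an illustration of how the items of each source document could be read back from such a converted file, the sketch below parses a small hypothetical example. FIGS. 11 and 12 are not reproduced in this text, so the root element name `documents`, the nesting, and the text content are assumptions; the Japanese element names are the item names used in the description (機能要件 "functional requirements", 共通機能 "common function", 機能 "function", 機能ID "function ID", 機能名 "function name", 非機能要件 "non-functional requirements", 業務特性 "business characteristics").

```python
import xml.etree.ElementTree as ET

# Hypothetical converted file; the root element name and the text content are
# assumptions, since FIGS. 11 and 12 are not reproduced in the text.
CONVERTED_XML = """
<documents>
  <機能要件>
    <共通機能>Login is shared by all screens.</共通機能>
    <機能>
      <機能ID>F001</機能ID>
      <機能名>Employee search</機能名>
    </機能>
  </機能要件>
  <非機能要件>
    <業務特性>Used during business hours.</業務特性>
  </非機能要件>
</documents>
"""


def items_per_document(xml_text: str) -> dict[str, list[str]]:
    """Map each first-level element (one source document) to the items it contains."""
    root = ET.fromstring(xml_text)
    # doc.iter() yields the first-level element itself and then its descendants.
    return {doc.tag: [el.tag for el in doc.iter()] for doc in root}


for name, items in items_per_document(CONVERTED_XML).items():
    print(name, items)
```

The first-level element is included in its own item list, which matches the text treating "functional requirements" itself as an item of the FIG. 9 document.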

Returning again to FIG. 8, the metamodel acquisition unit 322 acquires the metamodel (file) from the document data repository 22.

The metamodel update unit 323 executes the metamodel update process based on the document acquired by the document acquisition unit 321 (the document converted into XML format) and the metamodel acquired by the metamodel acquisition unit 322 (step S3). In the metamodel update process, the items that are described in the document acquired by the document acquisition unit 321 but not defined in the metamodel are extracted, and the metamodel is updated by adding (appending) the extracted items to it as classes. The metamodel update process is described in detail later.
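The comparison at the heart of step S3 can be sketched as a membership check over item names. This is a simplification under stated assumptions: it treats items as flat names and ignores the class hierarchy and the XML file format, and the function names are hypothetical. The sample data are the FIG. 9 items and the "functional requirements" classes of the FIG. 3 metamodel.

```python
def find_undefined_items(document_items: list[str], metamodel_items: set[str]) -> list[str]:
    """Return items written in the document that the metamodel does not define."""
    return [item for item in document_items if item not in metamodel_items]


def update_metamodel(document_items: list[str], metamodel_items: set[str]) -> list[str]:
    """Add undefined items to the metamodel as new classes and return them."""
    new_items = find_undefined_items(document_items, metamodel_items)
    metamodel_items.update(new_items)
    return new_items


# Classes under "functional requirements" in the FIG. 3 metamodel.
metamodel_items = {
    "functional requirements", "common function", "function",
    "function ID", "function name",
}
# Items described in the FIG. 9 document.
document_items = [
    "functional requirements", "common function", "function",
    "business logic", "exception handling", "screen",
    "function ID", "function name", "description", "detailed description",
]
added = update_metamodel(document_items, metamodel_items)
print(added)
```

Because the names are flat, an item such as "description" that appears under more than one parent is not distinguished here; a fuller version would key items by their position in the hierarchy.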

Next, the metadata generation unit 324 performs morphological analysis on the content (character strings) of each item described in the document acquired by the document acquisition unit 321. Specifically, the character strings contained in the content of the elements that represent the items in the XML-converted document are analyzed. Based on the result of the morphological analysis, the metadata generation unit 324 obtains the words contained in the document.

The metadata generation unit 324 generates metadata for the document based on the document acquired by the document acquisition unit 321 and the words it contains (step S4). The generated metadata includes the items described in the document, the relationships between those items, the words contained in the document, and the number of times each word appears in the document (its occurrence count).
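The word-counting part of step S4 can be sketched as follows. This is only an illustration: the source does not name a morphological analyzer, so the `tokenize` function below is a hypothetical stand-in that splits on non-word characters, and the sample document and its element names are invented for the example.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

def tokenize(text):
    # Stand-in for morphological analysis (assumption): split on non-word characters.
    return [w for w in re.split(r"\W+", text) if w]

def count_words(xml_text):
    """Count word occurrences over the text content of every element."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for elem in root.iter():
        if elem.text and elem.text.strip():
            counts.update(tokenize(elem.text))
    return counts

# Hypothetical XML-converted document with two items.
sample_doc = (
    "<spec><functional>"
    "<overview>customer information search</overview>"
    "<detail>customer information management</detail>"
    "</functional></spec>"
)
word_counts = count_words(sample_doc)
```

In a real system the stand-in tokenizer would be replaced by a proper morphological analyzer for Japanese text; the counting logic would stay the same.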

The registration unit 325 registers (stores) the document acquired by the document acquisition unit 321 and the metadata generated for it by the metadata generation unit 324 in the document data repository 22 (step S5). When registering the metadata, the registration unit 325 adds the registration date (information indicating it) to the metadata. The metadata of every document registered in the document data repository 22 therefore includes the registration date of that document (and of its metadata).

FIG. 13 shows an example of the data structure of metadata registered in the document data repository 22. The metadata shown in FIG. 13 is, for example, that of the document identified by doc_00001.

In the metadata shown in FIG. 13, the registration date is given in the "date" element. The items described in the document are given by the elements contained in the "metamodel" element. The words contained in the XML document and their occurrence counts are given as the values of the "name" and "num" attributes of the "word" elements contained in the "words" element.

As shown in FIG. 13, the metadata registered in the document data repository 22 is written in XML format.
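As a rough illustration of the structure just described, the following sketch reads such a metadata file with Python's standard `xml.etree.ElementTree`. The element and attribute names "date", "metamodel", "class", "subclass", "subsubclass", "words", "word", "id", "name" and "num" come from the text; the root element name `metadata`, the date value, and the item names are assumptions made for the example, since FIG. 13 is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata file; only the element/attribute names listed
# in the lead-in are taken from the description.
METADATA = """
<metadata id="doc_00001">
  <date>2011-09-30</date>
  <metamodel>
    <class id="meta_001" name="functional requirements">
      <subclass id="meta_001_0001" name="screen">
        <subsubclass id="meta_001_0001_0001" name="overview"/>
      </subclass>
    </class>
  </metamodel>
  <words>
    <word name="customer" num="10"/>
    <word name="information" num="20"/>
  </words>
</metadata>
"""

def read_metadata(xml_text):
    """Return (registration date, item ids, word -> occurrence count)."""
    root = ET.fromstring(xml_text.strip())
    date = root.findtext("date")
    # Every element below <metamodel> represents one item (class).
    items = [e.get("id") for e in root.find("metamodel").iter() if e.get("id")]
    words = {w.get("name"): int(w.get("num")) for w in root.iter("word")}
    return date, items, words

date, items, words = read_metadata(METADATA)
```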

Next, the procedure of the metamodel update process described above (step S3 in FIG. 8) is explained with reference to the flowchart of FIG. 14. In this process, any items (classes) that are described in the document acquired by the document acquisition unit 321 of the document registration unit 32 but are not defined in the metamodel are added to the metamodel. The process is executed by the metamodel update unit 323 of the document registration unit 32.

In the metamodel update process, steps S11 to S15 below are first executed for each first-level element of the document acquired by the document acquisition unit 321 (the document converted into XML format).

To do so, the metamodel update unit 323 acquires one first-level element from the document acquired by the document acquisition unit 321 (the document converted into XML format) (step S11). A first-level element here is an element one level below the root element of the document. In the document shown in FIGS. 11 and 12, the "functional requirements" and "non-functional requirements" elements are the first-level elements. The element acquired in step S11 is hereinafter called the target element.

Next, the metamodel update unit 323 acquires, from the metamodel file stored in the document data repository 22, an element corresponding to the target element (the first-level element acquired in step S11); this element is hereinafter called the corresponding element (step S12). A corresponding element is an element of the metamodel file at the same level of the hierarchy as the element acquired in step S11. The metamodel update unit 323 then acquires the value of the "name" attribute of the corresponding element.

To illustrate with the metamodel file shown in FIG. 4: when the target element is a first-level element, the corresponding element is a first-level element of the metamodel, for example a "class" element. In this case, the metamodel update unit 323 acquires the value "functional requirements" of the "name" attribute of that "class" element. Hereinafter, the value of the "name" attribute of a corresponding element in the metamodel is simply called the attribute value of the corresponding element.

The metamodel update unit 323 determines whether the element name of the target element matches the attribute value of the corresponding element acquired as described above (step S13).

If the element name of the target element and the attribute value of the corresponding element are determined not to match (NO in step S13), the metamodel update unit 323 refers to the metamodel (file) stored in the document data repository 22 and determines whether there is another corresponding element (step S14). That is, it determines whether any corresponding element exists other than the one acquired in step S12.

If it is determined that the metamodel contains another corresponding element (YES in step S14), the process returns to step S12 and is repeated.

If, on the other hand, it is determined that the metamodel contains no other corresponding element (NO in step S14), the metamodel update unit 323 adds (appends) the target element as a class to the metamodel file stored in the document data repository 22 (step S15). Specifically, it adds to the metamodel file an element (here, a "class" element) whose "name" attribute has the element name of the target element as its value. The metamodel (file) stored in the document data repository 22 is thereby updated. If the document acquired by the document acquisition unit 321 contains elements at levels below the target element, the target element and all elements below it are added to the metamodel file.

After step S15 is executed, it is determined whether steps S11 to S15 have been executed for all first-level elements of the document acquired by the document acquisition unit 321 (step S16).

If it is determined that not all first-level elements have been processed, the process returns to step S11 and is repeated.

If, on the other hand, it is determined that all first-level elements have been processed, the metamodel update process for the first-level elements ends.

The above describes the metamodel update process for the first-level elements of the document acquired by the document acquisition unit 321 (the document converted into XML format); the same process is also executed for the second-level elements of the document.

Among the second-level elements, those subject to the metamodel update process are the elements below first-level elements that were determined in step S13 to match a corresponding element. Elements below a first-level element that was determined in step S13 not to match any corresponding element have already been added to the metamodel in step S15, so the metamodel update process is not executed for them.

In this case, the process is executed with "first-level element" in the above description read as "second-level element". The same applies to elements at the third and lower levels.

By executing the metamodel update process in this way, it is determined, for each element at each level of the document acquired by the document acquisition unit 321, whether that element is defined as a class in the metamodel stored in the document data repository 22. Elements that are not defined as classes in the metamodel (together with the elements below them) are added to the metamodel.
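The update procedure of steps S11 to S16 can be sketched as below. This is a minimal illustration under several assumptions: the function names, the sample document, and the sample metamodel are hypothetical; the metamodel is assumed to use the tags "class", "subclass" and "subsubclass" for levels one to three; and the generation of "id" attribute values for new elements is omitted because the source does not describe it. The source processes one level at a time, whereas this sketch recurses depth-first, which visits the same elements.

```python
import xml.etree.ElementTree as ET

# Assumed metamodel tag for each level of the hierarchy.
LEVEL_TAGS = ["class", "subclass", "subsubclass"]

def to_class(elem, level):
    """Convert a document element and its descendants into metamodel elements."""
    node = ET.Element(LEVEL_TAGS[level], {"name": elem.tag})
    if level + 1 < len(LEVEL_TAGS):
        for child in elem:
            node.append(to_class(child, level + 1))
    return node

def update_metamodel(doc_parent, meta_parent, level=0):
    """Add items found under doc_parent that are missing under meta_parent."""
    if level >= len(LEVEL_TAGS):
        return
    for target in doc_parent:  # S11: take one target element
        # S12-S14: look for a corresponding element whose "name" matches.
        match = next(
            (c for c in meta_parent if c.get("name") == target.tag), None
        )
        if match is None:
            # S15: no match, so add the element and everything below it.
            meta_parent.append(to_class(target, level))
        else:
            # Matched: only its lower-level elements still need checking.
            update_metamodel(target, match, level + 1)

document = ET.fromstring(
    "<spec><functional><screen><overview/><detail/></screen></functional></spec>"
)
metamodel = ET.fromstring(
    '<metamodel><class name="functional"><subclass name="screen">'
    '<subsubclass name="overview"/></subclass></class></metamodel>'
)
update_metamodel(document, metamodel)
added = [e.get("name") for e in metamodel.iter("subsubclass")]
```

After the call, the hypothetical "detail" item, which the metamodel did not define, appears as a new "subsubclass" element next to "overview".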

FIG. 15 shows an example of the data structure of the metamodel file after the metamodel update process. The description here assumes that the metamodel file shown in FIG. 4 has been updated.

Compared with the metamodel file shown in FIG. 4, the metamodel file shown in FIG. 15 has an additional "subsubclass" element whose "name" attribute has the value "detailed description".

This indicates that the "detailed description" element of the document acquired by the document acquisition unit 321 (that is, the "detailed description" item described in the document) was not defined as a class in the metamodel and was therefore added to it.

Next, the procedure of the item presentation process executed by the document management apparatus 30 according to this embodiment is described with reference to the flowchart of FIG. 16. It is assumed that the document data repository 22 stores, as described above, a plurality of documents, metadata for each document, and a single metamodel that defines the items described in those documents and the relationships between the items. As noted above, the documents, metadata, and metamodel stored in the document data repository 22 are XML files.

First, the input unit 31 receives a search keyword specified by the user (step S21).

Next, the search condition acquisition unit 331 of the document search unit 33 acquires the search keyword received by the input unit 31.

The search result acquisition unit 332 of the document search unit 33 uses the search keyword acquired by the search condition acquisition unit 331 to search the document data repository 22 for documents that match the keyword (step S22). The search result acquisition unit 332 thereby obtains a search result containing the matching documents.

The search result acquisition unit 332 outputs the obtained search result via the output unit 35 (step S23). The search result (the documents it contains) is thereby presented to the user.

The user can then select a desired document from the presented search result, for example a document that resembles the one the user intends to create.

The document management apparatus 30 determines whether the user has selected a document (step S24).

If it is determined that the user has not selected a document (NO in step S24), the process returns to step S21 and is repeated.

If, on the other hand, it is determined that the user has selected a document (YES in step S24), the selected document acquisition unit 341 of the item acquisition unit 34 acquires the document selected by the user (hereinafter, the selected document), for example from the document data repository 22 (step S25).

Next, the metadata acquisition unit 342 of the item acquisition unit 34 acquires the metadata of all documents stored in the document data repository 22 (step S26).

Based on the metadata acquired by the metadata acquisition unit 342, the similar document acquisition unit 343 of the item acquisition unit 34 obtains (computes) documents similar to the selected document (hereinafter, similar documents) from among the documents stored in the document data repository 22 (step S27).

To do so, the similar document acquisition unit 343 calculates the similarity between the selected document and each document stored in the document data repository 22. It then acquires, as similar documents (documents similar to the selected document), those documents whose calculated similarity is greater than or equal to a predetermined value (threshold).
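The threshold test can be sketched as a simple filter. The function name, the document identifiers other than the doc_NNNNN pattern, the similarity values, and the threshold below are all hypothetical; the only point taken from the text is that a document qualifies when its similarity is greater than or equal to the threshold.

```python
def select_similar(similarities, threshold):
    """Return the ids of documents whose similarity is at least the threshold."""
    return [doc_id for doc_id, score in similarities.items() if score >= threshold]

# Hypothetical similarity of each stored document to the selected document.
scores = {"doc_00002": 1.2, "doc_00003": 0.4, "doc_00004": 0.8}
similar_docs = select_similar(scores, threshold=0.8)
```

A document whose similarity equals the threshold (doc_00004 here) is included, matching the "greater than or equal to" condition.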

The similarity calculation performed by the similar document acquisition unit 343 is now described. The similar document acquisition unit 343 calculates a first similarity and a second similarity. In the following description, a document whose similarity to the selected document is being calculated is called a target document.

First, the first similarity calculated by the similar document acquisition unit 343 is described. It is calculated using two metamodels: one in which each item contained in the metadata of the selected document (the items described in the selected document and the relationships between them) is defined as a class (hereinafter, the metamodel of the selected document), and one in which each item contained in the metadata of the target document (the items described in the target document and the relationships between them) is defined as a class (hereinafter, the metamodel of the target document).

The metamodel of a document (selected document or target document) is described concretely with reference to FIG. 17. The metamodel of a document is generated from the document's metadata, and one is generated for each document stored in the document data repository 22.

In the example shown in FIG. 17, the metamodel of the document contains, as classes, the "metamodel" element of the document's metadata and each element contained in that "metamodel" element (that is, the items those elements represent).

Specifically, the metamodel of a document contains, as classes, the "class" elements (one level below the "metamodel" element), the "subclass" elements (one level below a "class" element), and the "subsubclass" elements (one level below a "subclass" element) of the document's metadata.

In calculating the first similarity, the metamodel of the selected document is treated as a tree, and the comparison with the target document is made subtree by subtree, where each subtree consists of a first-level element (class) and the elements below it. In the example shown in FIG. 17, the portions enclosed by broken lines 100a and 100b are the subtrees that are compared with the target document.

Let Ms be the metamodel of the selected document and Mo the metamodel of the target document. Denoting the similarity between Ms and Mo (that is, the first similarity) by sim_metamodel(Ms, Mo), sim_metamodel(Ms, Mo) is given by the following equation (1).

In equation (1), Msi denotes each subtree of Ms; class(Msi) denotes the number of classes contained in the subtree Msi; and match(Mo) denotes the number of classes contained in Mo that match classes contained in Msi.
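Equation (1) itself is not reproduced in this text. A form consistent with the symbol definitions above and with the worked example later in this description, which adds 4/6 and 4/5 for two subtrees, would be the following; it is a reconstruction under that assumption, not the original rendering.

```latex
\mathrm{sim\_metamodel}(M_s, M_o) \;=\; \sum_{i} \frac{\mathrm{match}(M_o)}{\mathrm{class}(M_{si})}
```

Here the sum runs over the subtrees M_{si} of M_s, and match(M_o) is evaluated separately for each subtree.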

Next, the second similarity calculated by the similar document acquisition unit 343 is described. It is calculated using the words contained in the metadata of the selected document and of the target document (that is, the words contained in those documents) and the occurrence counts of those words.

Here, the tf-idf value of each word contained in the selected document and the target document is calculated. Let tf(t, d) be the number of occurrences (frequency) of word t in document d. The word t is defined in a "word" element of the metadata of document d, and tf(t, d) is the value of the "num" attribute of that "word" element.

Let N be the number of documents stored in the document data repository 22 and df(t) the number of documents in which word t appears at least once; idf(t) is then given by the following equation (2).

The weight w(t, d) of word t in document d is given by the following equation (3).
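Equations (2) and (3) are not reproduced in this text. Forms consistent with the definitions of N, df(t) and tf(t, d) above and with the worked example later in this description, which computes the weight of a word as 10 * log(10/3) = 5.2, would be the following. They are reconstructions under that assumption; the numerical value 5.2 implies a base-10 logarithm.

```latex
\mathrm{idf}(t) = \log_{10}\frac{N}{\mathrm{df}(t)} \qquad (2)

w(t, d) = \mathrm{tf}(t, d) \cdot \mathrm{idf}(t) \qquad (3)
```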

Furthermore, let Ds be the selected document and let w_1, w_2, ..., w_p be the tf-idf values of the words contained in Ds; the vector of the selected document Ds is then given by the following equation (4).

Likewise, let Do be the target document and let w_1, w_2, ..., w_p be the tf-idf values of the words contained in Do; the vector of the target document Do is then given by the following equation (5).

The vector of the target document Do holds, for each tf-idf value in the vector of the selected document Ds, the tf-idf value of the corresponding word in Do. That is, if a word contained in the selected document Ds does not appear in the target document Do, the tf-idf value of that word for Do is 0.
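The alignment of the two vectors can be sketched as follows. The function name and all weights except 5.2 are hypothetical; the sketch only illustrates that both vectors are indexed by the words of the selected document and that a word missing from the target document contributes 0.

```python
def align_vectors(selected_weights, target_weights):
    """Build the vectors of Ds and Do over the words of the selected document."""
    terms = list(selected_weights)
    vector_ds = [selected_weights[t] for t in terms]
    # A word of Ds that does not occur in Do gets the value 0.
    vector_do = [target_weights.get(t, 0.0) for t in terms]
    return terms, vector_ds, vector_do

# Hypothetical tf-idf weights keyed by word.
ds_weights = {"customer": 5.2, "information": 2.0, "search": 1.5}
do_weights = {"customer": 2.1, "management": 0.7}
terms, vector_ds, vector_do = align_vectors(ds_weights, do_weights)
```

Words that occur only in the target document ("management" here) do not add a dimension, because the vector of Do is built over the words of Ds.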

Denoting by sim_words(Ds, Do) the similarity based on the words contained in the selected document Ds and the target document Do and on their occurrence counts (that is, the second similarity), sim_words(Ds, Do) is given by the following equation (6).

Once the first and second similarities have been calculated as described above, the similar document acquisition unit 343 calculates the similarity between the selected document and the target document. Denoting the similarity between the selected document Ds and the target document Do by sim(Ds, Do), sim(Ds, Do) is given by the following equation (7).

If the similarity between the selected document Ds and the target document Do calculated by equation (7) is greater than or equal to the threshold, the similar document acquisition unit 343 acquires the target document Do as a document similar to the selected document Ds, that is, as a similar document.

The calculation of the similarity between the selected document Ds and the target document Do (that is, sim(Ds, Do)) is now described concretely. It is assumed that the metadata shown in FIG. 18 is that of the selected document Ds and that the metadata shown in FIG. 19 is that of the target document Do.

First, the first similarity, which is calculated using the metamodels of the selected document Ds and the target document Do, is described concretely. Let Ms be the metamodel of the selected document Ds and Mo the metamodel of the target document Do.

According to the metadata of the selected document Ds shown in FIG. 18, the metamodel Ms of the selected document Ds contains a subtree whose root is the "class" element with the "id" attribute value "meta_001" (hereinafter, subtree Ms1) and a subtree whose root is the "class" element with the "id" attribute value "meta_002" (hereinafter, subtree Ms2).

The subtree Ms1 contains, as classes, the "class" element of the metadata of the selected document Ds whose "id" attribute value is "meta_001", together with the "subclass" and "subsubclass" elements below it (that is, the items those elements represent). Specifically, according to the metadata of the selected document Ds shown in FIG. 18, the subtree Ms1 contains the "class" element with "id" attribute value "meta_001", the "subclass" element with "id" attribute value "meta_001_0001", and the "subsubclass" elements with "id" attribute values "meta_001_0001_0001", "meta_001_0001_0002", "meta_001_0001_0003", and "meta_001_0001_0007". Accordingly, class(Ms1), the number of classes (elements) contained in the subtree Ms1, is 6.

According to the metadata of the target document Do shown in FIG. 19, the metamodel Mo of the target document Do contains the following elements (that is, the items they represent) as classes: the "class" element with "id" attribute value "meta_001"; the "subclass" element with "id" attribute value "meta_001_0001"; the "subsubclass" elements with "id" attribute values "meta_001_0001_0001", "meta_001_0001_0002", "meta_001_0001_0004", and "meta_001_0001_0009"; the "class" element with "id" attribute value "meta_002"; the "subclass" element with "id" attribute value "meta_002_0001"; and the "subsubclass" elements with "id" attribute values "meta_002_0001_0001", "meta_002_0001_0002", and "meta_002_0001_0005".

In this case, the classes (items) in the subtree Ms1 that match classes contained in the metamodel Mo of the target document Do are the "class" element with "id" attribute value "meta_001", the "subclass" element with "id" attribute value "meta_001_0001", and the "subsubclass" elements with "id" attribute values "meta_001_0001_0001" and "meta_001_0001_0002". Accordingly, match(Mo), the number of classes in the subtree Ms1 that match classes contained in Mo, is 4.

The subtree Ms2, on the other hand, contains, as classes, the "class" element of the metadata of the selected document Ds whose "id" attribute value is "meta_002", together with the "subclass" and "subsubclass" elements below it. Specifically, according to the metadata of the selected document Ds shown in FIG. 18, the subtree Ms2 contains the "class" element with "id" attribute value "meta_002", the "subclass" element with "id" attribute value "meta_002_0001", and the "subsubclass" elements with "id" attribute values "meta_002_0001_0001", "meta_002_0001_0002", and "meta_002_0001_0003". Accordingly, class(Ms2), the number of classes contained in the subtree Ms2, is 5.

In this case, the classes (items) in the subtree Ms2 that match classes contained in the metamodel Mo of the target document Do are the "class" element with "id" attribute value "meta_002", the "subclass" element with "id" attribute value "meta_002_0001", and the "subsubclass" elements with "id" attribute values "meta_002_0001_0001" and "meta_002_0001_0002". Accordingly, match(Mo), the number of classes in the subtree Ms2 that match classes contained in Mo, is 4.

Using the above values, the similarity (first similarity) sim_metamodel(Ms, Mo) between the metamodel Ms of the selected document Ds and the metamodel Mo of the target document Do is calculated as 4/6 + 4/5 = 1.3.

Next, the second similarity, which is calculated using the words contained in the metadata of the selected document Ds and the target document Do and the number of occurrences of those words, will be described in detail. Here, the number N of documents stored in the document data repository 22 is assumed to be 10.

According to the metadata of the selected document Ds shown in FIG. 18, the selected document Ds contains the words "customer", "information", "management", and "search". The word "customer" occurs 10 times in the selected document Ds, the word "information" 20 times, the word "management" 5 times, and the word "search" 5 times.

The words contained in the selected document Ds can be obtained from the attribute value of the "name" attribute of each "word" element in the metadata of the selected document Ds. The number of occurrences of each word can be obtained from the attribute value of the "num" attribute of the same "word" element.

Here, it is assumed that, among the documents stored in the document data repository 22, the number of documents in which the word "customer" contained in the selected document Ds occurs at least once (that is, df(customer)) is 3. In this case, the tf/idf value of the word "customer" in the selected document Ds (that is, w(customer, selected document Ds)) is 10 * log(10/3) = 5.2.

It is also assumed that the number of documents in the document data repository 22 in which the word "information" occurs at least once (that is, df(information)) is 5. In this case, the tf/idf value of the word "information" in the selected document Ds (that is, w(information, selected document Ds)) is 20 * log(10/5) = 6.

It is also assumed that the number of documents in the document data repository 22 in which the word "management" occurs at least once (that is, df(management)) is 1. In this case, the tf/idf value of the word "management" in the selected document Ds (that is, w(management, selected document Ds)) is 5 * log(10/1) = 5.

It is also assumed that the number of documents in the document data repository 22 in which the word "search" occurs at least once (that is, df(search)) is 3. In this case, the tf/idf value of the word "search" in the selected document Ds (that is, w(search, selected document Ds)) is 5 * log(10/3) = 2.6.
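The four weights above can be reproduced with a short sketch. The function name `tf_idf` and the dictionary layout are illustrative assumptions; the base-10 logarithm is also an assumption, chosen because it reproduces the values stated in the text (for example, 10 * log(10/3) = 5.2).

```python
import math


def tf_idf(tf, df, n_docs):
    """Weight of a word: occurrences in the document times log10(N / df)."""
    return tf * math.log10(n_docs / df)


N = 10  # number of documents in the repository, as assumed in the text

# word -> (occurrences in the selected document Ds, document frequency df)
ds_words = {
    "customer": (10, 3),
    "information": (20, 5),
    "management": (5, 1),
    "search": (5, 3),
}

# Round to one decimal place for comparison with the values in the text.
ds_weights = {
    word: round(tf_idf(tf, df, N), 1) for word, (tf, df) in ds_words.items()
}
print(ds_weights)
```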

Accordingly, the vector of the selected document Ds is given by the following equation (8).
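Equation (8) is not reproduced in this text. From the four tf/idf values calculated for the selected document Ds, it would presumably take the following form; the notation and the component order (customer, information, management, search) are assumptions.

```latex
\vec{D_s}
  = \bigl( w(\text{customer}, D_s),\; w(\text{information}, D_s),\;
           w(\text{management}, D_s),\; w(\text{search}, D_s) \bigr)
  = (5.2,\; 6,\; 5,\; 2.6)
```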

On the other hand, according to the metadata of the target document Do shown in FIG. 19, the target document Do contains the words "employee", "information", "search", and "registration". The word "employee" occurs 10 times in the target document Do, the word "information" 19 times, the word "search" 6 times, and the word "registration" 6 times.

The words contained in the target document Do can be obtained from the attribute value of the "name" attribute of each "word" element in the metadata of the target document Do. The number of occurrences of each word can be obtained from the attribute value of the "num" attribute of the same "word" element.

Here, it is assumed that, among the documents stored in the document data repository 22, the number of documents in which the word "employee" contained in the target document Do occurs at least once (that is, df(employee)) is 1. In this case, the tf/idf value of the word "employee" in the target document Do (that is, w(employee, target document Do)) is 10 * log(10/1) = 10.

It is also assumed that the number of documents in the document data repository 22 in which the word "information" occurs at least once (that is, df(information)) is 5. In this case, the tf/idf value of the word "information" in the target document Do (that is, w(information, target document Do)) is 19 * log(10/5) = 5.7.

It is also assumed that the number of documents in the document data repository 22 in which the word "search" occurs at least once (that is, df(search)) is 3. In this case, the tf/idf value of the word "search" in the target document Do (that is, w(search, target document Do)) is 6 * log(10/3) = 3.1.

It is also assumed that the number of documents in the document data repository 22 in which the word "registration" occurs at least once (that is, df(registration)) is 3. In this case, the tf/idf value of the word "registration" in the target document Do (that is, w(registration, target document Do)) is 6 * log(10/3) = 3.1.

As described above, the vector of the target document Do consists of the tf/idf values, in the target document Do, of the same words whose tf/idf values make up the vector of the selected document Ds. Accordingly, the vector of the target document Do is given by the following equation (9).
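Equation (9) is not reproduced in this text. Taking the components over the words of the selected document Ds (customer, information, management, search), it would presumably take the following form. The zero entries are an assumption: "customer" and "management" are not among the words listed for the target document Do, so their number of occurrences, and hence their tf/idf value, is taken to be 0.

```latex
\vec{D_o}
  = \bigl( w(\text{customer}, D_o),\; w(\text{information}, D_o),\;
           w(\text{management}, D_o),\; w(\text{search}, D_o) \bigr)
  = (0,\; 5.7,\; 0,\; 3.1)
```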

Using the vector of the selected document Ds given by equation (8) and the vector of the target document Do given by equation (9), the similarity (second similarity) sim_words(Ds, Do), which is based on the words contained in the selected document Ds and the target document Do and the number of occurrences of those words, can be calculated as in the following equation (10).
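Equation (10) is not reproduced in this text. As an assumption, the sketch below uses the cosine similarity between the two vectors, which is a common choice for tf/idf vectors. With the Ds components (5.2, 6, 5, 2.6) from the tf/idf values above and the Do components (0, 5.7, 0, 3.1) taken over the same words (zero where a word is not listed for Do), it gives approximately 0.67, consistent with the value of sim_words(Ds, Do) used in the next paragraph. The function name `cosine_similarity` is illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Component order: customer, information, management, search.
ds_vec = [5.2, 6.0, 5.0, 2.6]   # selected document Ds
do_vec = [0.0, 5.7, 0.0, 3.1]   # target document Do (assumed zeros for absent words)

sim_words = cosine_similarity(ds_vec, do_vec)
print(round(sim_words, 2))
```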

Using sim_metamodel(Ms, Mo) and sim_words(Ds, Do) described above, the similarity sim(Ds, Do) between the selected document Ds and the target document Do is sim_metamodel(Ms, Mo) + sim_words(Ds, Do) = 1.3 + 0.67 = 1.97. If the threshold for acquiring similar documents is, for example, 1.5, the target document Do is acquired as a document similar to the selected document Ds.

Although only the similarity between the selected document Ds and the target document Do has been described here, the similarity to the selected document Ds is calculated for every document stored in the document data repository 22.
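The selection of similar documents can be sketched as follows. The function name `select_similar`, the document identifier "Dx" and its scores are hypothetical, and treating a similarity equal to the threshold as similar (greater than or equal) is an assumption, since the text does not say whether the comparison is strict.

```python
def select_similar(scores, threshold):
    """Return ids of documents whose combined similarity reaches the threshold.

    scores: document id -> (sim_metamodel, sim_words) relative to the selected document.
    """
    return [
        doc_id
        for doc_id, (sim_metamodel, sim_words) in scores.items()
        if sim_metamodel + sim_words >= threshold
    ]


scores = {
    "Do": (1.3, 0.67),  # values from the text: 1.3 + 0.67 = 1.97
    "Dx": (0.5, 0.2),   # hypothetical dissimilar document
}
similar = select_similar(scores, threshold=1.5)
print(similar)
```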

Returning to FIG. 16, the item calculation unit 344 included in the item acquisition unit 34 uses the selected document and the similar documents to calculate a degree of duplication for each item (class) defined in the metamodel stored in the document data repository 22 (step S28). The degree of duplication of an item defined in the metamodel is the number of documents, among the selected document and the similar documents, in which that item appears.

The degree of duplication calculated by the item calculation unit 344 will now be described. Here, it is assumed that the selected document is document A and the similar documents are documents B to E.

In this case, the item calculation unit 344 determines whether each item (class) defined in the metamodel is contained in each of the documents A to E. Specifically, the item calculation unit 344 determines whether the "class", "subclass", and "subsubclass" elements defined in the metamodel are contained in the metadata of the documents A to E. This determination is made by comparing the "id" attribute values of the "class", "subclass", and "subsubclass" elements defined in the metamodel with the "id" attribute values of the "class", "subclass", and "subsubclass" elements contained in the metadata of the documents A to E.

If an item defined in the metamodel is contained in all of the documents A to E, the degree of duplication of that item is 5. Conversely, if an item defined in the metamodel is contained in none of the documents A to E, the degree of duplication of that item is 0.

Next, the item calculation unit 344 specifies the items (classes) whose calculated degree of duplication is equal to or greater than a predetermined value (threshold).
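The calculation of the degree of duplication and the threshold filtering can be sketched as follows. This is an illustration under stated assumptions: the function names, the contents of documents A to E, the item "meta_003", and the threshold of 3 are all hypothetical; only the comparison by "id" attribute value and the "equal to or greater than" rule come from the text.

```python
def duplication_degrees(metamodel_ids, documents):
    """For each metamodel item id, count the documents whose metadata contains it."""
    return {
        item_id: sum(1 for ids in documents.values() if item_id in ids)
        for item_id in metamodel_ids
    }


def items_to_present(degrees, threshold):
    """Keep the items whose degree of duplication is at least the threshold."""
    return [item_id for item_id, degree in degrees.items() if degree >= threshold]


metamodel_ids = ["meta_001", "meta_001_0001", "meta_003"]

# Hypothetical "id" values found in the metadata of documents A to E.
documents = {
    "A": {"meta_001", "meta_001_0001"},
    "B": {"meta_001"},
    "C": {"meta_001", "meta_001_0001"},
    "D": {"meta_001"},
    "E": {"meta_001", "meta_001_0001"},
}

degrees = duplication_degrees(metamodel_ids, documents)
presented = items_to_present(degrees, threshold=3)
print(degrees, presented)
```

Here "meta_001" appears in all five documents (degree 5) and "meta_003" in none (degree 0), matching the two boundary cases described in the text.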

The item calculation unit 344 outputs the specified items via the output unit 35 (step S29). As a result, the items specified by the item calculation unit 344 (that is, the items whose degree of duplication is equal to or greater than the threshold) are presented to the user as items to be described in the document.

FIG. 20 shows an example of the items presented to the user. In the example shown in FIG. 20, the items presented to the user are the "class" element whose "id" attribute value is "meta_001", the "subclass" element whose "id" attribute value is "meta_001_0001", the "subsubclass" elements whose "id" attribute values are "meta_001_0001_0001" and "meta_001_0001_0002", the "class" element whose "id" attribute value is "meta_002", the "subclass" element whose "id" attribute value is "meta_002_0001", and the "subsubclass" elements whose "id" attribute values are "meta_002_0001_0001" and "meta_002_0001_0002".

The items to be described in the document may be presented to the user in the format shown in FIG. 20. However, so that the user can recognize the items more easily, it is preferable to refer to the metamodel stored in the document data repository 22 and present, for example, the item name of each item (the attribute value of the "name" attribute of the corresponding "class", "subclass", or "subsubclass" element in the metamodel).
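Resolving item ids to item names could look like the sketch below. The XML layout is an assumption: the text only states that the metamodel has "class", "subclass", and "subsubclass" elements with "id" and "name" attributes, so the root element `metamodel`, the nesting, and the example names are hypothetical, as is the function name `item_names`.

```python
import xml.etree.ElementTree as ET

# Hypothetical metamodel excerpt; element and attribute names follow the text,
# the root element and the item names are invented for illustration.
METAMODEL_XML = """
<metamodel>
  <class id="meta_001" name="Screen specification">
    <subclass id="meta_001_0001" name="Screen layout">
      <subsubclass id="meta_001_0001_0001" name="Input fields"/>
    </subclass>
  </class>
</metamodel>
"""

ITEM_TAGS = {"class", "subclass", "subsubclass"}


def item_names(metamodel_xml, item_ids):
    """Map each item id to its "name" attribute; fall back to the id if unknown."""
    root = ET.fromstring(metamodel_xml)
    names = {
        element.get("id"): element.get("name")
        for element in root.iter()
        if element.tag in ITEM_TAGS
    }
    return [names.get(item_id, item_id) for item_id in item_ids]


labels = item_names(METAMODEL_XML, ["meta_001", "meta_001_0001_0001", "meta_999"])
print(labels)
```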

As described above, in the present embodiment, a document that matches a search condition (search keyword) specified by the user is retrieved from the document data repository 22, documents similar to the retrieved document are acquired from the document data repository 22, the items commonly described in the retrieved document and the acquired documents are specified, and the specified items are presented to the user. With this configuration, items to be described in a document can be presented when the user creates a new document. In other words, by presenting the items to be described in the document that the user is about to create, the present embodiment reduces the man-hours required to create a new document and prevents items required in the document from being omitted.

Furthermore, in the present embodiment, a document created by the user is input, the input document is registered in the document data repository 22, and the metamodel stored in the document data repository 22 is updated by adding to it those items described in the input document that are not yet defined in the metamodel. With this configuration, when the user creates a document, the metamodel can be updated automatically according to the items described in that document.

In the present embodiment, documents similar to the selected document (similar documents) are acquired based on both the first and second similarities. However, similar documents may instead be acquired based on only one of the first and second similarities. The methods of calculating the first and second similarities described above are merely examples, and similar documents may be acquired based on similarities calculated by other methods.

The present invention is not limited to the above embodiment as it is, and in the implementation stage the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiment. For example, some constituent elements may be deleted from the full set of constituent elements shown in the embodiment.

DESCRIPTION OF SYMBOLS

10 ... computer, 20 ... external storage device, 22 ... document data repository (document storage means), 30 ... document management apparatus, 31 ... input unit, 32 ... document registration unit, 33 ... document search unit, 34 ... item acquisition unit, 35 ... output unit.

Claims

1. A document management apparatus comprising:
document storage means for storing a plurality of documents in which various items are described;
search means for searching the document storage means, based on a search condition specified by a user, for a document that matches the search condition;
acquisition means for acquiring a document similar to the retrieved document from the document storage means;
specifying means for specifying items commonly described in the retrieved document and the acquired document; and
presenting means for presenting the specified items to the user.

2. The document management apparatus according to claim 1, wherein the acquisition means
includes generating means for generating, for each document stored in the document storage means, a metamodel in which the items described in the document are defined, and
similarity calculation means for calculating, based on the generated metamodels, the similarity between the retrieved document and each of the documents stored in the document storage means, and
acquires a document similar to the retrieved document from the document storage means based on the calculated similarity.

3. The document management apparatus according to claim 1, wherein
each of the documents stored in the document storage means includes words, and
the acquisition means
includes calculation means for calculating the similarity between the retrieved document and each of the documents stored in the document storage means based on the words included in each of the documents stored in the document storage means, and
acquires a document similar to the retrieved document from the document storage means based on the calculated similarity.

4. The document management apparatus according to claim 1, wherein
the document storage means further stores a metamodel in which the items described in each of the documents stored in the document storage means and the relationships between the items are defined, and
the specifying means
includes degree-of-duplication calculation means for calculating, for each item defined in the metamodel, the number of documents among the retrieved document and the acquired document in which the item is described, as a degree of duplication, and
specifies the items commonly described in the retrieved document and the acquired document based on the calculated degree of duplication.

5. The document management apparatus according to claim 4, further comprising:
input means for inputting a document created by the user;
registration means for registering the input document in the document storage means; and
updating means for updating the metamodel stored in the document storage means by adding to the metamodel those items described in the input document that are not defined in the metamodel.

6. A program executed by a computer of a document management apparatus that includes an external storage device having document storage means for storing a plurality of documents in which various items are described, and the computer, which uses the external storage device,
the program causing the computer to execute:
a step of searching the document storage means, based on a search condition specified by a user, for a document that matches the search condition;
a step of acquiring a document similar to the retrieved document from the document storage means;
a step of specifying items commonly described in the retrieved document and the acquired document; and
a step of presenting the specified items to the user.