JP2014160486A - Inconsistency detection device, program and method, and correction support device, program and method

Info

Publication number: JP2014160486A
Application number: JP2014077708A
Authority: JP
Inventors: Ryo Ishizaki; 諒石崎; Masahiro Asaoka; 正洋麻岡; Isao Nanba; 功難波
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2009-12-15
Filing date: 2014-04-04
Publication date: 2014-09-04
Anticipated expiration: 2030-06-21
Also published as: JP5790820B2; JP2011146019A; JP5648336B2

Abstract

PROBLEM TO BE SOLVED: To complement omitted data in a document. SOLUTION: An inconsistency detection device has: a document DB which stores, for every document, an independent word group extracted from a sentence included in the document and an item name group extracted from item definitions including item names included in the sentence and definitions of those item names; a data storage section which stores the independent word group and the item name group of the document to be diagnosed; means for specifying, from the document DB, a similar document which is a document similar to the document to be diagnosed, and extracting the independent word group and the item name group of the similar document from the document DB; means for extracting, from the item name group of the similar document, item names that match independent words included in the independent word group stored in the data storage section; and means for specifying, as inconsistent item names, those extracted item names which are not included in the item name group stored in the data storage section.

Description

The present technology relates to a technique for assessing or improving the quality of documents.

In system development, various design documents are created in each development process. Because the quality of the design documents greatly affects the progress of the system development project, and ultimately the quality of the completed system, a technique for appropriately managing design document quality is required.

In the UI (User Interface) process, for example, a design document is created that contains text describing the specifications of the processing logic (hereinafter referred to as processing details) and a fixed-format form that defines the item names included in the processing details (hereinafter referred to as the item definition). An example of such a design document is shown in FIG. 1.

Quality control of such a design document requires checking whether there is any inconsistency between the item definition and the processing details. Two kinds of inconsistency are meant here: an item name that is defined in the item definition but does not appear in the processing details, and an item name that appears in the processing details but is not defined in the item definition.

The former inconsistency is easy to find. In the example of FIG. 1, it suffices to search the processing details for the item name “office code” (事業所コード) and determine whether the processing details contain it.
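The search described above can be sketched as a plain substring test. This is a minimal illustration, not the patent's implementation: the function name `undescribed_item_names`, the second item name, and the sample processing-details text are all assumptions made up for the example.

```python
def undescribed_item_names(item_names, processing_details):
    """Return item names that are defined but never appear in the processing details."""
    return [name for name in item_names if name not in processing_details]


# Hypothetical item definition and processing details (illustrative only).
item_names = ["事業所コード", "事業所名称"]
processing_details = "事業所コードを画面に表示し、通知メッセージを出力する。"

missing = undescribed_item_names(item_names, processing_details)
print(missing)
```

A substring test is enough here because the item name is known in advance; the harder case is the reverse direction, where the item name is not yet known.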

The latter inconsistency, on the other hand, is not easy to find. The processing details contain many words and phrases other than item names. A method that simply extracts noun phrases from the processing details and checks whether each one is defined in the item definition would therefore flag, as undefined item names, noun phrases that should never be defined in the item definition at all. In the example of FIG. 1, noun phrases such as “screen” and “notification message” would be judged to be undefined item names, which is not appropriate.

Conventionally, as a technique for extracting phrases from a document, a technique is known in which search keys (keywords) for extracting item names are prepared in advance, character strings containing the keyword characters are picked out of the specification document, unnecessary characters are deleted, and the names of the items to be subjected to a consistency check are extracted.

With the above prior art, however, item names suitable as targets of a consistency check cannot be extracted with high accuracy unless appropriate keywords can be selected by some means. In addition, preparing search keywords in advance according to the content of each specification document is costly and lacks versatility.

The former inconsistency also has the following problem. Specifically, the prior art can detect the inconsistency and can extract, from the processing details of other design documents, data related to an item name that is missing from the processing details, but it cannot identify the position at which the extracted data should be inserted.

JP 2008-186356 A

Accordingly, an object of the present technology is, according to one aspect, to provide a technique for complementing data that is missing from a document.

An inconsistency detection apparatus according to one aspect of the present technology includes: (A) a document database that stores, for each document, an independent word group extracted from text included in the document and an item name group extracted from an item definition that includes the item names included in the text and the definitions of those item names; (B) a data storage unit that stores the independent word group and the item name group of a first document to be diagnosed; (C) similar document specifying means that calculates the similarity between the independent word group of each document stored in the document database and the independent word group stored in the data storage unit, specifies a document whose similarity is equal to or greater than a predetermined threshold as a similar document, and extracts the independent word group and the item name group of the specified similar document from the document database; (D) item candidate extracting means that extracts, from the item name group extracted by the similar document specifying means, item names that match a first independent word, which is an independent word included in the independent word group stored in the data storage unit; and (E) inconsistent item specifying means that specifies, as inconsistent item names, those item names extracted by the item candidate extracting means that are not included in the item name group stored in the data storage unit.
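Steps (D) and (E) can be sketched as follows, assuming the similar documents have already been specified and that "match" means exact string equality (the embodiment described later also allows partial matches). The function name and all sample data are hypothetical.

```python
def find_inconsistent_item_names(target_words, target_item_names, similar_item_name_groups):
    """Sketch of (D) and (E): item names of similar documents that appear as
    independent words in the diagnosed document but are not defined in it."""
    word_set = set(target_words)
    defined = set(target_item_names)

    # (D) Candidate item names: item names of similar documents that match
    # an independent word of the diagnosed document (exact match assumed).
    candidates = []
    for group in similar_item_name_groups:
        for name in group:
            if name in word_set and name not in candidates:
                candidates.append(name)

    # (E) Inconsistent item names: candidates missing from the diagnosed
    # document's own item name group.
    return [name for name in candidates if name not in defined]


# Hypothetical data for the diagnosed document and two similar documents.
target_words = ["画面", "事業所コード", "事業所名称", "通知メッセージ"]
target_item_names = ["事業所コード"]
similar_groups = [["事業所コード", "事業所名称", "受払番号"], ["事業所名称"]]

inconsistent = find_inconsistent_item_names(target_words, target_item_names, similar_groups)
print(inconsistent)
```

Note that “画面” and “通知メッセージ” are not reported, because no similar document defines them as item names. This is how the approach avoids flagging every noun phrase.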

A correction support apparatus according to another aspect of the present technology includes: (A) a database that stores, for each process, item definition data defining the item names related to the process and processing detail data defining the content of the process; (B) an item extraction unit that reads first item definition data and first processing detail data corresponding to the first item definition data from the database, and extracts, as an inconsistent item name, an item name that is defined in the first item definition data and is not included in the first processing detail data; (C) a supplementary data specifying unit that, for second processing detail data that includes the inconsistent item name among the processing detail data stored in the database, specifies the position at which the inconsistent item name appears in the second processing detail data and stores position information representing that position in a storage device; (D) a corresponding position specifying unit that extracts from the database, as common item names, the item names defined in both the first item definition data and second item definition data corresponding to the second processing detail data, specifies the positions at which each common item name appears in the first processing detail data and in the second processing detail data, and stores position information representing those positions in the storage device in association with the common item name; and (E) a range specifying unit that uses the data stored in the storage device to specify, among the common item names, the common item name that appears immediately before the inconsistent item name in the second processing detail data and the common item name that appears immediately after it, as an immediately preceding item name and an immediately following item name.

Data that is missing from a document can be complemented.

FIG. 1 is a diagram for explaining the premise of the present embodiment.
FIG. 2 is a functional block diagram of the inconsistency detection apparatus according to the first embodiment.
FIG. 3 is a diagram illustrating an example of data stored in the design document DB.
FIGS. 4(a) to 4(c) are diagrams for explaining a method of generating the independent word list of the processing details.
FIGS. 5(a) and 5(b) are diagrams for explaining a method of generating the item name list of the item definition.
FIG. 6 is a diagram illustrating the main processing flow in the first embodiment.
FIG. 7 is a diagram illustrating the processing flow of the similar design document specifying process.
FIG. 8 is a diagram illustrating the processing flow of the item candidate extraction process.
FIG. 9 is a diagram illustrating an example of data stored in the matching item data storage unit.
FIG. 10 is a diagram illustrating the processing flow of the item candidate extraction process.
FIG. 11 is a diagram illustrating an example of data stored in the item candidate storage unit.
FIG. 12 is a diagram illustrating the processing flow of the appearance part comparison process.
FIG. 13 is a diagram illustrating an example of data stored in the common item data storage unit.
FIG. 14 is a diagram illustrating the processing flow of the appearance part comparison process.
FIG. 15 is a diagram illustrating an example of data stored in the similar part data storage unit.
FIG. 16 is a diagram illustrating the processing flow of the appearance part comparison process.
FIG. 17 is a diagram illustrating an example of data stored in the distance data storage unit.
FIG. 18 is a diagram illustrating the processing flow of the item definition comparison process.
FIG. 19 is a functional block diagram of the correction support apparatus according to the second embodiment.
FIG. 20 is a diagram illustrating an example of item definition data stored in the design document DB.
FIG. 21 is a diagram illustrating an example of processing detail data stored in the design document DB.
FIG. 22 is a diagram illustrating the main processing flow in the second embodiment.
FIG. 23 is a diagram illustrating an example of item definition data stored in the input data storage unit.
FIG. 24 is a diagram illustrating an example of processing detail data stored in the input data storage unit.
FIG. 25 is a diagram illustrating an example of data stored in the supplementary sentence storage unit.
FIG. 26 is a diagram illustrating the processing flow of the range narrowing process.
FIG. 27 is a diagram illustrating an example of common item name data.
FIG. 28 is a diagram illustrating an example of data stored in the line number list storage unit.
FIG. 29 is a diagram illustrating an example of data stored in the corresponding position data storage unit.
FIG. 30 is a diagram illustrating the processing flow of the corresponding position specifying process.
FIG. 31 is a diagram illustrating an example of data stored in the narrowing result storage unit.
FIG. 32 is a diagram for explaining the range narrowing process.
FIG. 33 is a diagram for explaining processing performed by the corresponding position specifying unit.
FIG. 34 is a diagram for explaining processing performed by the corresponding position specifying unit.
FIG. 35 is a diagram illustrating the processing flow of the first supplement position determination process.
FIG. 36 is a diagram illustrating an example of window data.
FIG. 37 is a diagram for explaining processing performed by the search unit.
FIG. 38 is a diagram illustrating an example of data stored in the output data storage unit.
FIG. 39 is a diagram for explaining the first supplement position determination process.
FIG. 40 is a diagram illustrating the processing flow of the second supplement position determination process.
FIG. 41 is a diagram illustrating an example of window data.
FIG. 42 is a functional block diagram of a computer.

The present embodiment is described in detail below. It assumes that a design document includes an item definition and processing details.

[Embodiment 1]
First, a process is described for detecting the inconsistency in which an item name appears in the processing details but is not defined in the item definition (the first inconsistency).

FIG. 2 shows a functional block diagram of the inconsistency detection apparatus according to the first embodiment. The apparatus includes: (A) an input data processing unit 1 that accepts input of the processing details and item definition of the design document to be diagnosed and generates an independent word list and an item name list; (B) an input data storage unit 3 that stores the independent word list and item name list generated by the input data processing unit 1; (C) a design document DB 7 that stores an independent word list and an item name list for each design document; (D) a similar design document specifying unit 9 that performs the similar design document specifying process described later, based on the independent word list stored in the input data storage unit 3 and the independent word lists stored in the design document DB 7; (E) a similar design document storage unit 11 that stores the independent word lists and item name lists of the similar design documents specified by the similar design document specifying unit 9; (F) an item candidate extraction unit 5 that performs the item candidate extraction process described later, based on the data stored in the input data storage unit 3 and the data stored in the similar design document storage unit 11; (G) an item candidate storage unit 13 that stores the data of the item candidates extracted by the item candidate extraction unit 5; (H) a first inconsistent item specifying unit 15 that specifies inconsistent items from the data stored in the item candidate storage unit 13, based on the item name list stored in the input data storage unit 3; (I) a first inconsistent item storage unit 17 that stores the data of the inconsistent items specified by the first inconsistent item specifying unit 15; and (J) an output unit 19 that outputs the data stored in the first inconsistent item storage unit 17.

The item candidate extraction unit 5 includes a matching item extraction unit 501, a matching item data storage unit 503, a narrowing unit 505, a common item data storage unit 507, a similar part data storage unit 509, and a distance data storage unit 511. The narrowing unit 505 includes an item definition comparison unit 5051 and an appearance part comparison unit 5053.

The matching item extraction unit 501 generates matching item data based on the data stored in the input data storage unit 3 and the data stored in the similar design document storage unit 11, and stores it in the matching item data storage unit 503. The narrowing unit 505 specifies item candidates from the data stored in the matching item data storage unit 503, based on the data stored in the input data storage unit 3 and the similar design document storage unit 11, and stores them in the item candidate storage unit 13. The item definition comparison unit 5051 performs the item definition comparison process described later on the data stored in the matching item data storage unit 503, based on the item name lists stored in the similar design document storage unit 11 and the item name list stored in the input data storage unit 3, and stores the result in the item candidate storage unit 13. The appearance part comparison unit 5053 stores in the common item data storage unit 507 those item names in the item name lists stored in the similar design document storage unit 11 that match an item name in the item name list stored in the input data storage unit 3. The appearance part comparison unit 5053 also generates similar part data based on the independent word lists stored in the similar design document storage unit 11 and the data stored in the common item data storage unit 507, and stores it in the similar part data storage unit 509. Furthermore, the appearance part comparison unit 5053 generates distance data based on the data stored in the similar part data storage unit 509 and the data stored in the matching item data storage unit 503, and stores it in the distance data storage unit 511.

FIG. 3 shows an example of the data stored in the design document DB 7. In the example of FIG. 3, a screen ID, a process name, a processing details file name, an item definition file name, the independent word list of the processing details, and the item name list of the item definition are stored. The design document DB 7 thus stores data for each design document identified by a screen ID. The independent word list and the item name list are generated by the methods described below.

The methods for generating the independent word list and the item name list are briefly described with reference to FIGS. 4 and 5. First, the method for generating the independent word list is described with reference to FIG. 4. In the present embodiment, morphological analysis is performed on the processing details to decompose the text into morphemes and determine the part of speech of each morpheme. For example, morphological analysis of processing details such as those in FIG. 4(a) yields an analysis result such as that in FIG. 4(b). Independent words are then extracted by joining morphemes in the analysis result into compound words. For example, applying this joining process to the analysis result in FIG. 4(b) yields a list of independent words such as that in FIG. 4(c). Extracting independent words in this way is a well-known technique and is not described further here.
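The joining step can be sketched as follows. The patent does not state the joining rule, so the rule used here is an assumption: consecutive nouns are concatenated into one compound word, verbs, adjectives, and adverbs are kept as they are, and everything else (particles and so on) is dropped. The input is assumed to be already tagged by some morphological analyzer; the tag names and sample sentence are hypothetical.

```python
# Parts of speech treated as independent words (assumed tag set).
INDEPENDENT_POS = {"noun", "verb", "adjective", "adverb"}


def extract_independent_words(morphemes):
    """Join consecutive nouns into compounds and keep other independent words.

    `morphemes` is a list of (surface, part_of_speech) pairs, i.e. the kind of
    output a morphological analyzer would produce.
    """
    words = []
    noun_run = []
    for surface, pos in morphemes:
        if pos == "noun":
            noun_run.append(surface)  # extend the current compound
            continue
        if noun_run:
            words.append("".join(noun_run))  # close the compound
            noun_run = []
        if pos in INDEPENDENT_POS:
            words.append(surface)
    if noun_run:
        words.append("".join(noun_run))
    return words


# Hypothetical analysis result for "事業所コードを画面に表示する".
morphemes = [
    ("事業所", "noun"), ("コード", "noun"), ("を", "particle"),
    ("画面", "noun"), ("に", "particle"),
    ("表示", "noun"), ("する", "verb"),
]
independent_words = extract_independent_words(morphemes)
print(independent_words)
```

Joining “事業所” and “コード” matters because item names are typically compounds; matching on single morphemes would miss them.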

Next, the method for generating the item name list is described with reference to FIG. 5. The item name list is generated by extracting the item names from the item definition. For example, processing an item definition such as that in FIG. 5(a) generates an item name list such as that in FIG. 5(b). The number to the left of each item name is the item ID, which is assigned automatically.
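A minimal sketch of this extraction, assuming the item definition form has been read into a list of rows and that item IDs are sequential numbers starting at 1. The row layout and the field names `item_name`, `type`, and `digits` are hypothetical; the patent only says that item names are extracted and IDs are assigned automatically.

```python
def build_item_name_list(item_definition_rows):
    """Return (item_id, item_name) pairs with IDs assigned in row order."""
    return [
        (item_id, row["item_name"])
        for item_id, row in enumerate(item_definition_rows, start=1)
    ]


# Hypothetical item definition rows; only "item_name" is used.
item_definition_rows = [
    {"item_name": "事業所コード", "type": "char", "digits": 4},
    {"item_name": "事業所名称", "type": "char", "digits": 20},
]
item_name_list = build_item_name_list(item_definition_rows)
print(item_name_list)
```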

Next, the processing performed by the inconsistency detection apparatus shown in FIG. 2 is described with reference to FIGS. 6 to 18. First, the input data processing unit 1 of the inconsistency detection apparatus accepts input of the processing details and item definition of the design document to be diagnosed and stores them in a storage device such as main memory (FIG. 6: step S1). The input may be accepted directly from the user via an input device such as a keyboard or mouse. Alternatively, a list of design documents may be presented to the user, and the processing details and item definition of the design document the user selects may be obtained from a storage device (not shown) or from another computer connected via a network.

The input data processing unit 1 then extracts independent words from the input processing details to generate an independent word list, extracts item names from the input item definition to generate an item name list, and stores both in the input data storage unit 3 (step S3). The methods for generating the independent word list and the item name list have already been described with reference to FIGS. 4 and 5, so the description is omitted here. The input data storage unit 3 stores the independent word list in the same data format as FIG. 4(c) and the item name list in the same data format as FIG. 5(b).

The similar design document specifying unit 9 then performs the similar design document specifying process (step S5), which is described in detail with reference to FIG. 7. The similar design document specifying unit 9 obtains one unprocessed independent word list from the design document DB 7 (FIG. 7: step S21). It then calculates the similarity between the obtained independent word list and the independent word list stored in the input data storage unit 3 (step S23). In step S23, the similarity is calculated, for example, as the number of independent words common to both lists divided by the number of independent words in the list stored in the input data storage unit 3. The similar design document specifying unit 9 then determines whether the similarity is equal to or greater than a predetermined threshold (step S25).

If the similarity is less than the predetermined threshold (step S25: No route), the process proceeds to step S29. If the similarity is equal to or greater than the predetermined threshold (step S25: Yes route), the similar design document specifying unit 9 stores the obtained independent word list and the item name list corresponding to it in the similar design document storage unit 11 (step S27). The item name list is obtained from the design document DB 7. The similar design document storage unit 11 stores data in the same data format as the design document DB 7.

The similar design document storage unit 11 then determines whether all independent word lists in the design document DB 7 have been processed (step S29). If not all have been processed (step S29: No route), the process returns to step S21 to handle the next independent word list. If all have been processed (step S29: Yes route), the process returns to the calling process.

Performing the above process makes it possible to appropriately specify design documents whose content is similar to that of the design document to be diagnosed.
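Steps S21 to S29 can be sketched as a single loop over the design document DB. The similarity follows the example given for step S23 (common words divided by the number of words in the diagnosed document's list). Counting distinct words with sets, the record layout, the screen IDs, and the threshold value 0.5 are all assumptions made for this illustration.

```python
def similarity(db_words, target_words):
    """Step S23: shared independent words / independent words of the diagnosed document."""
    target = set(target_words)
    if not target:
        return 0.0
    return len(target & set(db_words)) / len(target)


def find_similar_documents(design_db, target_words, threshold):
    """Steps S21 to S29: keep every DB record whose similarity reaches the threshold."""
    similar = []
    for record in design_db:                                   # S21 / S29 loop
        score = similarity(record["independent_words"], target_words)  # S23
        if score >= threshold:                                 # S25
            similar.append(record)                             # S27
    return similar


# Hypothetical DB records and diagnosed document.
design_db = [
    {"screen_id": "G001",
     "independent_words": ["事業所コード", "事業所名称", "画面", "登録"],
     "item_names": ["事業所コード", "事業所名称"]},
    {"screen_id": "G002",
     "independent_words": ["受払番号", "画面", "検索"],
     "item_names": ["受払番号"]},
]
target_words = ["事業所コード", "事業所名称", "画面", "通知メッセージ"]

similar_docs = find_similar_documents(design_db, target_words, threshold=0.5)
print([doc["screen_id"] for doc in similar_docs])
```

Because the denominator is the diagnosed document's word count, the measure is asymmetric: it asks how much of the diagnosed document is covered by the DB document, not the reverse.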

Returning to FIG. 6, the item candidate extraction unit 5 performs the item candidate extraction process (step S7), which is described in detail with reference to FIGS. 8 to 18. First, the matching item extraction unit 501 of the item candidate extraction unit 5 specifies one unprocessed independent word from the independent word list stored in the input data storage unit 3 (FIG. 8: step S31). The matching item extraction unit 501 also specifies one unprocessed item name list from the similar design document storage unit 11 (step S33), and then specifies one unprocessed item name from the item name list specified in step S33 (step S35). The matching item extraction unit 501 then calculates the similarity in notation between the independent word specified in step S31 and the item name specified in step S35 (step S37). In step S37, the similarity is calculated using, for example, a technique such as that disclosed in Japanese Laid-Open Patent Publication No. 6-83871.

The matching item extraction unit 501 then determines whether the similarity calculated in step S37 is equal to or greater than a predetermined threshold (step S39). If the similarity is less than the predetermined threshold (step S39: No route), the process proceeds to step S43.

If the similarity calculated in step S37 is equal to or greater than the predetermined threshold (step S39: Yes route), the matching item extraction unit 501 generates matching item data containing the independent word and item name for which the similarity was calculated in step S37, and stores it in the matching item data storage unit 503 (step S41).

図９に、一致項目データ格納部５０３に格納されるデータの一例を示す。図９の例では、自立語ＩＤの列と、自立語の列と、類似設計書ＩＤの列と、項目ＩＤの列と、項目名の列と、一致度の列とが含まれる。自立語ＩＤの列には、例えばステップＳ３１において特定された自立語が何番目に特定された自立語であるかに従い割り振られる番号を格納する。類似設計書ＩＤ及び項目ＩＤの列には、ステップＳ３５において特定された項目名に対応する画面ＩＤ及び項目ＩＤを類似設計書格納部１１から取得して格納する。また、一致度は、ステップＳ３７において算出された類似度が１、すなわち自立語と項目名とが完全に一致する場合には「完全一致」とし、ステップＳ３７において算出された類似度が所定の閾値以上１未満であれば「部分一致」とする。例えば図９においては、自立語「事業所コード」と項目名「事業所コード」とは同一の語句であるので、一致度の列には「完全一致」が格納されている。また、自立語「事業所名」と項目名「事業所名称」とは類似度は高いが同一の語句ではないので、一致度の列には「部分一致」が格納されている。一方、自立語「事業所区分名」に対して項目名が「受払番号」であるような場合、ステップＳ３７において算出される類似度が所定の閾値に満たないため、一致項目データ格納部５０３にはデータは格納されない。 FIG. 9 shows an example of the data stored in the matching item data storage unit 503. The example of FIG. 9 includes an independent word ID column, an independent word column, a similar design document ID column, an item ID column, an item name column, and a match degree column. The independent word ID column stores, for example, a number assigned according to the order in which the independent word was identified in step S31. The similar design document ID and item ID columns store the screen ID and item ID corresponding to the item name identified in step S35, acquired from the similar design document storage unit 11. The match degree is set to “complete match” when the similarity calculated in step S37 is 1, that is, when the independent word and the item name match completely, and to “partial match” when the similarity calculated in step S37 is equal to or greater than the predetermined threshold and less than 1. For example, in FIG. 9, the independent word “office code” (事業所コード) and the item name “office code” are the same term, so “complete match” is stored in the match degree column. The independent word “office name” (事業所名) and the item name “office name (formal)” (事業所名称) are highly similar but not identical, so “partial match” is stored in the match degree column. By contrast, when the item name for the independent word “office category name” (事業所区分名) is “receipt/payment number” (受払番号), the similarity calculated in step S37 falls below the predetermined threshold, so no data is stored in the matching item data storage unit 503.

図８の説明に戻り、一致項目抽出部５０１は、全ての項目名について処理したか判断する（ステップＳ４３）。全ての項目名について処理していない場合（ステップＳ４３：Ｎｏルート）、次の項目名について処理を実施するため、ステップＳ３５の処理に戻る。 Returning to the description of FIG. 8, the matching item extraction unit 501 determines whether all item names have been processed (step S43). If all the item names have not been processed (step S43: No route), the process returns to step S35 in order to execute the process for the next item name.

一方、全ての項目名について処理した場合（ステップＳ４３：Ｙｅｓルート）、一致項目抽出部５０１は、全ての項目名リストについて処理したか判断する（ステップＳ４５）。全ての項目名リストについて処理していない場合（ステップＳ４５：Ｎｏルート）、次の項目名リストについて処理を実施するため、ステップＳ３３の処理に戻る。 On the other hand, when all item names have been processed (step S43: Yes route), the matching item extraction unit 501 determines whether all item name lists have been processed (step S45). If all the item name lists have not been processed (step S45: No route), the process returns to step S33 to execute the process for the next item name list.

一方、全ての項目名リストについて処理した場合（ステップＳ４５：Ｙｅｓルート）、一致項目抽出部５０１は、全ての自立語について処理したか判断する（ステップＳ４７）。全ての自立語について処理していない場合（ステップＳ４７：Ｎｏルート）、次の自立語について処理を実施するため、ステップＳ３１の処理に戻る。 On the other hand, when all item name lists have been processed (step S45: Yes route), the matching item extraction unit 501 determines whether all independent words have been processed (step S47). If all the independent words are not processed (step S47: No route), the process returns to the process of step S31 in order to execute the process for the next independent word.

一方、全ての自立語について処理した場合（ステップＳ４７：Ｙｅｓルート）、処理は端子Ａを介してステップＳ４９（図１０）の処理に移行する。 On the other hand, when all independent words have been processed (step S47: Yes route), the process proceeds to the process of step S49 (FIG. 10) via the terminal A.

このように、類似度が所定の閾値以上である一致項目を、項目候補になる可能性がある項目名としてまず特定する。そして、さらに以下で述べるような絞り込みを行うことにより項目候補を特定する。 As described above, a matching item having a similarity equal to or higher than a predetermined threshold is first identified as an item name that may become an item candidate. Then, candidate items are specified by further narrowing down as described below.
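
The loop of steps S31 to S47 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the notation similarity of step S37 is replaced here by `difflib.SequenceMatcher.ratio()` from the Python standard library, which is not the technique of Japanese Patent Laid-Open No. 6-83871, and the threshold value `0.8`, the function names, and the row layout are hypothetical.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.8  # hypothetical value; the description only says "predetermined threshold"


def notation_similarity(word, item_name):
    """Stand-in for step S37 (assumption: not the measure used in the patent)."""
    return SequenceMatcher(None, word, item_name).ratio()


def extract_matching_items(independent_words, similar_docs, threshold=THRESHOLD):
    """Return matching item rows for every (independent word, item name) pair.

    independent_words: independent words of the design document under diagnosis.
    similar_docs: {similar design document ID: [(item ID, item name), ...]}.
    """
    rows = []
    for word_id, word in enumerate(independent_words, start=1):  # S31 / S47
        for doc_id, items in similar_docs.items():               # S33 / S45
            for item_id, item_name in items:                     # S35 / S43
                sim = notation_similarity(word, item_name)       # S37
                if sim < threshold:                              # S39: No route
                    continue
                rows.append({                                    # S41
                    "word_id": word_id,
                    "word": word,
                    "doc_id": doc_id,
                    "item_id": item_id,
                    "item_name": item_name,
                    "degree": "complete" if sim == 1.0 else "partial",
                })
    return rows
```

With this stand-in measure, 事業所コード against 事業所コード yields a complete match and 事業所名 against 事業所名称 yields a partial match, while 事業所区分名 against 受払番号 shares no characters and produces no row.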

図１０の説明に移行して、絞り込み部５０５は、一致項目データ格納部５０３から未処理の自立語を１つ特定し、当該自立語に対応付けられている項目名、類似設計書ＩＤ及び一致度を特定し、メインメモリ等の記憶装置に格納する（図１０：ステップＳ４９）。そして、一致項目データ格納部５０３は、特定された一致度に、「完全一致」である一致度が含まれるか判断する（ステップＳ５１）。「完全一致」である一致度が含まれる場合（ステップＳ５１：Ｙｅｓルート）、絞り込み部５０５は、「完全一致」である項目名を項目候補格納部１３に格納する（ステップＳ６３）。例えば図９において、自立語「受払番号」に対応付けられている項目名には「受払番号」があるが、この項目名の一致度は「完全一致」であるため、項目名「受払番号」は項目候補格納部１３に格納される。類似設計書において定義されている項目名が処理詳細中の自立語と完全に一致するのであれば、診断対象の設計書においても定義すべき項目名である可能性が高いからである。 Turning to FIG. 10, the narrowing-down unit 505 identifies one unprocessed independent word from the matching item data storage unit 503, identifies the item names, similar design document IDs, and match degrees associated with that independent word, and stores them in a storage device such as main memory (FIG. 10: step S49). The matching item data storage unit 503 then determines whether the identified match degrees include a “complete match” (step S51). When a “complete match” is included (step S51: Yes route), the narrowing-down unit 505 stores the item name whose match degree is “complete match” in the item candidate storage unit 13 (step S63). For example, in FIG. 9, the item names associated with the independent word “receipt/payment number” (受払番号) include “receipt/payment number”, and because the match degree of this item name is “complete match”, the item name “receipt/payment number” is stored in the item candidate storage unit 13. This is because an item name defined in a similar design document that completely matches an independent word in the processing details is likely to be an item name that should also be defined in the design document under diagnosis.

図１１に、項目候補格納部１３に格納されるデータの一例を示す。図１１の例では、項目名が格納されるようになっている。 FIG. 11 shows an example of data stored in the item candidate storage unit 13. In the example of FIG. 11, item names are stored.

図１０の説明に戻り、「完全一致」である一致度が含まれない場合（ステップＳ５１：Ｎｏルート）、絞り込み部５０５は、ステップＳ４９において特定された項目名が１つであるか判断する（ステップＳ５３）。ステップＳ４９において特定された項目名が１つである場合（ステップＳ５３：Ｙｅｓルート）、絞り込み部５０５は、ステップＳ４９において特定された項目名を項目候補格納部１３に格納する（ステップＳ６３）。例えば図９において、自立語「事業所区分名」に対応付けられている項目名は、一致度が「部分一致」である項目名「事業所区分」だけであるため、この項目名が項目候補格納部１３に格納される。一致度が「部分一致」であっても項目名が１つしか特定されていなければ、項目名の絞り込みを行う必要はないためである。 Returning to the description of FIG. 10, when no “complete match” is included (step S51: No route), the narrowing-down unit 505 determines whether exactly one item name was identified in step S49 (step S53). When exactly one item name was identified in step S49 (step S53: Yes route), the narrowing-down unit 505 stores that item name in the item candidate storage unit 13 (step S63). For example, in FIG. 9, the only item name associated with the independent word “office category name” (事業所区分名) is “office category” (事業所区分), whose match degree is “partial match”, so this item name is stored in the item candidate storage unit 13. This is because, even when the match degree is “partial match”, there is no need to narrow down the item names if only one has been identified.

一方、ステップＳ４９において特定された項目名が１つではないと判断された場合（ステップＳ５３：Ｎｏルート）、絞り込み部５０５は、ステップＳ４９において特定された項目名と一致する項目名が、入力データ格納部３に格納されている項目名リストに含まれるか判断する（ステップＳ５５）。含まれる場合（ステップＳ５５：Ｙｅｓルート）、絞り込み部５０５は、入力データ格納部３に格納されている項目名リストに含まれる項目名を、項目候補格納部１３に格納する（ステップＳ６３）。例えば図９において、自立語「事業所名」には項目名「事業所名称」及び「自事業所名」が対応付けられているが、入力データ格納部３に格納されている項目名リストに項目名として「事業所コード」、「事業所名称」、「事業所区分」、「受入番号」、「受払番号」、「相手事業所コード」及び「相手事業所名称」が含まれる場合には、項目名「事業所名称」が項目候補格納部１３に格納される。診断対象の設計書において既に定義されている項目名であれば、当然項目候補として特定されるべきであるからである。なお、ステップＳ６３において項目候補格納部１３に格納されたとしても、後で説明するステップＳ９の処理において、入力データ格納部３に格納されている項目名リストに含まれる項目名であると判断されるので、不整合項目として特定されることはない。 On the other hand, when it is determined that more than one item name was identified in step S49 (step S53: No route), the narrowing-down unit 505 determines whether an item name matching any of the item names identified in step S49 is included in the item name list stored in the input data storage unit 3 (step S55). If so (step S55: Yes route), the narrowing-down unit 505 stores the item name that is included in the item name list stored in the input data storage unit 3 in the item candidate storage unit 13 (step S63). For example, in FIG. 9, the independent word “office name” (事業所名) is associated with the item names “office name (formal)” (事業所名称) and “own office name” (自事業所名). If the item name list stored in the input data storage unit 3 contains the item names “office code”, “office name (formal)”, “office category”, “acceptance number”, “receipt/payment number”, “partner office code”, and “partner office name (formal)”, the item name “office name (formal)” is stored in the item candidate storage unit 13. This is because an item name that is already defined in the design document under diagnosis should naturally be identified as an item candidate. Note that even if such an item name is stored in the item candidate storage unit 13 in step S63, it is never identified as an inconsistent item, because the processing of step S9 described later determines that it is included in the item name list stored in the input data storage unit 3.

一方、含まれない場合（ステップＳ５５：Ｎｏルート）、絞り込み部５０５は、ステップＳ４９において特定された類似設計書ＩＤに、同一の類似設計書ＩＤが複数含まれるか判断する（ステップＳ５７）。同一の類似設計書ＩＤが複数含まれる場合（ステップＳ５７：Ｙｅｓルート）、絞り込み部５０５の出現部分比較部５０５３は、出現部分比較処理を実施する（ステップＳ５９）。例えば図９において、自立語「受払状態」に対応付けられている類似設計書ＩＤは６行目の「Ｇ００２」、７行目の「Ｇ００３」、８行目の「Ｇ００４」及び９行目の「Ｇ００４」であるので、自立語「受払状態」については出現部分比較処理を実施する。なお、便宜上、ステップＳ４９において特定された項目名のうち、ステップＳ５７においてＩＤが複数含まれると判断された類似設計書ＩＤに対応する項目名をＩＤ重複項目名と呼ぶ。図９の例であれば、「受払元状態」及び「総受払状態」がＩＤ重複項目名に該当する。出現部分比較処理は、ＩＤ重複項目名のうち項目候補として最も相応しい項目名を選び出すための処理である。 On the other hand, when not included (step S55: No route), the narrowing down unit 505 determines whether or not the similar design document ID specified in step S49 includes a plurality of identical similar design document IDs (step S57). When a plurality of the same similar design document IDs are included (step S57: Yes route), the appearance part comparison unit 5053 of the narrowing-down unit 505 performs an appearance part comparison process (step S59). For example, in FIG. 9, the similar design document ID associated with the independent word “payment state” is “G002” on the 6th line, “G003” on the 7th line, “G004” on the 8th line, and 9th line. Since it is “G004”, the appearance partial comparison process is performed for the independent word “payment / payment state”. For convenience, among the item names identified in step S49, the item name corresponding to the similar design document ID determined to include a plurality of IDs in step S57 is referred to as an ID duplicate item name. In the example of FIG. 9, “payment / payment source state” and “total payment / payment state” correspond to ID duplication item names. The appearance part comparison process is a process for selecting an item name that is most suitable as an item candidate from among ID duplicated item names.

ここで、出現部分比較処理について、図１２乃至図１７を用いて詳細に説明する。まず、出現部分比較部５０５３は、ステップＳ５７においてＩＤが複数含まれると判断された類似設計書ＩＤのうち未処理の類似設計書ＩＤ（以下、処理対象の類似設計書ＩＤと呼ぶ）を１つ特定する（図１２：ステップＳ８１）。そして、出現部分比較部５０５３は、処理対象の類似設計書ＩＤに対応する項目名リストを類似設計書格納部１１から特定し、当該項目名リストから未処理の項目名を１つ特定する（ステップＳ８３）。また、ステップＳ８３においては、未処理の項目名に対応する項目ＩＤについても類似設計書格納部１１から特定する。 The appearance portion comparison process is now described in detail with reference to FIGS. 12 to 17. First, the appearance portion comparison unit 5053 identifies one unprocessed similar design document ID (hereinafter, the similar design document ID to be processed) from among the similar design document IDs determined in step S57 to occur more than once (FIG. 12: step S81). The appearance portion comparison unit 5053 then identifies, from the similar design document storage unit 11, the item name list corresponding to the similar design document ID to be processed, and identifies one unprocessed item name from that list (step S83). In step S83, the item ID corresponding to the unprocessed item name is also identified from the similar design document storage unit 11.

そして、出現部分比較部５０５３は、ステップＳ８３において特定された項目名と一致する項目名が、入力データ格納部３に格納されている項目名リストに含まれるか判断する（ステップＳ８５）。含まれない場合（ステップＳ８５：Ｎｏルート）、ステップＳ８９の処理に移行する。 Then, the appearance portion comparison unit 5053 determines whether or not the item name that matches the item name specified in step S83 is included in the item name list stored in the input data storage unit 3 (step S85). If not included (step S85: No route), the process proceeds to step S89.

一方、含まれる場合（ステップＳ８５：Ｙｅｓルート）、出現部分比較部５０５３は、ステップＳ８３において特定された項目名及び項目ＩＤを共通項目データ格納部５０７に格納する（ステップＳ８７）。 On the other hand, when it is included (step S85: Yes route), the appearance portion comparison unit 5053 stores the item name and item ID specified in step S83 in the common item data storage unit 507 (step S87).

図１３に、共通項目データ格納部５０７に格納されるデータの一例を示す。図１３の例では、項目ＩＤの列と、項目名の列とが含まれる。 FIG. 13 shows an example of data stored in the common item data storage unit 507. In the example of FIG. 13, an item ID column and an item name column are included.

図１２の説明に戻り、出現部分比較部５０５３は、全ての項目名について処理したか判断する（ステップＳ８９）。全ての項目名について処理していない場合（ステップＳ８９：Ｎｏルート）、次の項目名について処理を実施するため、ステップＳ８３の処理に戻る。一方、全ての項目名について処理した場合（ステップＳ８９：Ｙｅｓルート）、処理は端子Ｂを介してステップＳ９１（図１４）の処理に移行する。 Returning to the description of FIG. 12, the appearance portion comparison unit 5053 determines whether all item names have been processed (step S89). If all the item names have not been processed (step S89: No route), the process returns to step S83 in order to execute the process for the next item name. On the other hand, if all item names have been processed (step S89: Yes route), the process proceeds to the process of step S91 (FIG. 14) via the terminal B.

図１４の説明に移行して、出現部分比較部５０５３は、処理対象の類似設計書ＩＤに対応する自立語リストを類似設計書格納部１１から特定し、当該自立語リストから未処理の自立語を１つ特定する（ステップＳ９１）。また、ステップＳ９１においては、特定された自立語に対して自立語ＩＤを割り当てる。自立語ＩＤは、例えばステップＳ９１において特定された自立語が何番目に特定された自立語であるかに従い割り当てられる番号である。 Turning to FIG. 14, the appearance portion comparison unit 5053 identifies, from the similar design document storage unit 11, the independent word list corresponding to the similar design document ID to be processed, and identifies one unprocessed independent word from that list (step S91). In step S91, an independent word ID is also assigned to the identified independent word. The independent word ID is, for example, a number assigned according to the order in which the independent word was identified in step S91.

そして、出現部分比較部５０５３は、ステップＳ９１において特定された自立語と共通項目データ格納部５０７に格納されている項目名の各々との表記の類似度を算出する（ステップＳ９３）。ステップＳ９３における類似度の算出方法は、ステップＳ３７と同様である。また、出現部分比較部５０５３は、算出された類似度のうち最大の類似度が所定の閾値以上であるかを判断する（ステップＳ９５）。例えば、ステップＳ９１において特定された自立語が「相手事業所名」であり、共通項目データ格納部５０７には図１３に示すデータが格納されている場合を考える。そして、ステップＳ９３において、項目名「相手事業所コード」について算出された類似度が０．７１であり、項目名「相手事業所名称」について算出された類似度が０．９２であるとする。この場合、ステップＳ９５においては、項目名「相手事業所名称」について算出された類似度０．９２が、所定の閾値以上であるかを判断する。 Then, the appearance portion comparison unit 5053 calculates the similarity of the notation between the independent word specified in step S91 and each item name stored in the common item data storage unit 507 (step S93). The method of calculating the similarity in step S93 is the same as that in step S37. In addition, the appearance portion comparison unit 5053 determines whether or not the maximum similarity among the calculated similarities is equal to or greater than a predetermined threshold (step S95). For example, let us consider a case where the independent word specified in step S91 is “partner business name” and the common item data storage unit 507 stores the data shown in FIG. In step S93, it is assumed that the similarity calculated for the item name “partner office code” is 0.71, and the similarity calculated for the item name “partner office name” is 0.92. In this case, in step S95, it is determined whether the similarity 0.92 calculated for the item name “partner office name” is equal to or greater than a predetermined threshold.

そして、算出された類似度のうち最大の類似度が所定の閾値以上である場合（ステップＳ９５：Ｙｅｓルート）、出現部分比較部５０５３は、最大の類似度が算出された自立語及び項目名を含むデータを類似部分データ格納部５０９に格納する（ステップＳ９７）。一方、算出された類似度のうち最大の類似度が所定の閾値未満である場合（ステップＳ９５：Ｎｏルート）、出現部分比較部５０５３は、最大の類似度が算出された自立語を含むデータを類似部分データ格納部５０９に格納する（ステップＳ９９）。 When the maximum of the calculated similarities is equal to or greater than the predetermined threshold (step S95: Yes route), the appearance portion comparison unit 5053 stores data containing the independent word and the item name for which the maximum similarity was calculated in the similar part data storage unit 509 (step S97). On the other hand, when the maximum of the calculated similarities is less than the predetermined threshold (step S95: No route), the appearance portion comparison unit 5053 stores data containing only the independent word, without an item name, in the similar part data storage unit 509 (step S99).

図１５に、類似部分データ格納部５０９に格納されるデータの一例を示す。図１５の例では、自立語ＩＤの列と、自立語の列と、項目ＩＤの列と、項目名の列と、類似部分ＩＤの列とが含まれる。図１５の例であれば、ステップＳ９７において４、６、１８及び２１行目のデータが格納され、ステップＳ９９においてそれ以外の行のデータが格納される。なお、後で説明するステップＳ１０３の処理を実施していない段階では、類似部分ＩＤの列にはデータは格納されていない。 FIG. 15 shows an example of data stored in the similar partial data storage unit 509. The example of FIG. 15 includes an independent word ID column, an independent word column, an item ID column, an item name column, and a similar part ID column. In the example of FIG. 15, the data of the fourth, sixth, 18th and 21st lines are stored in step S97, and the data of the other lines are stored in step S99. It should be noted that no data is stored in the similar part ID column when the process of step S103 described later is not performed.

図１４の説明に戻り、出現部分比較部５０５３は、全ての自立語について処理したか判断する（ステップＳ１０１）。全ての自立語について処理していない場合（ステップＳ１０１：Ｎｏルート）、次の自立語について処理を実施するため、ステップＳ９１の処理に戻る。一方、全ての自立語について処理した場合（ステップＳ１０１：Ｙｅｓルート）、出現部分比較部５０５３は、類似部分データ格納部５０９において類似部分を特定し、類似部分ＩＤを類似部分の自立語に対応付けて類似部分データ格納部５０９に格納する（ステップＳ１０３）。そして、処理は端子Ｃを介してステップＳ１０５（図１６）に移行する。 Returning to the description of FIG. 14, the appearance portion comparison unit 5053 determines whether or not all independent words have been processed (step S101). When all the independent words are not processed (step S101: No route), the process returns to the process of step S91 in order to execute the process for the next independent word. On the other hand, when all the independent words are processed (step S101: Yes route), the appearance portion comparison unit 5053 identifies a similar portion in the similar portion data storage unit 509, and associates the similar portion ID with the independent portion of the similar portion. And stored in the similar partial data storage unit 509 (step S103). Then, the process proceeds to step S105 (FIG. 16) via the terminal C.

ステップＳ１０３においては、例えば、連続するｍ個の自立語に対して、項目名がｎ個以上対応付けて格納されているかにより判断する。例えば図１５において、ｍ＝５且つｎ＝２と設定した場合を考える。まず、自立語ＩＤ「４」である自立語から自立語ＩＤ「６」である自立語までは３つの自立語があり、それらの自立語に対応付けられている項目名は「相手事業所名称」及び「相手事業所コード」の２つである。従って、自立語ＩＤ「４」から「６」までの部分は類似部分として特定される。また、自立語ＩＤ「６」である自立語から自立語ＩＤ「１８」である自立語までは１３の自立語があるため、それらの自立語に「相手事業所コード」及び「相手事業所名称」という２つの項目名が対応付けられていても、類似部分として特定されることはない。また、自立語ＩＤ「１８」である自立語から自立語ＩＤ「２１」である自立語までは４つの自立語があり、それらの自立語に対応付けられている項目名は「相手事業所名称」及び「相手事業所コード」の２つである。従って、自立語ＩＤ「１８」から「２１」までの部分が類似部分として特定される。なお、ステップＳ１０３において、類似部分を多く特定したいのであれば、ｍは大きく又はｎは小さくなるように設定し、類似部分をあまり特定したくないのであれば、ｍは小さく又はｎは大きくなるように予め設定すればよい。 In step S103, the determination is made based on, for example, whether n or more item names are stored in association with m consecutive independent words. Consider the case in FIG. 15 where m = 5 and n = 2. First, there are three independent words from the independent word with independent word ID “4” to the one with independent word ID “6”, and two item names, “partner office name (formal)” (相手事業所名称) and “partner office code” (相手事業所コード), are associated with them. The portion from independent word ID “4” to “6” is therefore identified as a similar part. Next, there are 13 independent words from independent word ID “6” to independent word ID “18”, so even though the two item names “partner office code” and “partner office name (formal)” are associated with them, this portion is not identified as a similar part. Finally, there are four independent words from independent word ID “18” to independent word ID “21”, and the two item names “partner office name (formal)” and “partner office code” are associated with them. The portion from independent word ID “18” to “21” is therefore identified as a similar part. If many similar parts are to be identified in step S103, m should be set larger or n smaller in advance; if few similar parts are to be identified, m should be set smaller or n larger.
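
One possible reading of the m/n rule of step S103 can be sketched in Python as follows. The description does not fix an exact algorithm, so the windowing strategy below (start a window at each independent word that has an item name, keep the words with item names that fall within m consecutive words, and merge overlapping results) is an assumption, as are the function name and the `(independent word ID, item name or None)` row layout.

```python
def find_similar_parts(rows, m=5, n=2):
    """Return [(first independent word ID, last independent word ID), ...].

    rows: (independent word ID, item name or None) pairs sorted by ID, i.e. a
    simplified view of the similar part data before step S103 (assumed layout).
    A span qualifies when it covers at most m consecutive independent words
    and at least n distinct item names are associated with them.
    """
    hits = [(word_id, name) for word_id, name in rows if name is not None]
    parts = []
    for i, (start, _) in enumerate(hits):
        # Words with an item name that lie within m words of the window start.
        window = [(wid, name) for wid, name in hits[i:] if wid - start + 1 <= m]
        if len({name for _, name in window}) >= n:
            parts.append((start, window[-1][0]))
    # Merge overlapping spans so each similar part is reported once.
    merged = []
    for start, end in parts:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Applied to data shaped like the FIG. 15 example (item names on independent word IDs 4, 6, 18 and 21 only) with m = 5 and n = 2, this sketch reports the spans 4 to 6 and 18 to 21 and skips the 13-word stretch from 6 to 18.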

図１６の説明に移行して、出現部分比較部５０５３は、類似部分データ格納部５０９に格納されている未処理の自立語を１つ特定する（図１６：ステップＳ１０５）。そして、出現部分比較部５０５３は、特定された自立語と、ＩＤ重複項目名の各々との表記の類似度を算出する（ステップＳ１０７）。ステップＳ１０７における類似度の算出方法は、ステップＳ３７と同様である。 Shifting to the description of FIG. 16, the appearance portion comparison unit 5053 identifies one unprocessed independent word stored in the similar portion data storage unit 509 (FIG. 16: step S105). Then, the appearance part comparison unit 5053 calculates the similarity of the notation between the identified independent word and each ID duplicate item name (step S107). The method for calculating the similarity in step S107 is the same as in step S37.

そして、出現部分比較部５０５３は、算出された類似度のうち最大の類似度が所定の閾値以上であるか判断する（ステップＳ１０９）。所定の閾値未満である場合（ステップＳ１０９：Ｎｏルート）、ステップＳ１１５の処理に移行する。一方、所定の閾値以上である場合（ステップＳ１０９：Ｙｅｓルート）、出現部分比較部５０５３は、類似部分データ格納部５０９において、ステップＳ１０５において特定された自立語と前後の類似部分との距離を算出する（ステップＳ１１１）。また、出現部分比較部５０５３は、算出された距離のうち最小のものと、ステップＳ１０７において算出された類似度のうち最大の類似度の算出に用いたＩＤ重複項目名とを含む距離データを生成し、距離データ格納部５１１に格納する（ステップＳ１１３）。なお、ステップＳ１０５において特定された自立語が類似部分に含まれる場合には、距離は「０」を設定する。 Then, the appearance portion comparison unit 5053 determines whether the maximum similarity among the calculated similarities is equal to or greater than a predetermined threshold (step S109). When it is less than the predetermined threshold (step S109: No route), the process proceeds to step S115. On the other hand, when it is equal to or greater than the predetermined threshold (step S109: Yes route), the appearance portion comparison unit 5053 calculates the distance between the independent word specified in step S105 and the preceding and following similar portions in the similar portion data storage unit 509. (Step S111). In addition, the appearance portion comparison unit 5053 generates distance data including the smallest one of the calculated distances and the ID duplication item name used for calculating the maximum similarity among the similarities calculated in step S107. And stored in the distance data storage unit 511 (step S113). In addition, when the independent word specified in step S105 is included in the similar part, the distance is set to “0”.

ステップＳ１０５乃至Ｓ１１３の処理について、具体例を用いて簡単に説明する。例えば類似部分データが図１５に示すデータであり、ステップＳ１０５において特定された自立語が自立語ＩＤ「８」の「受払元状態」であり、またＩＤ重複項目名が図９の８行目及び９行目の項目名（すなわち、「受払元状態」及び「総受払状態」）である場合を考える。そして、ステップＳ１０７においてＩＤ重複項目名「受払元状態」について算出された類似度が１．０であり、ＩＤ重複項目名「総受払状態」について算出された類似度が０．８であるとする。この場合、ステップＳ１０９においては、類似度１．０が所定の閾値以上であるか判断するが、類似度１．０は完全一致であることを示しており所定の閾値以上である。従って、ステップＳ１１１においては、自立語ＩＤ「８」である自立語「受払元状態」から類似部分ＳＰ１に含まれる自立語のうち最も距離が近い自立語である「相手事業所コード」（自立語ＩＤ「６」）までの距離「２」と、自立語ＩＤ「８」である自立語「受払元状態」から類似部分ＳＰ２に含まれる自立語のうち最も距離が近い自立語である「相手事業所名」（自立語ＩＤ「１８」）までの距離「１０」とが算出される。そして、ステップＳ１１３においては、距離「２」とＩＤ重複項目名「受払元状態」とを含む距離データを生成し、距離データ格納部５１１に格納する。 The processing of steps S105 to S113 is briefly explained using a concrete example. Suppose that the similar part data is the data shown in FIG. 15, that the independent word identified in step S105 is “receipt/payment source state” (受払元状態) with independent word ID “8”, and that the ID duplicate item names are the item names in rows 8 and 9 of FIG. 9, that is, “receipt/payment source state” and “total receipt/payment state” (総受払状態). Suppose further that, in step S107, the similarity calculated for the ID duplicate item name “receipt/payment source state” is 1.0 and the similarity calculated for the ID duplicate item name “total receipt/payment state” is 0.8. In this case, step S109 determines whether the similarity 1.0 is equal to or greater than the predetermined threshold; a similarity of 1.0 indicates a complete match and therefore satisfies the threshold. Accordingly, step S111 calculates a distance of “2” from the independent word “receipt/payment source state” (independent word ID “8”) to “partner office code” (相手事業所コード, independent word ID “6”), the nearest independent word in similar part SP1, and a distance of “10” from the same independent word to “partner office name” (相手事業所名, independent word ID “18”), the nearest independent word in similar part SP2. In step S113, distance data containing the distance “2” and the ID duplicate item name “receipt/payment source state” is generated and stored in the distance data storage unit 511.
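
The distance calculation of step S111 can be illustrated with a short Python sketch. It assumes that distance is measured as the difference between independent word IDs and that each similar part is given as a `(first ID, last ID)` pair; both the representation and the function names are hypothetical.

```python
def distance_to_part(word_id, part):
    """Distance from an independent word to one similar part (first ID, last ID)."""
    start, end = part
    if start <= word_id <= end:
        return 0  # the word lies inside the similar part
    return min(abs(word_id - start), abs(word_id - end))


def nearest_part_distance(word_id, parts):
    """Smallest distance from the word to any similar part (the value kept in S113)."""
    return min(distance_to_part(word_id, part) for part in parts)
```

For the example above, with SP1 spanning IDs 4 to 6 and SP2 spanning IDs 18 to 21, the independent word with ID 8 is at distance 2 from SP1 and 10 from SP2, so 2 is the value recorded.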

図１７に、距離データ格納部５１１に格納されるデータの一例を示す。図１７の例では、自立語ＩＤの列と、項目ＩＤの列と、項目名の列と、類似部分との距離の列とが含まれる。 FIG. 17 shows an example of data stored in the distance data storage unit 511. In the example of FIG. 17, an independent word ID column, an item ID column, an item name column, and a distance column between similar parts are included.

図１６の説明に戻り、出現部分比較部５０５３は、全ての自立語について処理したか判断する（ステップＳ１１５）。全ての自立語について処理していない場合（ステップＳ１１５：Ｎｏルート）、次の自立語について処理を実施するため、ステップＳ１０５の処理に戻る。 Returning to the description of FIG. 16, the appearance portion comparison unit 5053 determines whether or not all independent words have been processed (step S115). If all the independent words have not been processed (step S115: No route), the process returns to step S105 in order to execute the process for the next independent word.

一方、全ての自立語について処理した場合（ステップＳ１１５：Ｙｅｓルート）、出現部分比較部５０５３は、距離データ格納部５１１において、最小の距離に対応付けられているＩＤ重複項目名以外のＩＤ重複項目名を特定し、特定されたＩＤ重複項目名についてのデータを一致項目データ格納部５０３から削除する（ステップＳ１１７）。図１７の例であれば、最小の距離である「２」に対応付けられている項目名は「受払元状態」であるため、「受払元状態」以外の項目名である「総受払状態」が特定される。そして、図９のデータが一致項目データ格納部５０３に格納されており、ＩＤ重複項目名が８行目の「受払元状態」及び９行目の「総受払状態」である場合、９行目の「総受払状態」についてのデータが削除される。 On the other hand, when all independent words have been processed (step S115: Yes route), the appearance portion comparison unit 5053 identifies, in the distance data storage unit 511, the ID duplicate item names other than the one associated with the minimum distance, and deletes the data for the identified ID duplicate item names from the matching item data storage unit 503 (step S117). In the example of FIG. 17, the item name associated with the minimum distance “2” is “receipt/payment source state” (受払元状態), so the other item name, “total receipt/payment state” (総受払状態), is identified. If the data of FIG. 9 is stored in the matching item data storage unit 503 and the ID duplicate item names are “receipt/payment source state” in row 8 and “total receipt/payment state” in row 9, the data for “total receipt/payment state” in row 9 is deleted.
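
The selection made in step S117 can be sketched as below. The row layout, the function name, and the handling of an empty distance table are assumptions; the contents of FIG. 17 are not reproduced in this text, so any sample rows fed to the function are hypothetical.

```python
def duplicate_items_to_delete(distance_rows, duplicate_item_names):
    """Return the ID duplicate item names whose matching rows should be deleted.

    distance_rows: dicts with "item_name" and "distance" keys (assumed layout
    of the distance data). The item name with the smallest distance is kept
    and every other ID duplicate item name is returned for deletion.
    """
    if not distance_rows:
        return []  # assumption: nothing is deleted when no distance was recorded
    best = min(distance_rows, key=lambda row: row["distance"])
    return [name for name in duplicate_item_names if name != best["item_name"]]
```

Given a row for 受払元状態 at distance 2 and any larger distance for 総受払状態, the function returns only 総受払状態, matching the deletion described for row 9 of FIG. 9.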

図１６の説明に戻り、出現部分比較部５０５３は、全ての類似設計書ＩＤについて処理したか判断する（ステップＳ１１９）。全ての類似設計書ＩＤについて処理していない場合（ステップＳ１１９：Ｎｏルート）、次の類似設計書ＩＤについて処理を実施するため、処理は端子Ｄを介してステップＳ８１に戻る。一方、全ての類似設計書ＩＤについて処理した場合（ステップＳ１１９：Ｙｅｓルート）、元の処理に戻る。 Returning to the description of FIG. 16, the appearance portion comparison unit 5053 determines whether or not all similar design document IDs have been processed (step S119). When all the similar design document IDs have not been processed (step S119: No route), the process returns to step S81 via the terminal D in order to perform the process for the next similar design document ID. On the other hand, when all the similar design document IDs are processed (step S119: Yes route), the process returns to the original process.

以上のような処理を実施することにより、１つの自立語に対して、１つの類似設計書の項目名リストから複数の項目名が抽出された場合であっても、項目候補として最も相応しい項目名を特定できるようになる。 By performing the processing described above, the item name most suitable as an item candidate can be identified even when multiple item names have been extracted for a single independent word from the item name list of a single similar design document.

図１０の説明に戻り、同一の類似設計書ＩＤが複数含まれない場合（ステップＳ５７：Ｎｏルート）、絞り込み部５０５の項目定義比較部５０５１は、項目定義比較処理を実施する（ステップＳ６１）。例えば図９においては、自立語「受払状態区分」に対応付けられている類似設計書ＩＤは「Ｇ００２」及び「Ｇ００３」であり、同一の類似設計書ＩＤは複数含まれないので、自立語「受払状態区分」については項目定義比較処理を実施する。 Returning to the explanation of FIG. 10, when a plurality of identical similar design document IDs are not included (step S57: No route), the item definition comparing unit 5051 of the narrowing down unit 505 performs an item definition comparing process (step S61). For example, in FIG. 9, the similar design document IDs associated with the independent word “payment / payment state classification” are “G002” and “G003”, and a plurality of identical similar design document IDs are not included. Item definition comparison processing is performed for “payment / payment status classification”.

図１８を用いて、項目定義比較処理について詳細に説明する。まず、項目定義比較部５０５１は、ステップＳ４９において特定された類似設計書ＩＤから未処理の類似設計書ＩＤを１つ特定する（図１８：ステップＳ１２１）。そして、特定された類似設計書ＩＤに対応する項目名リストを類似設計書格納部１１から特定し、特定された項目名リストと入力データ格納部３に格納されている項目名リストとの類似度を算出し、メインメモリ等の記憶装置に格納する（ステップＳ１２３）。ステップＳ１２３においては、例えば両項目名リストに共通に含まれる項目名の数を類似度とする。例えば、ステップＳ１２３において特定された項目名リストに項目名「事業所コード」、「受入番号」、「受払番号」、「受入状態」、「受入状態区分」、「業務区分」及び「在庫区分」が含まれており、入力データ格納部３に格納されている項目名リストに「事業所コード」、「事業所名称」、「事業所区分」、「受入番号」、「受払番号」、「相手事業所コード」及び「相手事業所名称」が含まれる場合には、項目名「事業所コード」、「受入番号」及び「受払番号」が共通するため、類似度は「３」となる。 The item definition comparison process is described in detail with reference to FIG. 18. First, the item definition comparison unit 5051 identifies one unprocessed similar design document ID from among the similar design document IDs identified in step S49 (FIG. 18: step S121). It then identifies, from the similar design document storage unit 11, the item name list corresponding to the identified similar design document ID, calculates the similarity between that item name list and the item name list stored in the input data storage unit 3, and stores the result in a storage device such as main memory (step S123). In step S123, the similarity is, for example, the number of item names contained in both item name lists. For example, suppose the item name list identified in step S123 contains the item names “office code” (事業所コード), “acceptance number” (受入番号), “receipt/payment number” (受払番号), “acceptance state” (受入状態), “acceptance state category” (受入状態区分), “business category” (業務区分), and “inventory category” (在庫区分), and the item name list stored in the input data storage unit 3 contains “office code”, “office name (formal)” (事業所名称), “office category” (事業所区分), “acceptance number”, “receipt/payment number”, “partner office code” (相手事業所コード), and “partner office name (formal)” (相手事業所名称). The item names “office code”, “acceptance number”, and “receipt/payment number” are then common to both lists, so the similarity is “3”.

そして、項目定義比較部５０５１は、全ての類似設計書ＩＤについて処理したか判断する（ステップＳ１２５）。全ての類似設計書ＩＤについて処理していない場合（ステップＳ１２５：Ｎｏルート）、次の類似設計書ＩＤについて処理を実施するため、処理はステップＳ１２１に戻る。 Then, the item definition comparison unit 5051 determines whether all similar design document IDs have been processed (step S125). When all the similar design document IDs are not processed (step S125: No route), the process returns to step S121 in order to perform the process for the next similar design document ID.

一方、全ての類似設計書ＩＤについて処理した場合（ステップＳ１２５：Ｙｅｓルート）、項目定義比較部５０５１は、ステップＳ４９において特定された項目名のうち、ステップＳ１２３において算出された類似度が最大となる類似設計書ＩＤに対応する項目名を項目候補格納部１３に格納し、元の処理に戻る（ステップＳ１２７）。例えば図９において、類似設計書ＩＤが「Ｇ００２」である類似設計書の項目名リストについて算出された類似度が５であり、類似設計書ＩＤが「Ｇ００３」である類似設計書の項目名リストについて算出された類似度が３である場合、自立語「受払状態区分」に対応付けられている項目名「受払先状態区分」及び「受入状態区分」のうち、項目名「受払先状態区分」を項目候補格納部１３に格納する。 On the other hand, when all similar design document IDs have been processed (step S125: Yes route), the item definition comparison unit 5051 stores in the item candidate storage unit 13, from among the item names identified in step S49, the item name corresponding to the similar design document ID with the highest similarity calculated in step S123, and returns to the original processing (step S127). For example, in FIG. 9, if the similarity calculated for the item name list of the similar design document with ID “G002” is 5 and the similarity calculated for the item name list of the similar design document with ID “G003” is 3, then of the item names “receipt/payment destination state category” (受払先状態区分) and “acceptance state category” (受入状態区分) associated with the independent word “receipt/payment state category” (受払状態区分), the item name “receipt/payment destination state category” is stored in the item candidate storage unit 13.
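
Steps S121 to S127 can be sketched in Python as follows. The similarity is the count of shared item names, as the description suggests for step S123; the function names, the `(similar design document ID, item name)` candidate layout, and the tie-breaking behaviour of `max` (the first candidate wins) are assumptions.

```python
def item_list_similarity(similar_items, input_items):
    """Step S123: number of item names contained in both item name lists."""
    return len(set(similar_items) & set(input_items))


def pick_item_by_definition(candidates, similar_docs, input_items):
    """Step S127: item name from the similar design document whose list is closest.

    candidates: (similar design document ID, item name) pairs for one
    independent word, each from a different similar design document.
    similar_docs: {similar design document ID: [item name, ...]}.
    """
    _, best_name = max(
        candidates,
        key=lambda c: item_list_similarity(similar_docs[c[0]], input_items),
    )
    return best_name
```

With the seven-item list from the step S123 example, `item_list_similarity` returns 3. If a second, hypothetical list shares five item names with the input list, the item name taken from that second document is the one selected.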

With this processing, even when item names for a single independent word are extracted from the item name lists of multiple similar design documents, the item name list most suitable as the source of the item name can be determined.

Returning to the description of FIG. 10, after step S63 or step S61, the narrowing unit 505 determines whether all independent words have been processed (step S65). If not all independent words have been processed (step S65: No route), the process returns to step S49 to handle the next independent word. If all independent words have been processed (step S65: Yes route), the process returns to the calling process.

With this processing, the item names that should become item candidates can be extracted appropriately.

Returning to the description of FIG. 6, the first inconsistent item specifying unit 15 identifies, from among the item names stored in the item candidate storage unit 13, the inconsistent item names, that is, the item names not contained in the item name list stored in the input data storage unit 3, and stores them in the first inconsistent item storage unit 17 (step S9). For example, when the data shown in FIG. 11 is stored in the item candidate storage unit 13 and the data shown in FIG. 5(b) is stored in the input data storage unit 3, the item names "receipt/payment destination status" and "receipt/payment destination status category" are stored in the first inconsistent item storage unit 17.
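Step S9 is in effect a set difference between the item candidates and the defined item names. The following sketch uses a hypothetical function name and invented sample data (the contents of FIG. 11 and FIG. 5(b) are not reproduced here), and it preserves the order of the candidates.

```python
def first_inconsistent_items(candidates, defined_items):
    """Candidates that are missing from the input item name list (step S9)."""
    defined = set(defined_items)
    return [name for name in candidates if name not in defined]


# Invented sample data for illustration only.
candidates = [
    "office code",
    "receipt/payment destination status",
    "receipt/payment destination status category",
]
defined_items = ["office code", "office name", "acceptance number"]

missing = first_inconsistent_items(candidates, defined_items)
print(missing)
```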

The output unit 19 then outputs the item names stored in the first inconsistent item storage unit 17 (step S11). When a display device or a printing device is connected to the inconsistency detection device, the item names may be shown on the display device or sent to the printing device. They may also be output to another computer connected via a network.

As described above, because the item name lists of similar design documents are used, the item names that ought to be defined can be identified with high accuracy, while independent words that are not item names are prevented from being defined as item names.

[Embodiment 2]
Next, a process is described that detects item names exhibiting a second type of inconsistency, namely item names that are defined in the item definition but do not appear in the processing details, and that identifies the position at which data concerning each such item name should be supplemented.

FIG. 19 shows a functional block diagram of the correction support device according to the second embodiment. The correction support device according to the second embodiment includes an input processing unit 101, an input data storage unit 103, a design document DB 105, a similar design document specifying unit 107, a similar design document storage unit 113, a second inconsistent item specifying unit 109, a second inconsistent item storage unit 111, a supplementary sentence extraction unit 115, a supplementary sentence storage unit 117, a narrowing processing unit 119, a narrowing processing result storage unit 121, a supplement position determination unit 123, an output data storage unit 125, and an output unit 127.

The input processing unit 101 accepts input of the processing details and the item definition of the design document to be diagnosed and stores them in the input data storage unit 103. The similar design document specifying unit 107 extracts a similar design document from the design document DB 105 based on the data stored in the input data storage unit 103 and the design document DB 105, and stores it in the similar design document storage unit 113. The second inconsistent item specifying unit 109 extracts inconsistent item names using the data stored in the input data storage unit 103 and stores them in the second inconsistent item storage unit 111. The supplementary sentence extraction unit 115 uses the inconsistent item names stored in the second inconsistent item storage unit 111 and the processing details stored in the similar design document storage unit 113 to perform, among other things, a process of extracting the data that should be supplemented into the input processing details, and stores the result in the supplementary sentence storage unit 117. The narrowing processing unit 119 performs the range narrowing process described later using the data stored in the input data storage unit 103, the similar design document storage unit 113, and the supplementary sentence storage unit 117, and stores the result in the narrowing processing result storage unit 121. The supplement position determination unit 123 performs the supplement position determination process described later using the data stored in the input data storage unit 103, the similar design document storage unit 113, and the narrowing processing result storage unit 121, and stores the result in the output data storage unit 125. The output unit 127 outputs the data stored in the output data storage unit 125 to a display device.

The narrowing processing unit 119 includes a line number list storage unit 1191, a corresponding position data storage unit 1192, a line number list generation unit 1193, a corresponding position specifying unit 1194, and a range specifying unit 1195. The line number list generation unit 1193 uses the data stored in the input data storage unit 103 and the similar design document storage unit 113 to perform, among other things, the process of generating the line number lists described later, and stores the result in the line number list storage unit 1191. The corresponding position specifying unit 1194 uses the data stored in the line number list storage unit 1191 and the input data storage unit 103 to perform, among other things, the corresponding position specifying process described later, and stores the result in the corresponding position data storage unit 1192. The range specifying unit 1195 performs processing based on the data stored in the supplementary sentence storage unit 117 and the corresponding position data storage unit 1192, and stores the result in the narrowing processing result storage unit 121.

The supplement position determination unit 123 includes a similarity storage unit 1231, a window generation unit 1232, and a search unit 1233. The window generation unit 1232 performs, among other things, the process of generating the window data described later, and outputs the generated window data to the search unit 1233. The search unit 1233 performs processing using the window data received from the window generation unit 1232 and the data stored in the input data storage unit 103, and stores the result in the output data storage unit 125.

FIGS. 20 and 21 show an example of the data stored in the design document DB 105. FIG. 20 shows an example of the item definition data stored in the design document DB 105. In the example of FIG. 20, an item definition document ID, a file name, and a process name are stored, together with a number column, an item name column, an item type column, a justification (alignment) column, and an I/O (input/output) column. FIG. 21 shows an example of the processing details data stored in the design document DB 105. In the example of FIG. 21, a processing details definition document ID and a file name are stored, together with a line number column and a column of sentences describing the content of the process. The design document DB 105 stores an item definition and processing details as a set for each process. For example, FIGS. 20 and 21 are the item definition and the processing details for a process called "stock receipt registration".

Next, the processing performed by the correction support device shown in FIG. 19 will be described with reference to FIGS. 22 to 41. First, the input processing unit 101 of the correction support device accepts input of the processing details and the item definition of the design document to be diagnosed, and stores them in the input data storage unit 103 (FIG. 22: step S201). Here, the input may be accepted directly from the user via an input device such as a keyboard or a mouse. Alternatively, a list of design documents may be presented to the user, and the processing details and item definition of the design document the user selects may be obtained from a storage device or the like (not shown) or from another computer connected via a network. The item definition and processing details may also be extracted from the design document DB 105.

FIGS. 23 and 24 show an example of the data stored in the input data storage unit 103. FIG. 23 shows an example of the item definition data stored in the input data storage unit 103; its format is the same as that of the data shown in FIG. 20. FIG. 24 shows an example of the processing details data stored in the input data storage unit 103; its format is the same as that of the data shown in FIG. 21.

Returning to the description of FIG. 22, the second inconsistent item specifying unit 109 identifies, from among the item names defined in the item definition stored in the input data storage unit 103 (hereinafter, the input item definition), the inconsistent item names that do not appear in the processing details stored in the input data storage unit 103 (hereinafter, the input processing details), and stores them in the second inconsistent item storage unit 111 (step S203). An inconsistent item name is an item name that is defined in the item definition but for which the data describing its processing is missing from the processing details.
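A minimal sketch of step S203 follows. It assumes that the processing details are held as (line number, sentence) pairs and that "appears in" means a plain substring match; both the data layout and the function name are assumptions made for illustration, and the sample sentences are invented.

```python
def second_inconsistent_items(item_names, detail_lines):
    """Item names that are defined but appear in no sentence of the processing details."""
    return [
        name for name in item_names
        if not any(name in sentence for _, sentence in detail_lines)
    ]


# Invented input item definition and input processing details.
item_names = ["office code", "operation date", "inventory category"]
detail_lines = [
    (1, "Read the office code from the request."),
    (2, "Set the operation date to the current date."),
]

print(second_inconsistent_items(item_names, detail_lines))
```

Plain substring matching would also treat "office code" as present in a sentence that only mentions "partner office code"; a real implementation would need some form of token-level matching.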

The similar design document specifying unit 107 then reads from the design document DB 105 the processing details with the highest similarity to the input processing details, together with the item definition corresponding to those processing details, and stores them in the similar design document storage unit 113 (step S205).

In step S205, for example, processing similar to that performed in steps S3 and S5 described in the first embodiment may be performed. In this case, the similar design document specifying unit 107 extracts independent words from the input processing details to generate an independent word list, and likewise generates an independent word list for each set of processing details stored in the design document DB 105. For each set of processing details stored in the design document DB 105, the similar design document specifying unit 107 also calculates the similarity between its independent word list and the independent word list of the input processing details. It then reads from the design document DB 105 the processing details with the highest calculated similarity and the item definition corresponding to them, and stores both in the similar design document storage unit 113. Hereinafter, the processing details stored in the similar design document storage unit 113 are called the similar processing details, the item definition stored there is called the similar item definition, and the set of the similar processing details and the similar item definition is called the similar design document.
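The selection in step S205 can be sketched as follows. The similarity measure itself is defined with steps S3 and S5 of the first embodiment and is not restated here, so the Jaccard coefficient below is only a stand-in chosen for illustration. The independent word lists are assumed to have been extracted already (morphological analysis is out of scope), and the document IDs and words are invented.

```python
def jaccard(words_a, words_b):
    """Jaccard coefficient of two word lists (stand-in similarity measure)."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0


def most_similar(input_words, word_lists_by_doc):
    """ID of the stored processing details most similar to the input."""
    return max(word_lists_by_doc,
               key=lambda doc_id: jaccard(input_words, word_lists_by_doc[doc_id]))


# Invented independent word lists.
input_words = ["stock", "receipt", "register", "office"]
word_lists_by_doc = {
    "D001": ["stock", "receipt", "register", "date"],
    "D002": ["order", "cancel", "office"],
}

print(most_similar(input_words, word_lists_by_doc))
```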

Next, the supplementary sentence extraction unit 115 searches the similar processing details for the inconsistent item names stored in the second inconsistent item storage unit 111, and stores each sentence found to contain an inconsistent item name, together with the line number of that sentence, in the supplementary sentence storage unit 117 (step S207).
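Step S207 can be sketched as a filter over the similar processing details. The (line number, sentence) layout, the substring match, and the sample sentences are assumptions for illustration; only the line number 331 is borrowed from the example used later in the description.

```python
def extract_supplementary(inconsistent_names, similar_lines):
    """Sentences of the similar processing details that mention an inconsistent item name."""
    return [
        (line_no, sentence)
        for line_no, sentence in similar_lines
        if any(name in sentence for name in inconsistent_names)
    ]


# Invented similar processing details.
similar_lines = [
    (330, "Check the stock quantity."),
    (331, "Set the inventory category to 1."),
    (332, "Write the acceptance record."),
]

print(extract_supplementary(["inventory category"], similar_lines))
```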

FIG. 25 shows an example of the data stored in the supplementary sentence storage unit 117. In the example of FIG. 25, a line number and the sentence to be supplemented are stored.

Returning to the description of FIG. 22, the narrowing processing unit 119 performs the range narrowing process (step S209). The range narrowing process will be described with reference to FIGS. 26 to 34.

First, the line number list generation unit 1193 of the narrowing processing unit 119 extracts the item names contained in both the input item definition and the similar item definition as common item names, and stores them in a storage device such as a main memory (FIG. 26: step S221). For example, when the similar item definition is data such as that shown in FIG. 20 and the input item definition is data such as that shown in FIG. 23, data such as that shown in FIG. 27 is extracted as the common item names.

Then, for each of the input processing details and the similar processing details, the line number list generation unit 1193 extracts the common item names obtained in step S221 together with the line numbers of the sentences containing them to generate a line number list, and stores the list in the line number list storage unit 1191 (step S223).

FIG. 28 shows an example of the data stored in the line number list storage unit 1191. In the example of FIG. 28, the ID of the processing details definition document, line numbers, and item names are stored. The data on the left side of FIG. 28 is an example of the line number list generated for the input processing details, and the data on the right side is an example of the line number list generated for the similar processing details.
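Steps S221 and S223 can be sketched as below. The function names, the (line number, sentence) layout, the substring match, and all sample data are assumptions for illustration; the same `line_number_list` function would be applied once to the input processing details and once to the similar processing details.

```python
def common_item_names(input_definition, similar_definition):
    """Item names defined in both item definitions, in input-definition order (S221)."""
    similar = set(similar_definition)
    return [name for name in input_definition if name in similar]


def line_number_list(detail_lines, common_names):
    """(line number, item name) pairs for every common item name found in a sentence (S223)."""
    return [
        (line_no, name)
        for line_no, sentence in detail_lines
        for name in common_names
        if name in sentence
    ]


# Invented item definitions and processing details.
input_definition = ["operation date", "office code", "partner category", "office name"]
similar_definition = ["operation date", "office code", "partner category", "stock quantity"]
detail_lines = [
    (10, "Set the operation date."),
    (11, "Look up the office code."),
    (12, "Write the log."),
]

common = common_item_names(input_definition, similar_definition)
print(line_number_list(detail_lines, common))
```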

Returning to the description of FIG. 26, the corresponding position specifying unit 1194 extracts, from each of the line number list for the input processing details and the line number list for the similar processing details, the item name that follows the most recently extracted item name (that is, the item name with the smallest line number among those after the most recently extracted one), and stores it in a storage device such as a main memory (step S225). When step S225 is performed for the first time, the item name with the smallest line number is extracted. Hereinafter, the item name extracted from the line number list for the input processing details is called the first item name, and the item name extracted from the line number list for the similar processing details is called the second item name.

The corresponding position specifying unit 1194 then determines whether both a first item name and a second item name were extracted (step S227). For example, when the item name with the largest line number has already been processed, there is no further item name to process, so no item name is extracted. When both the first item name and the second item name were extracted (step S227: Yes route), the corresponding position specifying unit 1194 determines whether the first item name and the second item name are identical (step S229). When they are identical (step S229: Yes route), the corresponding position specifying unit 1194 identifies the line number corresponding to the item name in the line number list for the input processing details and in the line number list for the similar processing details, and stores both line numbers in the corresponding position data storage unit 1192 in association with the item name (step S231).

FIG. 29 shows an example of the data stored in the corresponding position data storage unit 1192. The example of FIG. 29 includes an order column, an item name column, a column for the line number in the input processing details, and a column for the line number in the similar processing details.

Returning to the description of FIG. 26, when the first item name and the second item name are not identical (step S229: No route), the corresponding position specifying unit 1194 performs the corresponding position specifying process (step S233). The corresponding position specifying process will be described with reference to FIG. 30.

First, the corresponding position specifying unit 1194 identifies the item name most recently stored in the corresponding position data storage unit 1192 (FIG. 30: step S241). Next, it extracts from the input item definition the item names defined after the item name identified in step S241 (that is, the item names whose "number" is larger than that of the identified item name) and generates a candidate list containing the extracted item names (step S243). It then identifies, among the unprocessed item names in the candidate list, the item name defined earliest (that is, the item name with the smallest "number") (step S245).

The corresponding position specifying unit 1194 then determines whether the item name identified in step S245 is identical to either the first item name or the second item name (step S247). When it is identical to neither (step S247: No route), the process returns to step S245 to handle the next item name.

When, on the other hand, the item name is identical to either the first item name or the second item name (step S247: Yes route), the corresponding position specifying unit 1194 determines whether the item name identified in step S245 is identical to the first item name (step S249).

When the item name is identical to the first item name (step S249: Yes route), the corresponding position specifying unit 1194 searches the line number list for the similar processing details, among the item names whose line numbers are larger than that of the second item name, for an item name identical to the first item name, and extracts that item name and its line number (step S251).

When, on the other hand, the item name is not identical to the first item name (that is, it is identical to the second item name) (step S249: No route), the corresponding position specifying unit 1194 searches the line number list for the input processing details, among the item names whose line numbers are larger than that of the first item name, for an item name identical to the second item name, and extracts that item name and its line number (step S253).

The corresponding position specifying unit 1194 then stores the item name identified in step S245 in the corresponding position data storage unit 1192 in association with the line numbers corresponding to that item name (step S255). When step S249 determined that the item name identified in step S245 is identical to the first item name, the line number corresponding to the first item name in the line number list for the input processing details and the line number extracted in step S251 are stored. When step S249 determined that the item name identified in step S245 is not identical to the first item name (that is, it is identical to the second item name), the line number corresponding to the second item name in the line number list for the similar processing details and the line number extracted in step S253 are stored. The process then returns to the calling process.
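The loop of steps S225 to S255 can be sketched as follows. This is one possible reading rather than the definitive procedure: the text does not say what happens when neither name is in the candidate list or when the search in S251/S253 finds nothing, nor where extraction resumes after a search, so the sketch makes the assumptions noted in the comments. The function name and data layout are hypothetical. In the sample data, only the line numbers 294, 321, 337, and 421 come from the example of FIG. 33 discussed in the description; the rest are invented.

```python
def align_positions(input_list, similar_list, definition_order):
    """Pair up line numbers of common item names in the two line number lists.

    input_list / similar_list: [(line_no, item_name)] sorted by line number.
    definition_order: item names in the order of the input item definition.
    Returns [(item_name, input_line, similar_line)].
    """
    rank = {name: k for k, name in enumerate(definition_order)}
    pairs = []
    i = j = 0
    while i < len(input_list) and j < len(similar_list):        # S225 / S227
        line_a, first = input_list[i]
        line_b, second = similar_list[j]
        if first == second:                                      # S229
            pairs.append((first, line_a, line_b))                # S231
            i, j = i + 1, j + 1
            continue
        # S241-S247: among names defined after the last stored name,
        # take the earliest one that equals the first or second item name.
        start = rank[pairs[-1][0]] + 1 if pairs else 0
        chosen = next((n for n in definition_order[start:]
                       if n in (first, second)), None)
        if chosen is None:
            i, j = i + 1, j + 1       # assumption: skip both names
        elif chosen == first:                                    # S249 Yes -> S251
            k = next((k for k in range(j + 1, len(similar_list))
                      if similar_list[k][1] == first), None)
            if k is not None:
                pairs.append((first, line_a, similar_list[k][0]))  # S255
                j = k + 1             # assumption: resume after the match
            i += 1
        else:                                                    # S249 No -> S253
            k = next((k for k in range(i + 1, len(input_list))
                      if input_list[k][1] == second), None)
            if k is not None:
                pairs.append((second, input_list[k][0], line_b))   # S255
                i = k + 1
            j += 1
    return pairs


definition_order = ["operation date", "office code", "partner category"]
input_list = [(270, "operation date"), (275, "office code"),
              (290, "operation date"), (294, "office code"),
              (421, "partner category")]
similar_list = [(300, "operation date"), (305, "office code"),
                (318, "operation date"), (321, "partner category"),
                (337, "office code")]

pairs = align_positions(input_list, similar_list, definition_order)
print(pairs)
```

Because "office code" is defined before "partner category", the mismatch at input line 294 and similar line 321 is resolved by pairing line 294 with line 337.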

Returning to the description of FIG. 26, when an item name could not be extracted from one of the line number lists (step S227: No route), the corresponding position specifying unit 1194 identifies the range into which the sentence to be supplemented should be inserted, based on the line number of that sentence and the data stored in the corresponding position data storage unit 1192, and stores the result in the narrowing processing result storage unit 121 (step S235). The process then returns to the calling process.

FIG. 31 shows an example of the data stored in the narrowing processing result storage unit 121. The example of FIG. 31 includes an order column, an item name column, a column for the line number in the input processing details, and a column for the line number in the similar processing details.

The processing performed in step S235 is now described. Suppose, as shown in FIG. 25, that the line number of the sentence to be supplemented is "331", and that the corresponding position data storage unit 1192 holds data such as that shown in FIG. 29. It then follows that, in the similar processing details, the sentence to be supplemented lies between a line containing the common item name "operation date" and a line containing the common item name "office code". In the present embodiment, therefore, as shown in FIG. 32, the position at which the missing data should be supplemented is presumed to lie within the range of the input processing details bounded by the line containing the common item name "operation date" and the line containing the common item name "office code". This relies on the property that processing details similar to the input processing details (the similar processing details) also resemble them in the order in which item names appear.
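The range determination of step S235 can be sketched as a lookup of the two stored positions that bracket the sentence to be supplemented. The function name and the tuple layout are hypothetical, and the sample rows are invented except for the line numbers 331, 294, and 337 taken from the description; `None` marks an open-ended side, which is an assumption for the case where no bracketing row exists.

```python
def insertion_range(pairs, supplementary_line):
    """Input-side line range presumed to contain the insertion point (S235).

    pairs: [(item_name, input_line, similar_line)] from the corresponding position data.
    supplementary_line: line number of the sentence in the similar processing details.
    """
    before = [p for p in pairs if p[2] < supplementary_line]
    after = [p for p in pairs if p[2] > supplementary_line]
    lower = max(before, key=lambda p: p[2])[1] if before else None
    upper = min(after, key=lambda p: p[2])[1] if after else None
    return lower, upper


# Invented corresponding position data (item name, input line, similar line).
pairs = [
    ("operation date", 270, 300),
    ("office code", 275, 305),
    ("operation date", 290, 318),
    ("office code", 294, 337),
]

print(insertion_range(pairs, 331))
```

With this data the sentence at line 331 falls between the rows for "operation date" and "office code", so the search is confined to input lines 290 to 294.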

Some additional explanation of the processing performed by the corresponding position specifying unit 1194 follows. FIG. 33 schematically shows the association of line numbers performed by the corresponding position specifying unit 1194. The corresponding position specifying unit 1194 first checks, in order from the top (that is, in ascending order of line number), whether the item names in the two line number lists are identical. In the example of FIG. 33, it first determines whether "operation date" in the line number list for the input processing details is identical to "operation date" in the line number list for the similar processing details. Since they are identical, data such as that in the first row is stored in the corresponding position data storage unit 1192. The same processing is performed for the next item name, "office code", and for the "operation date" that appears after it.

As the next item names, "office code" is identified from the line number list for the input processing details, and "partner category" is identified from the line number list for the similar processing details. Since these item names are not identical, the corresponding position specifying unit 1194 performs the corresponding position specifying process described above. For example, when the input item definition is data such as that shown in FIG. 23, "office code" is defined before "partner category", so "office code" is given priority. In this case, "office code" is located among the item names positioned after "partner category" in the line number list for the similar processing details, and the line number "337" corresponding to it is associated with the line number "294" in the line number list for the input processing details.

Thus, in the present embodiment, when the item names extracted from the two line number lists are not identical, priority is given to the item name that is defined earlier in the input item definition.

Consider, by contrast, what happens in the example of FIG. 33 if the order in which item names are defined in the input item definition is ignored and the item name "partner category" in the line number list for the similar processing details is given priority. In this case, "partner category" is located among the item names positioned after "office code" in the line number list for the input processing details, and the line number "421" corresponding to it is associated with the line number "321" in the line number list for the similar processing details. The resulting corresponding position data is, for example, as shown in FIG. 34. According to this data, the sentence to be supplemented should be inserted between line number "421" and line number "439" of the input processing details. This, however, is an incorrect narrowing.

To reduce the likelihood of such incorrect narrowing, the present embodiment associates line numbers in a way that takes into account the order in which item names are defined in the input item definition.
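The priority rule described above can be sketched roughly as follows. This is only an illustrative sketch, not the embodiment's actual corresponding position specifying process: the function name `align_positions`, the tuple layout, the English item names, and the handling of names with no counterpart are all assumptions, and line numbers 278 and 310 are made-up values added to give the example a first matching pair.

```python
def align_positions(input_rows, similar_rows, definition_order):
    """Pair line numbers of common item names in two line number lists.

    input_rows / similar_rows: lists of (line_no, item_name) in order of
    appearance. definition_order: item names in the order in which the
    input item definition defines them (hypothetical representation).
    """
    rank = {name: i for i, name in enumerate(definition_order)}
    pairs = []
    i = j = 0
    while i < len(input_rows) and j < len(similar_rows):
        in_line, in_name = input_rows[i]
        sim_line, sim_name = similar_rows[j]
        if in_name == sim_name:
            pairs.append((in_name, in_line, sim_line))
            i, j = i + 1, j + 1
        elif rank.get(in_name, len(rank)) <= rank.get(sim_name, len(rank)):
            # The name from the input list is defined earlier: look for it
            # further down the similar list.
            k = next((k for k in range(j + 1, len(similar_rows))
                      if similar_rows[k][1] == in_name), None)
            if k is None:
                i += 1  # assumption: skip names with no counterpart
            else:
                pairs.append((in_name, in_line, similar_rows[k][0]))
                i, j = i + 1, k + 1
        else:
            # The name from the similar list is defined earlier: look for it
            # further down the input list.
            k = next((k for k in range(i + 1, len(input_rows))
                      if input_rows[k][1] == sim_name), None)
            if k is None:
                j += 1  # assumption: skip names with no counterpart
            else:
                pairs.append((sim_name, input_rows[k][0], sim_line))
                i, j = k + 1, j + 1
    return pairs


input_rows = [(278, "partner code"), (294, "office code"), (421, "partner category")]
similar_rows = [(310, "partner code"), (321, "partner category"), (337, "office code")]
order = ["partner code", "office code", "partner category"]
print(align_positions(input_rows, similar_rows, order))
```

With the definition order given here, line 294 is paired with line 337, as in the text. If "partner category" were ranked ahead of "office code" instead, the same sketch would pair line 421 with line 321, which is the incorrect narrowing the text warns about.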

By performing the range narrowing process as described above, the range of the input processing details in which data is presumed to be missing can be narrowed down with high accuracy.

Returning to the description of FIG. 22, the replenishment position determination unit 123 performs a replenishment position determination process (step S211). First, a first replenishment position determination process is described with reference to FIGS. 35 to 37.

First, the window generation unit 1232 of the replenishment position determination unit 123 sets the window width n to 1 (FIG. 35: step S261). The window generation unit 1232 then reads the line number of the sentence to be supplemented from the supplementary sentence storage unit 117, extracts the n lines before and the n lines after that sentence from the similar processing details to generate window data, and stores the window data in a storage device such as the main memory (step S263).

FIG. 36 shows an example of window data. In the example of FIG. 36, a window ID, line numbers, and the data of the n sentences before and after the sentence to be supplemented (here, n = 3) are stored.

Returning to the description of FIG. 35, the search unit 1233 searches the range of the input processing details in which data is presumed to be missing (the range identified in step S235) for the position with the highest similarity to the generated window data (step S265).

The process performed in step S265 is described with reference to FIG. 37. FIG. 37 shows the range from line number "278" to line number "294" as the range of the input processing details in which data is presumed to be missing. For example, when n = 3, the similarity between the data in range 371 and the window data is calculated first. The similarity may be, for example, the number of independent words that appear both in range 371 and in the window data. The similarity is calculated in the same way for ranges 372 and 373, and the calculation ends once it has been performed up to range 374. A predetermined position (for example, the center) within the range with the highest calculated similarity is then identified.
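The search in step S265 can be illustrated with a short sketch. It assumes that each line has already been reduced to a set of independent words (the morphological analysis is outside the sketch), that the compared ranges are contiguous blocks of a fixed number of lines, and that the "predetermined position" is the middle line of the block. The function name `best_position` and the `block_size` parameter are hypothetical.

```python
def best_position(range_lines, window_words, block_size):
    """Slide a block over the candidate range and score each block.

    range_lines: list of (line_no, set_of_independent_words) for the range
    presumed to be missing data. window_words: set of independent words in
    the window data. Returns (score, center_line_no) for the best block,
    or None if the range is shorter than block_size.
    """
    best = None
    for start in range(len(range_lines) - block_size + 1):
        block = range_lines[start:start + block_size]
        block_words = set().union(*(words for _, words in block))
        # Similarity = number of independent words shared with the window.
        score = len(block_words & window_words)
        if best is None or score > best[0]:
            center_line = block[len(block) // 2][0]
            best = (score, center_line)
    return best


lines = [(278, {"a"}), (279, {"b", "c"}), (280, {"c", "d"}), (281, {"e"})]
print(best_position(lines, {"c", "d", "x"}, block_size=2))
```

Ties are resolved in favor of the earliest block here; the text does not say how ties are handled, so that choice is also an assumption.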

Returning to the description of FIG. 35, the search unit 1233 stores the data on the position with the highest similarity, identified by the search in step S265, together with the corresponding similarity, in the similarity storage unit 1231 in association with the window width n (step S267).

Next, the search unit 1233 increments the window width n by 1 (step S269) and determines whether n has exceeded the upper limit (step S271). The range from which window data is extracted should preferably not exceed the range identified in step S235. Accordingly, if, for example, the data shown in FIG. 31 is stored in the narrowing processing result storage unit 121 and the line number of the sentence to be supplemented is "331", the upper limit of the window width is "6".

If the window width n has not exceeded the upper limit (step S271: No route), the process returns to step S263. If n has exceeded the upper limit (step S271: Yes route), the search unit 1233 extracts, from the similarities stored in the similarity storage unit 1231, the data on the position corresponding to the highest similarity and stores it in the output data storage unit 125 (step S273). The process then returns to the calling process.

FIG. 38 shows an example of the data stored in the output data storage unit 125. In the example of FIG. 38, the line numbers of the location to be supplemented and the data of the sentences at that location are stored. The example of FIG. 38 indicates that data should be inserted between the sentence at line number "289" and the sentence at line number "290" of the input processing details.

FIG. 39 illustrates the idea behind the process performed in step S273. In the first replenishment position determination process, the window width n is varied as n = 1, 2, 3, ..., and the maximum similarity is calculated for each window width. The position corresponding to the largest of these similarities (in the example of FIG. 39, the similarity for n = 5) is then identified as the position where data is missing.
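The overall loop of the first replenishment position determination process (steps S261 to S273) might look like the following sketch. It rests on several assumptions not stated in the text: lines are pre-tokenized into sets of independent words, the window excludes the sentence to be supplemented itself, a window of width n is compared against blocks of 2n consecutive lines, and the reported position is the gap in the middle of the best block. The name `find_gap` and its parameters are hypothetical.

```python
def find_gap(similar, target_idx, candidate, max_width):
    """Try window widths n = 1..max_width and keep the best-scoring gap.

    similar: list of word sets, one per line of the similar processing
    details. target_idx: index of the sentence to be supplemented.
    candidate: list of (line_no, word_set) for the narrowed-down range of
    the input processing details.
    Returns (score, (line_before_gap, line_after_gap)).
    """
    best = (-1, None)
    for n in range(1, max_width + 1):
        before = similar[max(0, target_idx - n):target_idx]
        after = similar[target_idx + 1:target_idx + 1 + n]
        window_words = set().union(*before, *after)
        # Compare the window with every block of 2n lines in the range.
        for start in range(len(candidate) - 2 * n + 1):
            block = candidate[start:start + 2 * n]
            block_words = set().union(*(words for _, words in block))
            score = len(block_words & window_words)
            if score > best[0]:
                best = (score, (block[n - 1][0], block[n][0]))
    return best


similar = [{"h"}, {"a", "b"}, {"MISSING"}, {"c", "d"}, {"z"}]
candidate = [(288, {"q"}), (289, {"a", "b"}), (290, {"c", "d"}), (291, {"r"})]
print(find_gap(similar, 2, candidate, max_width=2))
```

Because the score is a raw count of shared words, wider windows can score higher simply by containing more words. Whether and how the embodiment compensates for this is not stated, so the sketch leaves the count unnormalized.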

By performing the above processing, the most probable position where data is missing can be identified within the narrowed-down range.

Instead of the first replenishment position determination process, a second replenishment position determination process, described below, may be performed.

The second replenishment position determination process is described with reference to FIGS. 40 and 41. First, the window generation unit 1232 of the replenishment position determination unit 123 sets the upper width and the lower width of the window to 1 (FIG. 40: step S281). The window generation unit 1232 then generates window data based on the set upper and lower widths and stores it in a storage device such as the main memory (step S283).

Next, the window generation unit 1232 determines whether the top sentence of the window data (that is, the sentence with the smallest line number) contains data representing a chapter or section heading (step S285). If it does not (step S285: No route), the window generation unit 1232 increases the upper width of the window by 1 (step S287) and returns to step S283.

If the top sentence does contain data representing a chapter or section heading (step S285: Yes route), the window generation unit 1232 determines whether the bottom sentence of the window data (that is, the sentence with the largest line number) contains data representing a chapter or section heading (step S289). If it does not (step S289: No route), the window generation unit 1232 increases the lower width of the window by 1 (step S291) and returns to step S289.

If the bottom sentence does contain data representing a chapter or section heading (step S289: Yes route), the window generation unit 1232 generates window data with the set upper and lower widths and stores it in a storage device such as the main memory (step S293). The search unit 1233 then searches the input processing details for the position with the highest similarity to the generated window data (step S295) and stores the data on that position in the output data storage unit 125. The process performed in step S295 is the same as that performed in step S265. The process then returns to the calling process.
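The window expansion in steps S281 to S293 can be sketched as below. The heading pattern `HEADING`, the function name `heading_window`, and the behavior at the start or end of the document are assumptions; the text only says that the upper and lower widths grow until the top and bottom sentences contain chapter or section heading data.

```python
import re

# Hypothetical heading pattern such as "2. Input check" or "2.1 Input check".
HEADING = re.compile(r"^\d+(\.\d+)*\.?\s+\S")


def heading_window(lines, target_idx):
    """Grow the window around lines[target_idx] until both ends are headings.

    Returns (upper_width, lower_width, window_lines). A trailing heading
    that belongs to the next section is dropped from the returned window.
    """
    upper = 1
    while target_idx - upper > 0 and not HEADING.match(lines[target_idx - upper]):
        upper += 1  # counterpart of step S287
    lower = 1
    while (target_idx + lower < len(lines) - 1
           and not HEADING.match(lines[target_idx + lower])):
        lower += 1  # counterpart of step S291
    top = max(0, target_idx - upper)
    bottom = min(len(lines) - 1, target_idx + lower)
    window = lines[top:bottom + 1]
    if bottom > target_idx and HEADING.match(window[-1]):
        window = window[:-1]
    return upper, lower, window


doc = ["1. Intro", "x", "2. Input check", "check code", "TARGET",
       "a", "b", "c", "3. Output", "y"]
print(heading_window(doc, 4))
```

In this example the upper width comes out as 2 and the lower width as 4, and the next section's heading is removed from the window so that only the top line is a heading.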

FIG. 41 shows an example of the window data generated by the second replenishment position determination process. In the example of FIG. 41, a window ID, line numbers, and the data of the sentences before and after the sentence to be supplemented (here, an upper width of 2 and a lower width of 4) are stored. In the example of FIG. 41, the bottom sentence of the window data has been deleted, so only the top sentence contains data representing a chapter or section heading.

Unlike general documents, the processing details created in the UI process are written in relatively short sentences, and their content changes rapidly from one sentence to the next. If the window is too wide, the window data contains data on a variety of processing contents, which dilutes its distinguishing features and makes it difficult to narrow down candidate positions. If the window is too narrow, there is too little data to serve as a clue, and incorrect positions are identified more often. Therefore, rather than choosing the window width arbitrarily, using a coherent unit of content such as a chapter or section as the window data allows the probable position of the missing data to be identified with high accuracy.

Returning to the description of FIG. 22, the output unit 127 outputs the data stored in the supplementary sentence storage unit 117 and the output data storage unit 125 to the display device (step S215). The user then checks the sentence to be supplemented and the position at which to insert it, and responds as needed, for example by correcting the data of the input processing details.

By performing the above processing, the position at which data missing from the processing details should be supplemented can be identified with high accuracy, which reduces the cost of correcting the processing details.

One embodiment of the present technology has been described above, but the present technology is not limited to it. For example, the functional block diagrams of the inconsistency detection device and the correction support device described above do not necessarily correspond to actual program module configurations.

The table configurations described above are examples, and other configurations may be used. In the processing flows as well, the order of steps may be changed as long as the result is unchanged, and steps may be executed in parallel.

For example, in the example described above, the item definition comparison process in step S61 is performed after the appearance part comparison process in step S59, but the same result is obtained if the order is reversed.

In the present embodiment, the processing details and item definitions of the design document to be diagnosed are received in step S1, and an independent word list and an item name list are generated from them in step S3. Alternatively, the independent word list and item name list of the design document to be diagnosed may be read from the design document DB 7, and the processing from step S5 onward may then be performed.

In the corresponding position specifying process described above (step S233), a candidate list is generated each time the process is performed. Alternatively, a priority list may be prepared in advance by removing, from the item names in the input item definition, those that are not common item names as well as the inconsistent item names, and this list may be used in place of the candidate list.

In the example described above, a similar design document is used. However, the processing may instead use a design document other than a similar one, for example a design document specified by user input.

The example described above applies the present technology to design documents created in the UI process of system development, but the present embodiment can be applied to any document that contains text and item definitions defining the item names that appear in that text.

The inconsistency detection device and the correction support device described above are computer devices in which, as shown in FIG. 42, a memory 2501, a CPU 2503, a hard disk drive (HDD) 2505, a display control unit 2507 connected to a display device 2509, a drive device 2513 for a removable disk 2511, an input device 2515, and a communication control unit 2517 for connecting to a network are connected by a bus 2519. An operating system (OS) and an application program for carrying out the processing of this embodiment are stored in the HDD 2505 and are read from the HDD 2505 into the memory 2501 when executed by the CPU 2503. As needed, the CPU 2503 controls the display control unit 2507, the communication control unit 2517, and the drive device 2513 to have them perform the required operations. Data being processed is stored in the memory 2501 and, if necessary, in the HDD 2505. In this embodiment of the present technology, the application program for carrying out the processing described above is stored and distributed on the computer-readable removable disk 2511 and installed from the drive device 2513 onto the HDD 2505. It may also be installed onto the HDD 2505 via a network such as the Internet and the communication control unit 2517. Such a computer device realizes the various functions described above through the organic cooperation of hardware such as the CPU 2503 and the memory 2501 with the OS and the necessary application programs.

The embodiments described above can be summarized as follows.

The inconsistency detection device according to the present embodiment includes: (A) a document database that stores, for each document, an independent word group extracted from the text of the document and an item name group extracted from item definitions containing the item names that appear in that text and the definitions of those item names; (B) a data storage unit that stores the independent word group and item name group of a first document to be diagnosed; (C) similar document specifying means that calculates the similarity between the independent word group of each document stored in the document database and the independent word group stored in the data storage unit, identifies documents whose similarity is equal to or greater than a predetermined threshold as similar documents, and extracts the independent word groups and item name groups of the identified similar documents from the document database; (D) item candidate extraction means that extracts, from the item name groups extracted by the similar document specifying means, item names that match a first independent word, that is, an independent word included in the independent word group stored in the data storage unit; and (E) inconsistent item specifying means that identifies, among the item names extracted by the item candidate extraction means, those not included in the item name group stored in the data storage unit as inconsistent item names.

Because item names from the item name groups of similar documents are used in this way, undefined item names can be identified with high accuracy while avoiding the mistake of defining as an item name an independent word that is not actually one. The cost is low because keyword patterns, training examples, and the like for item name extraction need not be prepared for each document to be diagnosed. The approach is also highly versatile, since it applies to any document containing text and item definitions.
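Elements (C) to (E) can be illustrated with a small sketch. It is a simplification under stated assumptions: the similarity between independent word groups is taken to be the Jaccard coefficient, an item name "matches" an independent word only when the two strings are equal, and the document database is a plain list of dictionaries. The names `jaccard`, `detect_inconsistent_items`, and the threshold value are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def detect_inconsistent_items(target_words, target_items, document_db,
                              threshold=0.5):
    """Return item names used in the target text but missing from its
    item definitions, using similar documents as the source of item names.

    target_words / target_items: independent word group and item name
    group of the document to be diagnosed (sets of strings).
    document_db: list of {"words": set, "items": set} entries.
    """
    inconsistent = set()
    for doc in document_db:
        # (C) keep only documents similar to the target document.
        if jaccard(target_words, doc["words"]) < threshold:
            continue
        # (D) item names of the similar document that appear in the target text.
        candidates = {name for name in doc["items"] if name in target_words}
        # (E) candidates that the target's item definitions do not define.
        inconsistent |= candidates - target_items
    return inconsistent


db = [
    {"words": {"customer code", "office code", "check", "display"},
     "items": {"customer code", "office code", "order date"}},
    {"words": {"invoice", "print"}, "items": {"check"}},
]
print(detect_inconsistent_items(
    {"customer code", "office code", "check", "register"},
    {"customer code"}, db))
```

In this example the second document also defines an item name ("check") that occurs in the target text, but it is ignored because the document is not similar. This reflects the point made above that restricting candidates to similar documents avoids treating ordinary independent words as item names.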

The item candidate extraction means described above may include: matching item extraction means that calculates, for each first independent word, the similarity between that word and each item name in the item name groups extracted by the similar document specifying means and, when the calculated similarity is equal to or greater than a first threshold, stores the item name used in the calculation and a matching degree set on the basis of the similarity in a matching item data storage unit in association with the first independent word; and narrowing means that, for each first independent word stored in the matching item data storage unit, identifies one item name as an item candidate from the item names stored in association with that word, based at least on the matching degree. Several item names may be extracted for a single first independent word, but the processing described above identifies the most suitable item name and eliminates unnecessary ones.

The matching item extraction means described above may set a matching degree indicating a partial match when the calculated similarity is equal to or greater than the first threshold and less than a second threshold, and a matching degree indicating an exact match when the calculated similarity is equal to or greater than the second threshold. The narrowing means may then identify as an item candidate, for each first independent word, an item name whose matching degree is an exact match, an item name that is the only item name associated with the first independent word and whose matching degree is a partial match, or an item name that matches any first item name, that is, an item name included in the item name group stored in the data storage unit. In other words, an item name is identified as an item candidate not only when it is an exact match but also when it is a partial match and is the only item name associated with the first independent word, leaving no room for choice. An item name already defined in the item definitions of the first document is naturally identified as an item candidate as well.
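The two-threshold rule and the narrowing conditions can be sketched as follows. The string similarity used here, `difflib.SequenceMatcher.ratio()` from the Python standard library, is only a stand-in, since the text does not specify the measure at this point; the threshold values and the names `matching_degree` and `narrow` are likewise assumptions.

```python
from difflib import SequenceMatcher


def matching_degree(word, item_name, first_threshold=0.6, second_threshold=1.0):
    """Classify a (first independent word, item name) pair.

    Returns "exact", "partial", or None when the similarity is below the
    first threshold (the pair is then not stored at all).
    """
    similarity = SequenceMatcher(None, word, item_name).ratio()
    if similarity >= second_threshold:
        return "exact"
    if similarity >= first_threshold:
        return "partial"
    return None


def narrow(matches, target_items):
    """Select item candidates for one first independent word.

    matches: dict mapping item name -> matching degree for that word.
    target_items: item names already defined in the first document.
    """
    return [name for name, degree in matches.items()
            if degree == "exact"
            or name in target_items
            or (degree == "partial" and len(matches) == 1)]


print(matching_degree("office", "office code"))
print(narrow({"office code": "partial", "office name": "partial"},
             {"office name"}))
```

When several partial matches remain and none is defined in the first document, `narrow` returns an empty list; in the text, such cases are passed on to further narrowing steps.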

The narrowing means described above may further include: appearance part comparison means that, when there are multiple second item names, that is, item names associated with the first independent word and included in the item name group of a second document that is one of the similar documents, identifies one item name from among the second item names based on the distance between (i) a second independent word, that is, an independent word in the independent word group of the second document, whose similarity to a second item name is equal to or greater than a predetermined threshold and (ii) a similar part in the independent word group of the second document, and deletes the data on the item names other than the identified one from the matching item data storage unit; and item definition comparison means that, when there are multiple item name groups containing an item name associated with the first independent word, calculates the similarity between each of those item name groups and the item name group stored in the data storage unit and identifies as the item candidate the item name contained in the item name group with the highest calculated similarity. This makes it possible to identify item candidates appropriately even when they cannot be determined solely by narrowing based on the matching degree and the item name group of the first document.

The appearance part comparison means described above may: identify, among the item names in the item name group of the second document, those that match any first item name and store them in a common item data storage unit; calculate, for each second independent word, the similarity between that word and each item name stored in the common item data storage unit; identify the similar part in the independent word group of the second document based on the calculated similarities, generate data for identifying the similar part, and store it in a similar part data storage unit; calculate, for each second independent word, the similarity between that word and each second item name and, when the largest of the calculated similarities is equal to or greater than a predetermined threshold, use the data for identifying the similar part to calculate the distances between the second independent word and the similar part, identify the first distance, which is the smallest of those distances, and store the second item name used in the similarity calculation and the identified first distance in a distance data storage unit in association with each other; and identify, from the distance data storage unit, the second item name whose associated first distance is smallest and delete the data on the other second item names from the matching item data storage unit. This narrowing rests on the idea that an item name appearing closer to the similar part is more relevant to the first document and therefore more suitable as an item candidate.

The correction support device according to the present embodiment includes: (A) a database that stores, for each process, item definition data defining the item names related to the process and processing detail data defining the content of the process; (B) an item extraction unit that reads first item definition data and the corresponding first processing detail data from the database and extracts, as an inconsistent item name, an item name that is defined in the first item definition data but does not appear in the first processing detail data; (C) a supplementary data specifying unit that, for second processing detail data that is among the processing detail data stored in the database and contains the inconsistent item name, identifies the position at which the inconsistent item name appears in the second processing detail data and stores position information representing that position in a storage device; (D) a corresponding position specifying unit that extracts from the database, as common item names, the item names defined in both the first item definition data and second item definition data corresponding to the second processing detail data, identifies the positions at which each common item name appears in the first processing detail data and in the second processing detail data, and stores position information representing those positions in the storage device in association with the common item name; and (E) a range specifying unit that uses the data stored in the storage device to identify, among the common item names, the common item name appearing immediately before the inconsistent item name in the second processing detail data and the one appearing immediately after it as the immediately preceding item name and the immediately following item name.

With this configuration, the position where data on the inconsistent item name is missing can be narrowed down with high accuracy, which reduces the cost of correction work.

The device may further include: a window generation unit that extracts from the second processing detail data, as window data, the data within a specific range before and after the inconsistent item name, repeating the extraction while varying that range; and a search unit that, for each piece of window data extracted by the window generation unit, searches the range of the first processing detail data between the immediately preceding item name and the immediately following item name for the position with the highest similarity to that window data, and identifies, among the positions found for the respective pieces of window data, the one with the highest similarity as the position where data on the inconsistent item name is missing. In this way, the most probable position where data on the inconsistent item name is missing can be identified within the identified range.

Alternatively, the device may further include: a window generation unit that identifies data representing a boundary between processing contents both before and after the inconsistent item name in the second processing detail data and extracts, as window data, the data in the range between those boundaries; and a search unit that identifies, within the range of the first processing detail data between the immediately preceding item name and the immediately following item name, the position with the highest similarity to the window data extracted by the window generation unit as the position where data on the inconsistent item name is missing. Because the window data then contains the processing content related to the inconsistent item name, the probable position of the missing data can be identified with high accuracy.

The corresponding position specifying unit described above may associate, following the order in which the common item names appear in the first item definition data, the position information indicating where each common item name appears in the first process detail data with the position information indicating where that common item name appears in the second process detail data. This lowers the likelihood of specifying an incorrect range.
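One possible reading of this order-based association is sketched below in Python. It is an interpretation, not the source's stated procedure: the common item names are visited in definition order, and each name is matched to its first occurrence after the previously matched position in both token lists, so that the associated positions never go backwards. The function name, the forward-scan rule, and the sample data are assumptions.

```python
def align_positions(definition_order, tokens1, tokens2):
    """Pair up positions of common item names in two token lists.

    definition_order lists the common item names in the order they appear in
    the first item definition data. Returns (name, pos1, pos2) tuples.
    """
    pairs = []
    next1 = next2 = 0
    for name in definition_order:
        try:
            p1 = tokens1.index(name, next1)
            p2 = tokens2.index(name, next2)
        except ValueError:
            # No occurrence after the previous match in one of the lists: skip.
            continue
        pairs.append((name, p1, p2))
        next1, next2 = p1 + 1, p2 + 1
    return pairs


# Hypothetical data.
definition_order = ["order_id", "customer_id", "result"]
tokens1 = ["read", "order_id", "check", "customer_id", "write", "result"]
tokens2 = ["read", "order_id", "check", "customer_id",
           "validate", "amount", "write", "result"]
pairs = align_positions(definition_order, tokens1, tokens2)
```

Scanning forward only means that a repeated item name earlier in the text cannot be paired with a later position in the other document, which is one way the risk of an incorrect range could be reduced.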

The second process detail data described above may be process detail data that includes the inconsistent item name and has a high degree of similarity to the first process detail data. Using process detail data with a high degree of similarity yields a more reliable result.

The inconsistency detection method according to the present embodiment is executed by a computer that can access a document database and a data storage unit. The document database stores, for each document, an independent word group extracted from the sentences in the document and an item name group extracted from item definitions, each of which contains an item name appearing in those sentences and the definition of that item name. The data storage unit stores the independent word group and the item name group of a first document to be diagnosed. The method includes: (A) a similar document specifying step of calculating the degree of similarity between the independent word group of each document stored in the document database and the independent word group stored in the data storage unit, specifying each document whose similarity is equal to or greater than a predetermined threshold as a similar document, and extracting the independent word group and the item name group of each specified similar document from the document database; (B) an item candidate extraction step of extracting, from the item name groups extracted in the similar document specifying step, each item name that matches a first independent word, that is, an independent word included in the independent word group stored in the data storage unit; and (C) a step of specifying, among the item names extracted in the item candidate extraction step, each item name that is not included in the item name group stored in the data storage unit as an inconsistent item name.
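Steps (A) to (C) can be sketched in Python as follows. This is a simplified sketch, assuming that word groups and item name groups are plain sets of strings, that the similarity is the Jaccard coefficient, and that "matches" means string equality; the source leaves the similarity measure open and elsewhere describes threshold-based matching. The dictionary layout, function names, threshold value, and sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (an assumed similarity measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def detect_inconsistent_items(target_words, target_items, document_db, threshold=0.5):
    """Return item names used in the target's sentences but absent from its item definitions."""
    # (A) Documents whose independent word group is similar enough to the target's.
    similar = [doc for doc in document_db
               if jaccard(doc["words"], target_words) >= threshold]
    # (B) Item names of similar documents that match an independent word of the target.
    candidates = {item for doc in similar for item in doc["items"]
                  if item in target_words}
    # (C) Candidates that the target's own item name group does not contain.
    return sorted(candidates - target_items)


# Hypothetical data: the target mentions "amount" but never defines it.
document_db = [
    {"words": {"order_id", "customer_id", "amount", "check"},
     "items": {"order_id", "customer_id", "amount"}},
    {"words": {"login", "password"}, "items": {"password"}},
]
target_words = {"order_id", "customer_id", "amount", "validate"}
target_items = {"order_id", "customer_id"}
inconsistent = detect_inconsistent_items(target_words, target_items, document_db)
```

In this example only the first stored document passes the similarity threshold, and "amount" is reported because it appears in the target's sentences and in the similar document's item definitions but not in the target's own item definitions.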

The correction support method according to the present embodiment includes: (A) a step of reading first item definition data and the first process detail data corresponding to it from a database that stores, for each process, item definition data defining the item names related to the process and process detail data defining the content of the process, and extracting each item name that is defined in the first item definition data but not included in the first process detail data as an inconsistent item name; (B) a step of specifying, for second process detail data that is among the process detail data stored in the database and includes the inconsistent item name, the position at which the inconsistent item name appears in the second process detail data, and storing position information representing that position in a storage device; (C) a step of extracting from the database, as common item names, the item names defined in both the first item definition data and the second item definition data corresponding to the second process detail data, specifying the positions at which each common item name appears in the first process detail data and in the second process detail data, and storing position information representing those positions in the storage device in association with the common item name; and (D) a step of specifying, using the data stored in the storage device, the common item name that appears immediately before the inconsistent item name in the second process detail data and the common item name that appears immediately after it.
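Steps (A) to (D) of the correction support method can be sketched in Python as follows. The sketch assumes that item definition data is a list of item names and that process detail data is a list of tokens, and it uses only the first occurrence of each name; the function name and sample data are hypothetical, and the source does not prescribe these representations.

```python
def find_surrounding_items(items1, tokens1, items2, tokens2):
    """Map each inconsistent item name to its (preceding, following) common item names."""
    # (A) Defined in the first item definition data but unused in the first detail data.
    inconsistent = [name for name in items1 if name not in tokens1]
    # (C) Item names defined in both item definition data.
    common = set(items1) & set(items2)
    result = {}
    for name in inconsistent:
        if name not in tokens2:
            continue  # the second detail data must contain the inconsistent name
        # (B) Position of the inconsistent item name in the second detail data.
        pos = tokens2.index(name)
        # (D) Nearest common item names before and after that position which
        # also occur in the first detail data.
        prev_name = next((t for t in reversed(tokens2[:pos])
                          if t in common and t in tokens1), None)
        next_name = next((t for t in tokens2[pos + 1:]
                          if t in common and t in tokens1), None)
        result[name] = (prev_name, next_name)
    return result


# Hypothetical data.
items1 = ["order_id", "customer_id", "amount", "result"]
tokens1 = ["read", "order_id", "check", "customer_id", "write", "result"]
items2 = ["order_id", "customer_id", "amount", "result"]
tokens2 = ["read", "order_id", "check", "customer_id",
           "validate", "amount", "write", "result"]
ranges = find_surrounding_items(items1, tokens1, items2, tokens2)
```

The pair returned for "amount" bounds the region of the first process detail data in which the missing description is expected, which narrows the area a reviewer has to inspect.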

A program that causes a computer to perform the processing of the above methods can be created, and the program is stored in a computer-readable storage medium or storage device such as a flexible disk, a CD-ROM, a magneto-optical disk, a semiconductor memory, or a hard disk. Intermediate processing results are temporarily held in a storage device such as main memory.

The following appendices are further disclosed in relation to the embodiments, including the examples described above.

(Appendix 1)
An inconsistency detection device comprising:
a document database that stores, for each document, an independent word group extracted from the sentences in the document and an item name group extracted from item definitions, each of which contains an item name appearing in those sentences and the definition of that item name;
a data storage unit that stores the independent word group and the item name group of a first document to be diagnosed;
similar document specifying means for calculating the degree of similarity between the independent word group of each document stored in the document database and the independent word group stored in the data storage unit, specifying each document whose similarity is equal to or greater than a predetermined threshold as a similar document, and extracting the independent word group and the item name group of each specified similar document from the document database;
item candidate extraction means for extracting, from the item name groups extracted by the similar document specifying means, each item name that matches a first independent word, the first independent word being an independent word included in the independent word group stored in the data storage unit; and
inconsistent item specifying means for specifying, among the item names extracted by the item candidate extraction means, each item name that is not included in the item name group stored in the data storage unit as an inconsistent item name.

(Appendix 2)
The inconsistency detection device according to Appendix 1, wherein the item candidate extraction means includes:
matching item extraction means for calculating, for each first independent word, the degree of similarity between the first independent word and each item name included in the item name groups extracted by the similar document specifying means and, when the calculated similarity is equal to or greater than a first threshold, storing the item name used in the similarity calculation and a matching degree set on the basis of the similarity in a matching item data storage unit in association with the first independent word; and
narrowing-down means for specifying, for each first independent word stored in the matching item data storage unit, one item name as an item candidate from among the item names stored in association with that first independent word, based at least on the matching degree.

(Appendix 3)
The inconsistency detection device according to Appendix 2, wherein
the matching item extraction means sets a matching degree indicating a partial match when the calculated similarity is equal to or greater than the first threshold and less than a second threshold, and sets a matching degree indicating an exact match when the calculated similarity is equal to or greater than the second threshold, and
the narrowing-down means specifies as the item candidate, for each first independent word, an item name whose matching degree is an exact match, an item name that is the only item name associated with the first independent word and whose matching degree is a partial match, or an item name that matches any of the first item names, the first item names being the item names included in the item name group stored in the data storage unit.
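The two-threshold classification and the narrowing rule of Appendix 3 can be sketched in Python as follows. The labels "exact" and "partial", the function names, and the threshold values in the usage lines are assumptions for illustration; the source does not give concrete thresholds or say how the similarity itself is computed, so the similarity is passed in as a number.

```python
def matching_degree(similarity, first_threshold, second_threshold):
    """Classify a similarity value as 'exact', 'partial', or None (no match)."""
    if similarity >= second_threshold:
        return "exact"
    if similarity >= first_threshold:
        return "partial"
    return None


def narrow(matches, first_item_names):
    """Select item candidates for one first independent word.

    matches is a list of (item_name, degree) pairs stored for that word;
    first_item_names is the item name group of the document under diagnosis.
    """
    selected = []
    for item, degree in matches:
        if degree == "exact":
            selected.append(item)          # exact match
        elif degree == "partial" and len(matches) == 1:
            selected.append(item)          # the only associated name, partial match
        elif item in first_item_names:
            selected.append(item)          # already defined in the target document
    return selected


# Hypothetical usage with assumed thresholds 0.6 and 0.9.
degree = matching_degree(0.7, first_threshold=0.6, second_threshold=0.9)
picked = narrow([("amount", "exact"), ("amount_total", "partial")], set())
```

When several partially matching names remain and none is exact or already defined, this rule selects nothing, which is where the further comparison means described in Appendix 4 would apply.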

(Appendix 4)
The inconsistency detection device according to Appendix 3, wherein the narrowing-down means further includes:
appearance portion comparison means for, when there are a plurality of second item names, a second item name being an item name that is associated with the first independent word and is included in the item name group of a second document that is one of the similar documents, specifying one item name from among the second item names based on the distance between (i) a second independent word whose similarity to the second item name is equal to or greater than a predetermined threshold, a second independent word being an independent word included in the independent word group of the second document, and (ii) a similar portion in the independent word group of the second document, and deleting the data on the item names other than the specified item name from the matching item data storage unit; and
item definition comparison means for, when there are a plurality of item name groups that include an item name associated with the first independent word, calculating the degree of similarity between each of those item name groups and the item name group stored in the data storage unit, and specifying the item name included in the item name group with the highest calculated similarity as the item candidate.

(Appendix 5)
The inconsistency detection device according to Appendix 4, wherein the appearance portion comparison means:
specifies, among the item names included in the item name group of the second document, each item name that matches any of the first item names, and stores it in a common item data storage unit;
calculates, for each second independent word, the degree of similarity between the second independent word and each item name stored in the common item data storage unit, specifies the similar portion in the independent word group of the second document based on the calculated similarities, and generates data for specifying the similar portion and stores it in a similar portion data storage unit;
calculates, for each second independent word, the degree of similarity between the second independent word and each second item name and, when the highest of the calculated similarities is equal to or greater than a predetermined threshold, calculates the distance between the second independent word and the similar portion using the data for specifying the similar portion, specifies a first distance that is the smallest of the calculated distances, and stores the second item name used in the similarity calculation and the specified first distance in a distance data storage unit in association with each other; and
specifies, from the distance data storage unit, the second item name associated with the smallest first distance, and deletes the data on the second item names other than the specified second item name from the matching item data storage unit.

(Appendix 6)
An inconsistency detection program for causing a computer to execute:
a similar document specifying step of calculating the degree of similarity between (i) the independent word group of each document stored in a document database that stores, for each document, an independent word group extracted from the sentences in the document and an item name group extracted from item definitions, each of which contains an item name appearing in those sentences and the definition of that item name, and (ii) the independent word group stored in a data storage unit that stores the independent word group and the item name group of a first document to be diagnosed, specifying each document whose similarity is equal to or greater than a predetermined threshold as a similar document, and extracting the independent word group and the item name group of each specified similar document from the document database;
an item candidate extraction step of extracting, from the item name groups extracted in the similar document specifying step, each item name that matches a first independent word, the first independent word being an independent word included in the independent word group stored in the data storage unit; and
a step of specifying, among the item names extracted in the item candidate extraction step, each item name that is not included in the item name group stored in the data storage unit as an inconsistent item name.

(Appendix 7)
An inconsistency detection method executed by a computer that can access a document database, which stores, for each document, an independent word group extracted from the sentences in the document and an item name group extracted from item definitions, each of which contains an item name appearing in those sentences and the definition of that item name, and a data storage unit, which stores the independent word group and the item name group of a first document to be diagnosed, the method comprising:
a similar document specifying step of calculating the degree of similarity between the independent word group of each document stored in the document database and the independent word group stored in the data storage unit, specifying each document whose similarity is equal to or greater than a predetermined threshold as a similar document, and extracting the independent word group and the item name group of each specified similar document from the document database;
an item candidate extraction step of extracting, from the item name groups extracted in the similar document specifying step, each item name that matches a first independent word, the first independent word being an independent word included in the independent word group stored in the data storage unit; and
a step of specifying, among the item names extracted in the item candidate extraction step, each item name that is not included in the item name group stored in the data storage unit as an inconsistent item name.

(Appendix 8)
A correction support device comprising:
a database that stores, for each process, item definition data defining the item names related to the process and process detail data defining the content of the process;
an item extraction unit that reads first item definition data and first process detail data corresponding to the first item definition data from the database, and extracts each item name that is defined in the first item definition data but not included in the first process detail data as an inconsistent item name;
a replenishment data specifying unit that, for second process detail data that is among the process detail data stored in the database and includes the inconsistent item name, specifies the position at which the inconsistent item name appears in the second process detail data and stores position information representing that position in a storage device;
a corresponding position specifying unit that extracts from the database, as common item names, the item names defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifies the positions at which each common item name appears in the first process detail data and in the second process detail data, and stores position information representing those positions in the storage device in association with the common item name; and
a range specifying unit that, using the data stored in the storage device, specifies, among the common item names, the common item name appearing immediately before the inconsistent item name in the second process detail data as an immediately preceding item name and the common item name appearing immediately after it as an immediately following item name.

(Appendix 9)
The correction support device according to Appendix 8, further comprising:
a window generation unit that extracts, from the second process detail data, the data contained in a specific range before and after the inconsistent item name as window data, and repeats the extraction while varying the specific range; and
a search unit that searches, within the range of the first process detail data bounded by the immediately preceding item name and the immediately following item name, for the position with the highest similarity to each piece of window data extracted by the window generation unit, and specifies the position with the highest similarity among the positions found for the respective pieces of window data as the position where data on the inconsistent item name is missing.

(Appendix 10)
The correction support device according to Appendix 8, further comprising:
a window generation unit that identifies, before and after the inconsistent item name in the second process detail data, the data that marks a boundary between processing contents, and extracts the data lying between those boundaries as window data; and
a search unit that specifies, within the range of the first process detail data bounded by the immediately preceding item name and the immediately following item name, the position with the highest similarity to the window data extracted by the window generation unit as the position where data on the inconsistent item name is missing.

(Appendix 11)
The correction support device according to any one of Appendices 8 to 10, wherein the corresponding position specifying unit associates, following the order in which the common item names appear in the first item definition data, the position information indicating where each common item name appears in the first process detail data with the position information indicating where that common item name appears in the second process detail data.

(Appendix 12)
The correction support device according to any one of Appendices 8 to 11, wherein the second process detail data includes the inconsistent item name and has a high degree of similarity to the first process detail data.

(Appendix 13)
A correction support method executed by a computer, the method comprising:
a step of reading first item definition data and first process detail data corresponding to the first item definition data from a database that stores, for each process, item definition data defining the item names related to the process and process detail data defining the content of the process, and extracting each item name that is defined in the first item definition data but not included in the first process detail data as an inconsistent item name;
a step of specifying, for second process detail data that is among the process detail data stored in the database and includes the inconsistent item name, the position at which the inconsistent item name appears in the second process detail data, and storing position information representing that position in a storage device;
a corresponding position specifying step of extracting from the database, as common item names, the item names defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifying the positions at which each common item name appears in the first process detail data and in the second process detail data, and storing position information representing those positions in the storage device in association with the common item name; and
a step of specifying, using the data stored in the storage device and among the common item names, the common item name appearing immediately before the inconsistent item name in the second process detail data as an immediately preceding item name and the common item name appearing immediately after it as an immediately following item name.

(Appendix 14)
The correction support method according to Appendix 13, further comprising:
a window generation step of extracting, from the second process detail data, the data contained in a specific range before and after the inconsistent item name as window data, and repeating the extraction while varying the specific range; and
a step of searching, within the range of the first process detail data bounded by the immediately preceding item name and the immediately following item name, for the position with the highest similarity to each piece of window data extracted in the window generation step, and specifying the position with the highest similarity among the positions found for the respective pieces of window data as the position where data on the inconsistent item name is missing.

(Appendix 15)
The correction support method according to Appendix 13, further comprising:
a window generation step of identifying, before and after the inconsistent item name in the second process detail data, the data that marks a boundary between processing contents, and extracting the data lying between those boundaries as window data; and
a step of specifying, within the range of the first process detail data bounded by the immediately preceding item name and the immediately following item name, the position with the highest similarity to the window data extracted in the window generation step as the position where data on the inconsistent item name is missing.

(Appendix 16)
The correction support method according to any one of Appendices 13 to 15, wherein the corresponding position specifying step includes a step of associating, following the order in which the common item names appear in the first item definition data, the position information indicating where each common item name appears in the first process detail data with the position information indicating where that common item name appears in the second process detail data.

(Appendix 17)
The correction support method according to any one of Appendices 13 to 16, wherein the second process detail data includes the inconsistent item name and has a high degree of similarity to the first process detail data.

(Appendix 18)
A correction support program for causing a computer to execute:
a step of reading first item definition data and first process detail data corresponding to the first item definition data from a database that stores, for each process, item definition data defining the item names related to the process and process detail data defining the content of the process, and extracting each item name that is defined in the first item definition data but not included in the first process detail data as an inconsistent item name;
a step of specifying, for second process detail data that is among the process detail data stored in the database and includes the inconsistent item name, the position at which the inconsistent item name appears in the second process detail data, and storing position information representing that position in a storage device;
a corresponding position specifying step of extracting from the database, as common item names, the item names defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifying the positions at which each common item name appears in the first process detail data and in the second process detail data, and storing position information representing those positions in the storage device in association with the common item name; and
a step of specifying, using the data stored in the storage device and among the common item names, the common item name appearing immediately before the inconsistent item name in the second process detail data as an immediately preceding item name and the common item name appearing immediately after it as an immediately following item name.

(Appendix 19)
The correction support program according to appendix 18, further causing the computer to execute:
A window generation step of extracting from the second process detail data, as window data, the data included in a specific range before and after the inconsistent item name, the extraction being repeated while the specific range is varied; and
A step of searching, for each piece of window data extracted in the window generation step, the range of the first process detail data between the immediately preceding item name and the immediately following item name for the position having the highest similarity to that window data, and specifying, among the positions detected for the respective pieces of window data, the position having the highest similarity as the position where data relating to the inconsistent item name is missing.
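
A rough sketch of this variable-range window search follows. It rests on several assumptions that the appendix does not state: data is handled line by line, the window is the context around (not including) the line holding the inconsistent item name, the range is varied through a small set of radii, and similarity is measured with `difflib.SequenceMatcher` from the Python standard library. The names `make_windows` and `locate_gap` are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: list[str], b: list[str]) -> float:
    """Character-level similarity of two line groups (assumed measure)."""
    return SequenceMatcher(None, "\n".join(a), "\n".join(b)).ratio()


def make_windows(detail2: list[str], pos: int, radii=(1, 2, 3)):
    """Yield (before, after) context around line `pos` for each radius."""
    for r in radii:
        yield detail2[max(0, pos - r):pos], detail2[pos + 1:pos + 1 + r]


def locate_gap(detail1, lo, hi, detail2, pos, radii=(1, 2, 3)):
    """Best insertion index in detail1, restricted to lines lo..hi.

    `lo` and `hi` are the lines of detail1 holding the immediately
    preceding and immediately following item names.
    """
    best_gap, best_score = None, -1.0
    for before, after in make_windows(detail2, pos, radii):
        for gap in range(lo + 1, hi + 1):
            cand = (detail1[max(lo, gap - len(before)):gap]
                    + detail1[gap:min(hi + 1, gap + len(after))])
            score = similarity(before + after, cand)
            if score > best_score:
                best_gap, best_score = gap, score
    return best_gap, best_score


# Hypothetical sample data: "apply discount" is missing from detail1.
detail1 = ["read order_id", "look up customer", "check stock",
           "compute total", "print invoice"]
detail2 = ["read order_id", "look up customer", "check stock",
           "apply discount", "compute total", "print invoice"]

gap, score = locate_gap(detail1, lo=1, hi=3, detail2=detail2, pos=3)
```

Here `gap` is the index in `detail1` before which the missing description would be inserted. Taking a single maximum over all radii and positions is equivalent to finding the best position per window and then choosing the best among those.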

(Appendix 20)
The correction support program according to appendix 18, further causing the computer to execute:
A window generation step of specifying data representing a break in processing content both before and after the inconsistent item name in the second process detail data, and extracting, as window data, the data included in the range between those breaks; and
A step of specifying, within the range of the first process detail data between the immediately preceding item name and the immediately following item name, the position having the highest similarity to the window data extracted in the window generation step as the position where data relating to the inconsistent item name is missing.
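
The delimiter-based window of this appendix might be extracted as in the sketch below. What counts as "data representing a break in processing content" is not defined here, so the sketch assumes, purely for illustration, that a blank line marks such a break. The function names are hypothetical.

```python
def is_delimiter(line: str) -> bool:
    """Assumed break marker: a blank line."""
    return line.strip() == ""


def delimited_window(detail2: list[str], pos: int) -> list[str]:
    """Lines around `pos`, bounded by the nearest delimiters on each side."""
    start = pos
    while start > 0 and not is_delimiter(detail2[start - 1]):
        start -= 1
    end = pos
    while end + 1 < len(detail2) and not is_delimiter(detail2[end + 1]):
        end += 1
    return detail2[start:end + 1]


# Hypothetical sample data: line 3 holds the inconsistent item name.
detail2 = ["read order_id", "", "look up customer", "apply discount",
           "compute total", "", "print invoice"]
window = delimited_window(detail2, pos=3)
```

The resulting window would then be compared against candidate positions in the first process detail data, using whatever similarity measure the implementation adopts.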

(Appendix 21)
The correction support program according to any one of appendices 18 to 20, wherein the corresponding position specifying step includes a step of associating, in the order in which the common item names appear in the first item definition data, position information representing the position at which each common item name appears in the first process detail data with position information representing the position at which that common item name appears in the second process detail data.
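
As an illustrative sketch of this association, the following builds a table of position pairs ordered by the first item definition data. It assumes line-based data, substring matching and first occurrences only, none of which is stated in the appendix. All names are hypothetical.

```python
from typing import Optional


def first_line_with(name: str, lines: list[str]) -> Optional[int]:
    """Index of the first line mentioning `name`, or None."""
    for i, line in enumerate(lines):
        if name in line:
            return i
    return None


def associate_positions(defs1, common, detail1, detail2):
    """(name, position in detail1, position in detail2), in defs1 order."""
    pairs = []
    for name in defs1:  # iterate in item-definition order
        if name not in common:
            continue
        p1 = first_line_with(name, detail1)
        p2 = first_line_with(name, detail2)
        if p1 is not None and p2 is not None:
            pairs.append((name, p1, p2))
    return pairs


# Hypothetical sample data.
defs1 = ["order_id", "customer", "total"]
common = {"total", "order_id"}
detail1 = ["read order_id", "compute total"]
detail2 = ["read order_id", "apply discount", "compute total"]

table = associate_positions(defs1, common, detail1, detail2)
```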

(Appendix 22)
The correction support program according to any one of appendices 18 to 21, wherein the second process detail data includes the inconsistent item name and has a high degree of similarity to the first process detail data.

Description of Symbols
1 Input data processing unit
3 Input data storage unit
5 Item candidate extraction unit
7 Design document DB
9 Similar design document specifying unit
11 Similar design document storage unit
13 Item candidate storage unit
15 First inconsistent item specifying unit
17 First inconsistent item storage unit
19 Output unit
501 Matching item extraction unit
503 Matching item data storage unit
505 Narrowing unit
507 Common item data storage unit
509 Similar part data storage unit
511 Distance data storage unit
5051 Item definition comparison unit
5053 Appearance part comparison unit
101 Input processing unit
103 Input data storage unit
105 Design document DB
107 Similar design document specifying unit
109 Second inconsistent item specifying unit
111 Second inconsistent item storage unit
113 Similar design document storage unit
115 Supplementary sentence extraction unit
117 Supplementary sentence storage unit
119 Narrowing processing unit
1191 Line number list storage unit
1192 Corresponding position data storage unit
1193 Line number list generation unit
1194 Corresponding position specifying unit
1195 Range specifying unit
121 Narrowing processing result storage unit
123 Supplementary position determination unit
1231 Similarity storage unit
1232 Window generation unit
1233 Search unit
125 Output data storage unit
127 Output unit

Claims

1. A correction support device comprising:
A database that stores, for each process, item definition data defining item names related to the process and process detail data defining the content of the process;
An item extraction unit that reads, from the database, first item definition data and first process detail data corresponding to the first item definition data, and extracts, as an inconsistent item name, an item name that is defined in the first item definition data but not included in the first process detail data;
A supplementary data specifying unit that specifies, for second process detail data that includes the inconsistent item name among the process detail data stored in the database, the position at which the inconsistent item name appears in the second process detail data, and stores position information representing the position in a storage device;
A corresponding position specifying unit that extracts from the database, as a common item name, an item name defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifies the positions at which the common item name appears in the first process detail data and in the second process detail data, and stores position information representing those positions in the storage device in association with the common item name; and
A range specifying unit that specifies, from among the common item names and using the data stored in the storage device, an immediately preceding item name, which is the common item name appearing immediately before the inconsistent item name in the second process detail data, and an immediately following item name, which is the common item name appearing immediately after the inconsistent item name.

2. The correction support device according to claim 1, further comprising:
A window generation unit that extracts from the second process detail data, as window data, the data included in a specific range before and after the inconsistent item name, the extraction being repeated while the specific range is varied; and
A search unit that searches, for each piece of window data extracted by the window generation unit, the range of the first process detail data between the immediately preceding item name and the immediately following item name for the position having the highest similarity to that window data, and specifies, among the positions detected for the respective pieces of window data, the position having the highest similarity as the position where data relating to the inconsistent item name is missing.

3. The correction support device according to claim 1, further comprising:
A window generation unit that specifies data representing a break in processing content both before and after the inconsistent item name in the second process detail data, and extracts, as window data, the data included in the range between those breaks; and
A search unit that specifies, within the range of the first process detail data between the immediately preceding item name and the immediately following item name, the position having the highest similarity to the window data extracted by the window generation unit as the position where data relating to the inconsistent item name is missing.

4. The correction support device according to any one of claims 1 to 3, wherein the corresponding position specifying unit associates, in the order in which the common item names appear in the first item definition data, position information representing the position at which each common item name appears in the first process detail data with position information representing the position at which that common item name appears in the second process detail data.

5. The correction support device according to claim 1, wherein the second process detail data includes the inconsistent item name and has a high degree of similarity to the first process detail data.

6. A correction support method executed by a computer, the method comprising:
Reading, from a database that stores, for each process, item definition data defining item names related to the process and process detail data defining the content of the process, first item definition data and first process detail data corresponding to the first item definition data, and extracting, as an inconsistent item name, an item name that is defined in the first item definition data but not included in the first process detail data;
Specifying, for second process detail data that includes the inconsistent item name among the process detail data stored in the database, the position at which the inconsistent item name appears in the second process detail data, and storing position information representing the position in a storage device;
Extracting from the database, as a common item name, an item name defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifying the positions at which the common item name appears in the first process detail data and in the second process detail data, and storing position information representing those positions in the storage device in association with the common item name; and
Specifying, from among the common item names and using the data stored in the storage device, the common item name appearing immediately before the inconsistent item name in the second process detail data and the common item name appearing immediately after the inconsistent item name.

7. A correction support program for causing a computer to execute:
Reading, from a database that stores, for each process, item definition data defining item names related to the process and process detail data defining the content of the process, first item definition data and first process detail data corresponding to the first item definition data, and extracting, as an inconsistent item name, an item name that is defined in the first item definition data but not included in the first process detail data;
Specifying, for second process detail data that includes the inconsistent item name among the process detail data stored in the database, the position at which the inconsistent item name appears in the second process detail data, and storing position information representing the position in a storage device;
Extracting from the database, as a common item name, an item name defined in both the first item definition data and second item definition data corresponding to the second process detail data, specifying the positions at which the common item name appears in the first process detail data and in the second process detail data, and storing position information representing those positions in the storage device in association with the common item name; and
Specifying, from among the common item names and using the data stored in the storage device, the common item name appearing immediately before the inconsistent item name in the second process detail data and the common item name appearing immediately after the inconsistent item name.