JP6852002B2

JP6852002B2 - Data search method, data search device and program

Info

Publication number: JP6852002B2
Application number: JP2018023314A
Authority: JP
Inventors: 大石　巧; 洋花木
Original assignee: Hitachi GE Nuclear Energy Ltd
Current assignee: Hitachi GE Nuclear Energy Ltd
Priority date: 2018-02-13
Filing date: 2018-02-13
Publication date: 2021-03-31
Anticipated expiration: 2038-02-13
Also published as: JP2019139577A

Description

The present invention relates to an electronic data retrieval system.

Power plants must manage a vast and varied body of data: design information for buildings and equipment, specifications for power generation equipment, configuration information for equipment as actually installed, and maintenance information for buildings and equipment. Nuclear power plants in particular have a life cycle of 40 years or more, so their data must also be managed for 40 years or longer.

The techniques described in Patent Document 1 and Patent Document 2 are known for searching across information systems with mutually different data structures, so that a user can quickly and accurately find the desired data among the vast amount of data the user manages.

Patent Document 1 states: "A data search system that searches a plurality of search target data to which at least date and time information is added as additional data, comprising: search means for searching the plurality of search target data for keyword matching data containing a search keyword and extracting, as a linked search keyword, the date and time information included in the additional data added to the keyword matching data; linked search means for searching the plurality of search target data for linked matching data to which additional data approximating the linked search keyword is added; and output means for outputting the keyword matching data and the linked matching data."

Patent Document 2 states that it "comprises a character search unit 4 that searches for similar images based on character information attached to images, a feature quantification unit 5a that detects image feature information using the similar images found by the character search unit 4, and an image search unit 5 that searches for further similar images using the feature information."

Patent Document 1: Japanese Unexamined Patent Publication No. 2013-206387
Patent Document 2: Japanese Unexamined Patent Publication No. 2010-049300

Because the life cycle spans 40 years or more, existing power plants hold vast amounts of old data, some of which has no additional information (tags, metadata, and the like) at all. Even for the same data, the additional information that ought to be attached may itself change over time because of external factors such as revisions to the law.

Furthermore, the personnel who manage the additional information inevitably change over such a long period, and different people may attach different additional information to the same data. For these reasons, a data search that relies on additional information may fail to find data quickly and accurately.

Patent Document 1 presupposes that at least date and time information is added to the search target data as additional data. Patent Document 2 searches for images based on character information attached to the image data. In both cases, the search target is limited to data that has additional information.

Accordingly, an object of the present invention is to provide a data search system that searches across a wide variety of information systems with mutually different data structures even when the search target data has no additional information.

The present invention is a data search method in which a computer having a processor and a memory performs a search over first data and second data. The method includes: a first step in which the computer reads the first data and the second data; a second step in which the computer performs a correlation analysis on the data values of the data items of the first data and the data values of the data items of the second data, calculates the degree of relevance between the data items of the first data and the data items of the second data, and generates association information; a third step in which the computer receives a search keyword and searches for data items of the first data that match or partially match the received search keyword; a fourth step in which the computer searches the association information using the data items of the first data that match or partially match the search keyword, and identifies the data items of the second data whose degree of relevance to those data items of the first data is equal to or greater than a predetermined threshold; a fifth step in which the computer outputs the data items of the first data that match or partially match the search keyword, the identified data items of the second data, and the degree of relevance; a sixth step in which the computer receives a selection from among the output data items of the first data and data items of the second data; and an eighth step in which the computer changes the threshold depending on whether such a selection was made.

Therefore, for data that has no additional information, when data matching a search keyword exists in one information system, the present invention can identify data related to it in another information system. A search across information systems with mutually different data structures is thus possible even when the search target data has no additional information.

FIG. 1 shows Example 1 of the present invention and is a block diagram of an example of the functions of the data search system.
FIG. 2 shows Example 1 of the present invention and is a block diagram of an example of the hardware configuration of the data search device.
FIG. 3 shows Example 1 of the present invention and is a diagram of an example of the search keyword input screen.
FIG. 4 shows Example 1 of the present invention and is a diagram of an example of the search result display screen.
FIG. 5 shows Example 1 of the present invention and is a sequence diagram of an example of the processing performed by the data search device.
FIG. 6 shows Example 1 of the present invention and is a flowchart of an example of the correlation analysis.
FIG. 7 shows Example 1 of the present invention and is a flowchart of an example of the data search.
FIG. 8 shows Example 1 of the present invention and is a flowchart of an example of the refined search.
FIG. 9 shows Example 3 of the present invention and is a flowchart of an example of the feedback processing of search results.
FIG. 10 shows Example 1 of the present invention and is a diagram of an example of the data item list.
FIG. 11 shows Example 1 of the present invention and is a diagram of an example of the association information table.
FIG. 12 shows Example 1 of the present invention and is a diagram of an example of the correlation analysis data table.
FIG. 13 shows Example 1 of the present invention and is a diagram of an example of the search target data.
FIG. 14 shows Example 1 of the present invention and is a diagram of another example of the search target data.
FIG. 15 shows Example 4 of the present invention and is a flowchart of an example of the feedback processing of search results.

Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.

FIG. 1 is a block diagram showing an example of the functions of the data search system according to the present invention. The data search system includes business systems 1-A and 1-B, in which data 2-1 to 2-n are stored, and a data search device 100 connected to the business systems 1-A and 1-B via a network 16. In the following, when the business systems 1-A and 1-B need not be distinguished, the reference sign "1" is used, omitting the "-" and what follows it. The same applies to the reference signs of other components.

The data search device 100 is a computer that includes a data shaping unit 101, a data collection unit 102, a data item list 103, a correlation analysis unit 104, an association information table 105, a data API 107, a data search unit 108, and a correlation analysis data table 109. An outline of the data search system follows.

Before a user searches the data 2 of the business systems 1, the data search device 100 first associates the items of the data 2 held by the plurality of business systems 1 with one another. In the data search device 100, the data shaping unit 101 reads the data 2 held by the plurality of external business systems 1, sorts it by the time data it contains, and sends the sorted data 2 to the data collection unit 102.

The data collection unit 102 extracts only the items of the data 2 and registers them in the data item list 103. The correlation analysis unit 104 uses the values of the data 2 to compute correlation coefficients between data items, thereby obtaining the degree of relevance between data items, and stores the result in the association information table 105.

A user of the data search device 100 performs a data search by entering a search keyword in the GUI 106. The search keyword received by the data API 107 is passed to the data search unit 108. The data search unit 108 searches the entire data item list 103 with the search keyword for matching (or partially matching) data items (hereinafter, "matching items").

When a matching item is found, the data search unit 108 goes on to search the association information table 105 with the matching item, in order to search across the business systems 1-A and 1-B, and looks for data items with a high degree of relevance to the matching item (hereinafter, "related items"). The data search unit 108 displays the matching items and related items obtained in this way on the GUI 106 as the search result.

At this time, the data search unit 108 searches the external business systems 1 for each matching item and each related item, collects the data values, and displays them on the GUI 106. This reduces the number of searches the user has to run and helps the user obtain the desired data more quickly.

The data 2 of each business system 1 is managed by a database (DB), not shown, that runs on that business system 1.

FIG. 2 is a block diagram showing an example of the hardware configuration of the data search device 100. The data search device 100 is a computer that includes a CPU 10, a main memory 11, an external storage device 12, a network interface 13, an input device 14, and an output device 15.

The data search device 100 exchanges data with the external business systems 1 through the network interface 13. The programs of the data shaping unit 101, the data collection unit 102, the correlation analysis unit 104, the data API 107, and the data search unit 108 of FIG. 1 are loaded into the main memory 11 and executed by the CPU 10, and the results are output to the GUI 106. The data item list 103 and the association information table 105 are stored in the external storage device 12. The correlation analysis data table 109 is stored in the main memory 11 in this example, but it may instead be stored in the external storage device 12.

The output device 15 is a display or the like and shows the GUI 106. The input device 14 is a keyboard, mouse, touch panel, or the like and is used to operate the GUI 106.

The CPU 10 operates as a functional unit that provides a predetermined function by executing processing according to the program of each unit. For example, the CPU 10 functions as the data search unit 108 by executing processing according to the data search program. The same applies to the other programs. The CPU 10 also operates as functional units that provide the respective functions of the multiple processes each program executes. A computer and a computer system are a device and a system that include these functional units.

FIG. 3 shows an example of a search keyword input screen 200 on which the user enters a search keyword. The search keyword input screen 200 includes a list of the business systems 1 available for searching, buttons 207 for selecting whether each is actually searched, an area 208 that displays the search keyword history, a search keyword input area 201, areas 203-1 to 203-4 that display search keyword candidates in pull-down menus, an area 202 for entering a narrowing key, a search button 204, a narrowing execution button 205, an additional search button 206, and so on.

FIG. 4 shows an example of a search result display screen 210 that presents search results to the user. The search result display screen 210 includes a search result display area 214, an area 212 that displays the degree of relevance between the search keyword and each data item in the search result, selection buttons 211 for choosing the data items to output to a file, a file output execution button 213, link buttons 215 for opening the original database that contains a retrieved data item, and so on.

In this example, the file output execution button 213 outputs the list of search results shown in the search result display area 214 as a CSV file, but the output is not limited to this file format.

FIG. 5 is a sequence diagram showing an example of the processing performed by the data search system according to the present invention. The processing falls into two broad parts. The first connects the data 2 of the external business systems 1 to the data search device 100 and then calculates the degree of relevance between data items across the business systems 1-A and 1-B.

The second is the part in which the user searches for data. The first part runs up to the correlation analysis procedure 300, and the second starts from the search keyword input 301. The two parts are described in order below.

After the external business systems 1 are connected to the data search device 100, the data search device 100 reads all of the data 2 held by the external business systems 1 and calculates the degree of relevance between the data 2 using the correlation analysis procedure 300.

The correlation analysis procedure 300 is described in detail with reference to FIG. 6. A user may run a data search before this relevance calculation has finished, but in that case the device provides only the equivalent of an ordinary keyword search. That is, the user must enter a keyword appropriate to each external business system 1.

When the user enters a search keyword on the search keyword input screen 200 via the input device 14 and clicks the search button 204 (301), the search keyword is input to the data search device 100. The data search procedure 302 displays on the search result display screen both the data that matches the entered search keyword and the data that does not match but has a high degree of relevance (303). The data search procedure 302 is described in detail with reference to FIG. 7.

The user can then run another data search (308) or run a refined search 304. The user can also output the search result to a file (306) for use in another business system 1.

The refined search 304 and the accompanying filtering procedure 305 are described in detail with reference to FIG. 8. The feedback procedure 307 performed at file output 306 is described in detail in Example 3.

In FIG. 5 the GUI 106 and the data search device 100 are drawn separately for the sake of explanation, but as shown in FIGS. 1 and 2 the GUI 106 is a user interface provided by the data search device 100. Although Example 1 describes a user of the data search device 100 using the GUI 106, the GUI 106 can also be used from a computer connected to the data search device 100.

FIG. 6 is a flowchart of the procedure 300 for calculating the degree of relevance between data items across the business systems 1-A and 1-B. The case described here is one in which two business systems 1-A and 1-B are newly connected to the data search device 100; the same applies when three or more business systems 1 are newly connected. When another business system 1 is added while some business systems 1 are already connected, the procedure is the same except that no recalculation is performed among the already connected business systems 1.

For convenience, the two business systems 1-A and 1-B are referred to as business systems A and B (A and B in the figure). The data shaping unit 101 of the data search device 100 reads data from each of the business systems A and B (400, 401).

The data 2 of a business system 1 consists of data items (data names), each with a plurality of data values, and contains a plurality of data items. The data search device 100 selects one data item from the data 2 of each of the business systems A and B to form a pair of two data items. It generates such a pair for every combination of data items and calculates the correlation coefficient of every pair (402). When the data values are character strings, the data search device 100 must convert them to numbers before the correlation coefficient can be calculated.
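The pairing in step 402 amounts to a Cartesian product of the two systems' data items. The following is a minimal sketch; the item names and the function name `item_pairs` are hypothetical and do not appear in the patent.

```python
from itertools import product

# Hypothetical data item names; the patent does not list concrete ones.
items_a = ["pump_status", "valve_position"]
items_b = ["inspection_result", "work_order_state", "part_revision"]


def item_pairs(items_a, items_b):
    """Return every pair made of one item from system A and one from system B."""
    return list(product(items_a, items_b))


pairs = item_pairs(items_a, items_b)
print(len(pairs))  # number of pairs = len(items_a) * len(items_b)
```

The number of pairs grows as the product of the two item counts, which is one reason the text performs this analysis once, before any user search.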

A known or well-known technique may be used to calculate the correlation coefficient. One example is the definition in clause 1.23 of Japanese Industrial Standard JIS Z 8101-1, "Statistics - Terms and symbols - Part 1: General statistical terms and terms used in probability" (http://kikakurui.com/z8/Z8101-1-2015-01.html).
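For reference, and assuming the cited clause refers to the usual sample (Pearson) correlation coefficient, the coefficient for n paired observations of items p and q can be written as follows. The symbols p_i, q_i, and n are notation introduced here, not taken from the patent.

```latex
r_{pq} = \frac{\sum_{i=1}^{n} (p_i - \bar{p})(q_i - \bar{q})}
              {\sqrt{\sum_{i=1}^{n} (p_i - \bar{p})^2}\,\sqrt{\sum_{i=1}^{n} (q_i - \bar{q})^2}},
\qquad
\bar{p} = \frac{1}{n}\sum_{i=1}^{n} p_i,\quad
\bar{q} = \frac{1}{n}\sum_{i=1}^{n} q_i
```

The value lies between -1 and 1 and is undefined when either item is constant over the observations, because the denominator is then zero.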

A known or well-known technique may likewise be used to convert character strings to numbers. One such method is as follows. For each data item, the character strings that are its data values are numbered sequentially starting from 1, and strings that match exactly receive the same number. In the description below, wherever correlation analysis is said to be performed on character strings, the strings are first converted to numbers in this way.
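The numbering scheme described above can be sketched as follows. The function name `number_strings` is hypothetical, and numbering in order of first appearance is an assumption: the text says only that numbers are assigned sequentially from 1 and that identical strings share a number.

```python
def number_strings(values):
    """Number strings 1, 2, 3, ... in order of first appearance.

    Strings that match exactly receive the same number.
    """
    codes = {}
    numbered = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes) + 1  # next unused number
        numbered.append(codes[value])
    return numbered


print(number_strings(["open", "closed", "open", "tripped"]))  # [1, 2, 1, 3]
```

The numbers carry no ordering meaning of their own, so a correlation coefficient computed on them reflects co-occurrence of changes rather than magnitude.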

The correlation coefficient is calculated as follows. For the purposes of explanation, let p be a data item of business system A and q a data item of business system B (403). First, the data collection unit 102 of the data search device 100 sorts all of the data read for item p and item q in chronological order (404).

Next, the data collection unit 102 stores item p and item q in the data item list 103 together with the names of the DBs of their business systems 1. For items p and q, the data collection unit 102 also extracts each change timing 1101 (see FIG. 12) at which both data values change, along with the combination of data values before and after the change, and stores them in the correlation analysis data table 109 (405). The data value combinations are the combination of the data values of item p of business system A before and after the change timing (first combination) and the combination of the data values of item q of business system B before and after the change timing (second combination).

The correlation analysis unit 104 of Example 1 performs the correlation analysis around the times (or time windows) at which data values change (406). The analysis is restricted to change points because, when different data items are lined up by time and their value changes are compared, related data can be expected to change in the same way.

FIG. 12 shows an example of the correlation analysis data table 109 used by the correlation analysis unit 104. Each entry of the correlation analysis data table 109 contains a change timing 1101, a data value change 1102 of data item 1 (item p), and a data value change 1103 of data item 2 (item q).

In the illustrated example, at change timing 1101 = "T3", the value of item p (data value change 1102) changes from "1" to "3" and the value of item q (data value change 1103) changes from "100" to "101".

The correlation analysis unit 104 calculates a correlation coefficient from the change timing 1101 at which the data value of item p changes, the combination of the data values of item p before and after the change (data value change 1102), the change timing 1101 at which the data value of item q changes, and the combination of the data values of item q before and after the change (data value change 1103), and takes the value of this correlation coefficient as the degree of relevance between item p and item q (406). Finally, the correlation analysis unit 104 stores item p, item q, and the degree of relevance in the association information table 105 (407).
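Steps 405 to 407 can be sketched as follows. The patent does not specify exactly how the before and after values enter the coefficient, so this sketch makes a loudly labeled assumption: it correlates the size of the change (after minus before) of item p with that of item q at the timestamps where both change. All function names and the sample series are hypothetical.

```python
from math import sqrt


def changes(series):
    """Map each timestamp at which the value changes to its (before, after) pair.

    `series` is a time-sorted list of (timestamp, numeric value) tuples.
    """
    out = {}
    for (_, prev), (t, cur) in zip(series, series[1:]):
        if cur != prev:
            out[t] = (prev, cur)
    return out


def pearson(xs, ys):
    """Sample correlation coefficient; returns 0.0 where it is undefined."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def relevance(series_p, series_q):
    """Return (degree of relevance, rows shaped like the table of FIG. 12)."""
    cp, cq = changes(series_p), changes(series_q)
    common = sorted(set(cp) & set(cq))  # timings where both items change
    table = [(t, cp[t], cq[t]) for t in common]
    if len(common) < 2:
        return 0.0, table  # too few change points to correlate
    # Assumption: correlate the change amounts (after - before).
    dp = [after - before for _, (before, after), _ in table]
    dq = [after - before for _, _, (before, after) in table]
    return pearson(dp, dq), table


series_p = [(1, 1), (2, 1), (3, 3), (4, 3), (5, 6), (6, 4)]
series_q = [(1, 100), (2, 100), (3, 101), (4, 101), (5, 103), (6, 102)]
score, table = relevance(series_p, series_q)
print(table[0])  # (3, (1, 3), (100, 101)), the T3 row of the example
```

Returning 0.0 when fewer than two common change points exist is a design choice of this sketch, not something the text prescribes; an implementation could equally skip storing such pairs.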

The data search device 100 repeats the processing of steps 400 to 407 for every combination of a data item of business system A and a data item of business system B.

Through this processing, the values of the data items of business system A and of business system B are sorted in time series, the times or time windows at which the values of both systems change together are extracted, and the data search device 100 calculates the correlation coefficient of the two values and uses it as the degree of relevance. The data search device 100 then stores the data item of business system A, the data item of business system B, and the degree of relevance in the association information table 105. The association information table 105 thus holds combinations of data items whose values changed at the same time or in the same time window, together with their degree of relevance.

The change timing 1101 at which the values of two data items change together is not limited to an identical time; it may be a predetermined time window (for example 5 minutes, 1 hour, or 1 day) set as appropriate to the content and nature of the data 2 being searched.
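One simple way to treat changes within the same time window as simultaneous is to map each timestamp to the index of a fixed-size window. The sketch below assumes fixed, non-overlapping windows counted from an arbitrary origin; the patent does not say how windows are aligned, and the name `window_index` is hypothetical.

```python
from datetime import datetime, timedelta


def window_index(ts, window, origin=datetime(1970, 1, 1)):
    """Return the index of the fixed-size window that contains timestamp ts."""
    return (ts - origin) // window


five_min = timedelta(minutes=5)
a = datetime(2018, 2, 13, 10, 1)
b = datetime(2018, 2, 13, 10, 4)
c = datetime(2018, 2, 13, 10, 6)

# a and b fall in the same 5-minute window; c falls in the next one.
print(window_index(a, five_min) == window_index(b, five_min))
print(window_index(a, five_min) == window_index(c, five_min))
```

With fixed windows, two changes a few seconds apart can still land in different windows if they straddle a boundary, so the window size needs to be chosen with the data's update pattern in mind.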

FIG. 7 is a flowchart of the data search procedure 302, which runs on user input. On the search keyword input screen 200, the user of the data search device 100 either enters a search keyword in the search keyword input area 201 or selects a search candidate from the candidate list 203 presented by the data search device 100 (500).

The candidate list 203 of the search keyword input screen 200 is generated, for example, by referring to the data item list 103. The user can also specify the business systems 1-A and 1-B to select the data item list 103 from which candidates are drawn. The search keyword or search candidate entered in the GUI 106 is received by the data API 107 and passed to the data search unit 108.

The data search unit 108 searches the entire data item list 103 with the received search keyword or search candidate (501). In step 502, the data search unit 108 determines whether any data item was hit by the search keyword or search candidate. If no data item was found, the process returns to step 500 and the user is asked to enter a search keyword again.
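Steps 501 and 502 can be sketched as a substring match over the data item list. The entry layout (an item name plus the DB name of its business system) follows the columns the text attributes to FIG. 10; the field names, sample entries, and the function `search_items` are hypothetical, and treating "partial match" as a substring match is an assumption.

```python
# Hypothetical data item list: item name plus the DB name of its business system.
data_item_list = [
    {"item": "pump A outlet pressure", "db": "design_db"},
    {"item": "pump A inspection date", "db": "maintenance_db"},
    {"item": "valve B position", "db": "design_db"},
]


def search_items(item_list, keyword):
    """Return entries whose item name equals or contains the keyword."""
    return [entry for entry in item_list if keyword in entry["item"]]


hits = search_items(data_item_list, "pump A")
if not hits:
    print("no matching item; ask the user for another keyword")  # back to step 500
else:
    print([h["item"] for h in hits])
```

An exact match is a special case of the substring test, so one check covers both the "match" and "partial match" cases described in the text.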

When the search returns one or more data item names 1000 (see FIG. 10) together with the DB name 1001 of the business system 1, the data search unit 108 proceeds to step 503 and searches the association information table 105 using the data item name 1000 and the DB name 1001 from the search result.

The data search unit 108 compares the search result against the data item 900 and its DB name 901 of the business system 1 in the association information table 105. When both match, it acquires the paired data item 902, that item's DB name 903 of the business system 1, and the degree of relevance 904.

In step 504, the data search unit 108 determines whether the degree of relevance 904 acquired in step 503 is equal to or greater than a predetermined threshold. If it is, the process proceeds to step 505; otherwise, the process returns to step 500 and repeats the above processing. When the degree of relevance 904 is below the threshold, the device may also output to the GUI 106 that no data item related to the search keyword was found.

In step 505, the data search unit 108 generates the search result display screen 210 and outputs it to the GUI 106. The screen lists the matched data item 900 and its DB name 901 of the business system 1, together with the corresponding data item 902 and DB name 903 acquired from the association information table 105 and the degree of relevance 904. The entries may be sorted in descending order of the degree of relevance 904. The data item that matched the search keyword or search candidate, and the DB name of its business system 1, may also be output as search results.

データ検索部１０８は、これらデータ項目９００、９０２と業務システム１のＤＢ名称９０１、９０３を検索キーワードとして外部の業務システム１を検索し（図５のステップ３０２）、得られたデータ値を検索結果表示画面２１０に出力する（図５のステップ３０３）。これら検索結果の一例が図４の検索結果表示画面２１０である。 The data search unit 108 searches the external business system 1 using these data items 900 and 902 and the DB names 901 and 903 of the business system 1 as search keywords (step 302 in FIG. 5), and searches the obtained data values as search results. Output to the display screen 210 (step 303 in FIG. 5). An example of these search results is the search result display screen 210 of FIG.

上記処理によって、データ検索装置１００は、受け付けた検索キーワード（または検索候補）でデータ項目リスト１０３を検索し、データ項目名１０００と一致する場合には、当該データ項目名１０００とＤＢ名称１００１で関連付け情報表１０５を検索して、関連度９０４が閾値以上のデータ項目９０２とＤＢ名称９０３を取得する。 By the above processing, the data search device 100 searches the data item list 103 with the received search keyword (or search candidate), and if it matches the data item name 1000, associates the data item name 1000 with the DB name 1001. The information table 105 is searched to acquire the data item 902 and the DB name 903 whose relevance degree 904 is equal to or higher than the threshold value.

そして、データ検索部１０８は、業務システム１のＤＢ名称９０１、９０３からデータ項目９００、９０２のデータ値を取得して検索結果表示画面２１０に出力する。これにより、検索対象データが付加情報を持たない場合においても、データ構造が互いに異なる多種多様な業務システム１を横断して検索することが可能となる。 Then, the data search unit 108 acquires the data values of the data items 900 and 902 from the DB names 901 and 903 of the business system 1 and outputs them to the search result display screen 210. As a result, even when the search target data does not have additional information, it is possible to search across a wide variety of business systems 1 having different data structures.
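The flow of FIG. 7 (steps 501 to 505) can be sketched roughly as below. This is an illustrative sketch under assumptions: the `Association` class, the `search` function, the in-memory list representation of the data item list 103 and the association information table 105, and the use of substring matching for the keyword are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One entry of the association information table 105 (assumed layout)."""
    item_a: str       # data item 900
    db_a: str         # DB name 901
    item_b: str       # data item 902
    db_b: str         # DB name 903
    relevance: float  # degree of relevance 904


def search(keyword, item_list, associations, threshold):
    """Return associated entries for items matching `keyword`, most relevant first."""
    # Step 501: search the data item list (name, DB name) for the keyword.
    hits = {(name, db) for name, db in item_list if keyword in name}
    if not hits:
        return []  # Step 502: nothing matched; the caller asks for a new keyword.
    # Steps 503-504: look up the association table and keep entries at or above the threshold.
    related = [
        a for a in associations
        if (a.item_a, a.db_a) in hits and a.relevance >= threshold
    ]
    # Step 505: output in descending order of relevance.
    return sorted(related, key=lambda a: a.relevance, reverse=True)
```

An empty return value covers both the "no hit" case and the "no entry above the threshold" case; a real implementation might distinguish them in order to show the message mentioned for step 504.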

図８は、検索結果に対して、ユーザ入力による絞り込み検索３０４（図５）を実施するフローチャートである。この処理は、図７の検索処理の後に、所定の操作で検索キーワード入力画面２００を出力装置１５（ＧＵＩ１０６）に表示させてから実行される。あるいは、検索キーワード入力画面２００の下方に検索結果表示画面２１０を表示するようにしてもよい。 FIG. 8 is a flowchart for performing a narrowed search 304 (FIG. 5) by user input on the search result. This process is executed after displaying the search keyword input screen 200 on the output device 15 (GUI106) by a predetermined operation after the search process of FIG. 7. Alternatively, the search result display screen 210 may be displayed below the search keyword input screen 200.

ユーザが検索キーワード入力画面２００において、絞り込み入力領域２０２に絞り込みキーを入力し、絞り込み実行ボタン２０５をクリックすると、データＡＰＩ１０７がＧＵＩ１０６から絞り込みキーを受信する（６００）。 When the user inputs the narrowing key in the narrowing input area 202 on the search keyword input screen 200 and clicks the narrowing execution button 205, the data API 107 receives the narrowing key from the GUI 106 (600).

続いて、データ検索部１０８は、検索結果表示領域２１４から、絞り込みキーに一致するデータ項目を選択する（６０１）。最後に、データ検索部１０８は、選択されたもののみを検索結果表示領域２１４に表示し、選択されなかったものを検索結果表示領域２１４から削除する（６０２、図５のステップ３０５）。 Subsequently, the data search unit 108 selects a data item matching the narrowing key from the search result display area 214 (601). Finally, the data search unit 108 displays only the selected items in the search result display area 214, and deletes the unselected items from the search result display area 214 (602, step 305 in FIG. 5).
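The narrowing step (600 to 602) amounts to filtering the rows already displayed. A minimal sketch follows, assuming each displayed row is a tuple of (data item name, DB name, degree of relevance) and that matching means the narrowing key appears in the data item name; both are assumptions for illustration.

```python
def narrow(displayed_rows, narrowing_key):
    """Keep only the displayed rows whose data item name contains the narrowing key."""
    return [row for row in displayed_rows if narrowing_key in row[0]]
```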

図１０はデータ項目リスト１０３の一例である。データ収集部１０２が外部の業務システム１からデータ２を読み込んで抽出したデータ項目１０００と、当該業務システム１のＤＢ名称（ＤＢ名）１００１とを組にして保持するテーブルである。 FIG. 10 is an example of the data item list 103. This is a table in which the data item 1000 extracted by the data collecting unit 102 by reading the data 2 from the external business system 1 and the DB name (DB name) 1001 of the business system 1 are held as a set.

図１１は関連付け情報表１０５の一例である。異なる業務システム１−Ａ、１−Ｂから選択した２つのデータのそれぞれについて、業務システム１−Ａのデータ項目（データ名称）９００と業務システム１−ＡのＤＢ名称９０１と、業務システム１−Ｂのデータ項目（データ名称）９０２と業務システム１−ＢのＤＢ名称９０３と、データ項目９００、９０２の間の関連度９０４と、関連度の補正値９０５、をひとつのエントリに格納する。 FIG. 11 is an example of the association information table 105. For each of the two data selected from the different business systems 1-A and 1-B, the data item (data name) 900 of the business system 1-A, the DB name 901 of the business system 1-A, and the business system 1-B Data item (data name) 902, DB name 903 of business system 1-B, relevance degree 904 between data items 900 and 902, and relevance degree correction value 905 are stored in one entry.

関連度９０４は、値が１に近づくほどデータ項目９００、９０２間の関連性が高いことを示す。また、関連度の補正値９０５は、１．０〜−１．０の範囲の値となる。 The degree of relevance 904 indicates that the closer the value is to 1, the higher the relevance between the data items 900 and 902. Further, the correction value 905 of the degree of relevance is a value in the range of 1.0 to −1.0.

図１３と図１４は外部の業務システム１が保持するデータ２の一例である。本実施例１では、検索対象のデータとして、時刻や日付あるいはタイムスタンプ等の時系列情報と、データ項目名と、時系列情報に対応するデータ項目の値を含んでいれば良い。 13 and 14 are examples of data 2 held by the external business system 1. In the first embodiment, the data to be searched may include time-series information such as time, date, or time stamp, data item names, and data item values corresponding to the time-series information.

以上のように、本実施例１では、データ収集部１０２が検索対象の業務システム１のデータ２を読み込んで、データ項目名１０００と、データ２を格納するＤＢ名１００１を収集してデータ項目リスト１０３を生成する。 As described above, in the first embodiment, the data collection unit 102 reads the data 2 of the business system 1 to be searched, collects the data item name 1000 and the DB name 1001 for storing the data 2, and collects the data item list. Generate 103.

次に、データ収集部１０２は、業務システム１−Ａのデータ項目と、業務システム１−Ｂのデータ項目のそれぞれについて時系列でソートしてから、データ値が共に変化する変化タイミング１１０１を検出し、変化の前後のデータ値の組み合わせを抽出して相関分析データテーブル１０９を生成する。 Next, the data collection unit 102 sorts the data items of the business system 1-A and the data items of the business system 1-B in chronological order, and then detects the change timing 1101 in which the data values change together. , The combination of the data values before and after the change is extracted to generate the correlation analysis data table 109.

そして、相関分析部１０４は、データ値から業務システム１−Ａのデータ項目と、業務システム１−Ｂのデータ項目の関連度を算出して関連付け情報表１０５を生成する。 Then, the correlation analysis unit 104 calculates the degree of relevance between the data item of the business system 1-A and the data item of the business system 1-B from the data value, and generates the association information table 105.

このように、２つのデータ値が共に変化する変化タイミング１１０１を検出して、関連度９０４を算出することにより、検索対象データが付加情報をもたない場合においても、データ構造が互いに異なる多種多様な業務システムを横断して検索する情報を生成することができる。 In this way, by detecting the change timing 1101 in which the two data values change together and calculating the relevance degree 904, even when the search target data does not have additional information, the data structures are different from each other. It is possible to generate information to be searched across various business systems.

そして、データ検索装置１００は、受け付けた検索キーワードでデータ項目リスト１０３を検索し、データ項目名１０００と一致する場合には、当該データ項目名１０００とＤＢ名称１００１で関連付け情報表１０５を検索して、関連度９０４が閾値以上のデータ項目９０２とＤＢ名称９０３を取得する。 Then, the data search device 100 searches the data item list 103 with the received search keyword, and if it matches the data item name 1000, searches the association information table 105 with the data item name 1000 and the DB name 1001. , Acquire the data item 902 and the DB name 903 whose relevance degree 904 is equal to or higher than the threshold value.

そして、データ検索部１０８は、業務システム１−Ａ、１−ＢのＤＢ名称９０１、９０３からデータ項目９００、９０２のデータ値を取得して検索結果表示画面２１０に出力する。これにより、検索対象データが付加情報を持たない場合においても、データ構造が互いに異なる多種多様な業務システム１を横断して検索することが可能となる。 Then, the data search unit 108 acquires the data values of the data items 900 and 902 from the DB names 901 and 903 of the business systems 1-A and 1-B and outputs them to the search result display screen 210. As a result, even when the search target data does not have additional information, it is possible to search across a wide variety of business systems 1 having different data structures.

本実施例２では、データ検索装置１００のユーザが望むデータをより迅速に探し出せるように実施例１の検索、表示機能を拡張する例を示す。以下、特段の説明がない場合は実施例１と同様とする。 In the second embodiment, an example is shown in which the search and display functions of the first embodiment are expanded so that the user of the data search device 100 can find the desired data more quickly. Hereinafter, the same as in Example 1 will be applied unless otherwise specified.

As described with reference to FIG. 5 of the first embodiment, after the user has executed a search once, the data search device 100 can also perform a narrowed search 304 on the search result display area 214 (FIG. 4). In this embodiment, before or after the narrowed search 304, the user can also run an additional search, which adds a search keyword in order to widen the search.

検索結果表示領域２１４を参考にしつつこの追加検索と絞り込み検索を複数回繰り返し実行することでユーザは望むデータをより迅速に探し出すことが可能となる。以下追加検索について説明する。 By repeatedly executing the additional search and the refined search a plurality of times while referring to the search result display area 214, the user can find the desired data more quickly. The additional search will be described below.

The additional search is executed by entering the additional search keyword in the search keyword input area 201 of the search keyword input screen 200 shown in FIG. 3 and pressing the additional search button 206. The additional search keyword 202 may be an entirely new key, but usually the user picks, from the search results, a keyword to be examined in more detail and enters it. The execution procedure of the additional search is the same as in FIG. 7 of the first embodiment. The difference is that the contents of the previous search result display area 214 are retained rather than discarded, and the additional search results are output in addition to them.
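The difference from a fresh search, namely keeping the previous results and appending the new ones, can be sketched as below. The function name and the decision to drop rows that are already displayed are assumptions; the text only states that the earlier results are retained.

```python
def additional_search(previous_results, new_results):
    """Append new result rows to the previous ones, skipping rows already shown."""
    merged = list(previous_results)
    seen = set(previous_results)
    for row in new_results:
        if row not in seen:
            merged.append(row)
            seen.add(row)
    return merged
```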

また、図４の検索結果表示画面２１０においてリンクボタン２１５を押下することで、図１３や図１４のような、検索されたデータ項目を含んでいる元のデータベース（業務システム１のデータ２）を呼び出すことも可能となる。 Further, by pressing the link button 215 on the search result display screen 210 of FIG. 4, the original database (data 2 of the business system 1) including the searched data items as shown in FIGS. 13 and 14 can be obtained. It is also possible to call.

元のデータベースに存在しているデータ項目を追加検索キーワードとして追加検索も可能である。すなわち、リンクボタン２１５と追加検索ボタン２０６を組合せて検索することで、関連度の高いキーワードをたどってデータ検索が可能となる。 It is also possible to perform an additional search using data items that exist in the original database as additional search keywords. That is, by searching by combining the link button 215 and the additional search button 206, it is possible to search for data by tracing keywords having a high degree of relevance.

本実施例３では、データ検索装置１００のユーザの入力履歴を活用し、実施例１または実施例２の検索機能の精度向上を図る例を示す。以下、特段の説明がない場合は実施例１または実施例２と同様とする。 In the third embodiment, an example is shown in which the input history of the user of the data search device 100 is utilized to improve the accuracy of the search function of the first or second embodiment. Hereinafter, unless otherwise specified, the same applies to Example 1 or Example 2.

実施例１の図５で説明したように、ユーザが１度検索を実行した後、次のデータ検索を行う（３０８）こともできるし、さらに絞り込み検索３０４を行うこともできる。また、検索結果１２４を他の業務システム１で活用するため、ファイル出力３０６を実施することもできる。ファイル出力３０６の際のフィードバック手続き３０７の詳細について説明する。 As described with reference to FIG. 5 of the first embodiment, after the user executes the search once, the next data search can be performed (308), or the refined search 304 can be further performed. Further, in order to utilize the search result 124 in the other business system 1, the file output 306 can also be implemented. The details of the feedback procedure 307 at the time of the file output 306 will be described.

図９はユーザによる検索結果をもとに、データ項目間の関連付け情報表１０５を更新するフィードバック手続き３０７のフローチャートである。ユーザが検索結果表示画面２１０において、ファイル出力実行ボタン２１３をクリックし（図５の３０６）、データＡＰＩ１０７がＣＳＶ出力要求を受信する（７００）。 FIG. 9 is a flowchart of the feedback procedure 307 that updates the association information table 105 between data items based on the search result by the user. The user clicks the file output execution button 213 on the search result display screen 210 (306 in FIG. 5), and the data API 107 receives the CSV output request (700).

データ検索装置１００のデータ検索部１０８は、ユーザが検索結果表示画面２１０上で、選択ボタン２１１によって選択したデータ項目に正の補正値、たとえば＋０．１点を、選択しなかったデータ項目に負の補正値、たとえば−０．１点を与える（７０１）。この時、ユーザごとに点数を可変にしてもよい。たとえば、ユーザが熟練者であれば、検索結果の選択が正解とみなして重みを増して＋０．３点、−０．３点とすることが可能である。最後に、図１１の関連付け情報表１０５において、前記データ項目に該当するエントリの補正値９０５に前記補正値を加算する（７０２）。 The data search unit 108 of the data search device 100 sets a positive correction value, for example, +0.1 points for the data item selected by the user with the selection button 211 on the search result display screen 210, and negatively for the data item not selected. A correction value of, for example, −0.1 points is given (701). At this time, the score may be variable for each user. For example, if the user is an expert, the selection of the search result can be regarded as the correct answer and the weight can be increased to +0.3 points and -0.3 points. Finally, in the association information table 105 of FIG. 11, the correction value is added to the correction value 905 of the entry corresponding to the data item (702).

This correction value 905 is applied in the data search procedure 302 driven by user input. Specifically, in step 504 of the flowchart of FIG. 7, the correction value 905 is added to the degree of relevance 904 obtained from the search, and the sum is compared with the threshold. In this way, the user's judgment of the search results is reflected in the association between data items, which improves search accuracy.
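The feedback of steps 701 and 702 and its use in step 504 can be sketched as follows. The function names, the dictionary keyed by entry, and the clamping of the accumulated correction to the range -1.0 to 1.0 are assumptions; the step sizes 0.1 and 0.3 are the example values given in the text.

```python
def apply_feedback(corrections, displayed, selected, expert=False):
    """Add +step to selected entries and -step to unselected ones (steps 701-702)."""
    step = 0.3 if expert else 0.1  # example values from the description
    for key in displayed:
        delta = step if key in selected else -step
        updated = corrections.get(key, 0.0) + delta
        # Assumption: keep the correction value 905 within -1.0 to 1.0.
        corrections[key] = max(-1.0, min(1.0, updated))
    return corrections


def passes_threshold(relevance, correction, threshold):
    """Step 504 with feedback: compare relevance plus correction with the threshold."""
    return relevance + correction >= threshold
```

With a threshold of 0.7, an entry with relevance 0.65 that the user selected once (correction +0.1) now passes, while an entry with relevance 0.75 that was left unselected (correction -0.1) no longer does.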

なお、ユーザが熟練者か否かの判定は、ユーザの識別子に応じてユーザの属性を予め設定したテーブルを用いれば良く、ユーザの属性に応じて閾値を補正する値を変更することができる。なお、ユーザの属性に応じて閾値を変更してもよい。 It should be noted that the determination of whether or not the user is an expert may be performed by using a table in which the user's attributes are preset according to the user's identifier, and the value for correcting the threshold value can be changed according to the user's attributes. The threshold value may be changed according to the user's attributes.
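A lookup of the kind described, from user identifier to attribute to correction step, might look like the sketch below. The table contents, the identifiers, and the attribute names are hypothetical; only the 0.3 and 0.1 step sizes come from the example in the text.

```python
# Hypothetical user attribute table: user identifier -> attribute.
USER_ATTRIBUTES = {"u001": "expert", "u002": "general"}

# Correction step per attribute (0.3 and 0.1 are the example values in the text).
STEP_BY_ATTRIBUTE = {"expert": 0.3, "general": 0.1}


def correction_step(user_id):
    """Return the correction step for a user; unknown users are treated as general."""
    attribute = USER_ATTRIBUTES.get(user_id, "general")
    return STEP_BY_ATTRIBUTE[attribute]
```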

実施例１において、図７で説明した、検索結果として出力するデータ項目の範囲を決める閾値を、ユーザの入力履歴を活用し変更する手順を説明する。 In the first embodiment, the procedure for changing the threshold value for determining the range of the data items to be output as the search result, which is described with reference to FIG. 7, by utilizing the input history of the user will be described.

閾値を増やすと、検索結果表示画面２１０に出力するデータ項目の数が減り、閾値を減らすと、検索結果表示画面２１０に出力するデータ項目の数が増える。閾値を変化させ、出力するデータ項目数を適切に保つことにより、ユーザがＣＳＶ出力するデータを選択する負荷を軽減することが可能となる。 Increasing the threshold reduces the number of data items output to the search result display screen 210, and decreasing the threshold increases the number of data items output to the search result display screen 210. By changing the threshold value and keeping the number of data items to be output appropriately, it is possible to reduce the load on the user to select the data to be output in CSV format.

図１５はその具体的手順を示すフローチャートである。データ検索装置１００のユーザが検索結果表示画面２１０において、ファイル出力実行ボタン２１３をクリックした際に、ユーザが選択ボタン２１１により選択したデータ項目数が、あらかじめ定めた同時選択限度数未満の場合は閾値を０．０１減らし、同時選択限度数以上の場合は閾値を０．０１増やす。これにより、ユーザの入力履歴をステップ５０４の判定処理にフィードバックすることができる。 FIG. 15 is a flowchart showing the specific procedure. When the user of the data search device 100 clicks the file output execution button 213 on the search result display screen 210, if the number of data items selected by the user by the selection button 211 is less than the predetermined simultaneous selection limit, the threshold value is set. Is reduced by 0.01, and if the number is equal to or greater than the simultaneous selection limit, the threshold is increased by 0.01. As a result, the user's input history can be fed back to the determination process in step 504.
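The threshold update of FIG. 15 reduces to a single comparison. A minimal sketch, assuming the threshold is held as a floating-point number and that the function and parameter names are hypothetical:

```python
def adjust_threshold(threshold, selected_count, selection_limit, step=0.01):
    """Lower the threshold when few items were selected, raise it otherwise."""
    if selected_count < selection_limit:
        return threshold - step  # show more items next time
    return threshold + step      # show fewer items next time
```

The function would be called each time the file output execution button 213 is clicked, and its result used as the threshold in step 504 for subsequent searches.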

<Summary>
The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above are described in detail to explain the present invention in an easy-to-understand manner, and the invention is not necessarily limited to configurations that include every element described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, addition, deletion, or replacement of other configurations may each be applied alone or in combination.

また、上記の各構成、機能、処理部、及び処理手段等は、それらの一部又は全部を、例えば集積回路で設計する等によりハードウェアで実現してもよい。また、上記の各構成、及び機能等は、プロセッサがそれぞれの機能を実現するプログラムを解釈し、実行することによりソフトウェアで実現してもよい。各機能を実現するプログラム、テーブル、ファイル等の情報は、メモリや、ハードディスク、ＳＳＤ（ＳｏｌｉｄＳｔａｔｅＤｒｉｖｅ）等の記録装置、または、ＩＣカード、ＳＤカード、ＤＶＤ等の記録媒体に置くことができる。 Further, each of the above configurations, functions, processing units, processing means and the like may be realized by hardware by designing a part or all of them by, for example, an integrated circuit. Further, each of the above configurations, functions, and the like may be realized by software by the processor interpreting and executing a program that realizes each function. Information such as programs, tables, and files that realize each function can be stored in a memory, a hard disk, a recording device such as an SSD (Solid State Drive), or a recording medium such as an IC card, an SD card, or a DVD.

また、制御線や情報線は説明上必要と考えられるものを示しており、製品上必ずしも全ての制御線や情報線を示しているとは限らない。実際には殆ど全ての構成が相互に接続されていると考えてもよい。 In addition, the control lines and information lines indicate those that are considered necessary for explanation, and do not necessarily indicate all the control lines and information lines in the product. In practice, it can be considered that almost all configurations are interconnected.

1-A, 1-B Business system
2-1 to 2-n Data
100 Data search device
101 Data shaping unit
102 Data collection unit
103 Data item list
104 Correlation analysis unit
105 Association information table
107 Data API
108 Data search unit
109 Correlation analysis data table

Claims

A data search method in which a computer having a processor and a memory searches for the first data and the second data.
The first step in which the computer reads the first data and the second data,
The computer performs a correlation analysis from the data value of the data item of the first data and the data value of the data item of the second data, and performs a correlation analysis between the data item of the first data and the data of the second data. The second step of calculating the relevance of items and generating association information,
A third step in which the computer accepts a search keyword and searches for a data item of the first data that matches or partially matches the accepted search keyword,
A fourth step in which the computer searches the association information using the data item of the first data that matches or partially matches the search keyword, and identifies a data item of the second data whose degree of relevance to that data item of the first data is equal to or higher than a predetermined threshold,
A fifth step in which the computer outputs the data item of the first data that matches or partially matches the search keyword, the data item of the specified second data, and the degree of relevance.
A sixth step in which the computer accepts the selection of the output data item of the first data and the data item of the second data.
An eighth step in which the computer changes the threshold value depending on the presence or absence of the selection.
A data retrieval method characterized by including.

The data search method according to claim 1.
The first data and the second data include time series information, respectively.
The second step is
The data values of the first data and the second data are sorted in the order of the time series information, and the timing at which both the data values of the first data and the data values of the second data change is extracted. Steps and
A step of generating a data value of the first data before the change timing of the data value and a data value after the change timing as a first set, and
A step of generating a data value of the second data before the change timing and a data value after the change timing as a second set, and
A step of calculating the correlation coefficient between the first set and the second set and using the correlation coefficient as the degree of relevance,
A data retrieval method characterized by including.

The data search method according to claim 1.
A ninth step in which the computer receives a user's identifier, acquires a preset attribute according to the identifier, and changes the threshold value according to the attribute.
A data retrieval method characterized by further including.

A data search device that has a processor and memory and performs a search using the first data and the second data.
The first data and the second data are read, correlation analysis is performed from the data value of the data item of the first data and the data value of the data item of the second data, and the first data And the correlation analysis unit that calculates the degree of relevance between the data item of the second data and the data item of the second data and generates the association information.
A search unit that accepts a search keyword, searches for a data item of the first data that matches or partially matches the search keyword, searches the association information using that data item of the first data, and identifies the data item of the second data related to the data item of the first data,
wherein the search unit
identifies a data item of the second data whose degree of relevance to the data item of the first data that matches or partially matches the search keyword is equal to or higher than a predetermined threshold value,
The data item of the first data that matches or partially matches the search keyword, the data item of the specified second data, and the degree of relevance are output, and the output data of the first data. A data search device that accepts selection of an item and a data item of the second data, and changes the threshold value according to the presence or absence of the selection.

The data retrieval device according to claim 4.
The first data and the second data include time series information, respectively.
The correlation analysis unit
The data values of the first data and the second data are sorted in the order of the time series information, and the timing at which both the data values of the first data and the data values of the second data change is extracted. ,
The data value before the change timing of the data value of the first data and the data value after the change timing are generated as the first set.
A data value before the change timing of the data value of the second data and a data value after the change timing are generated as a second set.
A data retrieval apparatus characterized in that a correlation coefficient between the first set and the second set is calculated and the correlation coefficient is used as the degree of relevance.

The data retrieval device according to claim 4.
The search unit
A data retrieval device that receives a user's identifier, acquires a preset attribute according to the identifier, and changes the threshold value according to the attribute.

A program that executes a search for the first data and the second data on a computer having a processor and memory.
The first step of reading the first data and the second data, and
Correlation analysis is performed from the data value of the data item of the first data and the data value of the data item of the second data, and the degree of relevance between the data item of the first data and the data item of the second data. The second step of calculating and generating the association information,
A third step of accepting a search keyword and searching for a data item of the first data that matches or partially matches the accepted search keyword.
A fourth step of searching the association information using the data item of the first data that matches or partially matches the search keyword, and identifying a data item of the second data whose degree of relevance to that data item of the first data is equal to or higher than a predetermined threshold, and
A fifth step of outputting the data item of the first data that matches or partially matches the search keyword, the data item of the specified second data, and the degree of association.
A sixth step in which the computer accepts the selection of the output data item of the first data and the data item of the second data.
An eighth step in which the computer changes the threshold value depending on the presence or absence of the selection.
A program for causing the computer to execute.