JP2019087199A

JP2019087199A - Data management system, data management method, and data management program

Info

Publication number: JP2019087199A
Application number: JP2017217286A
Authority: JP
Inventors: 恵介畑崎; Keisuke Hatasaki; 敬太郎上原; Keitaro Uehara
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2017-11-10
Filing date: 2017-11-10
Publication date: 2019-06-06
Anticipated expiration: 2037-11-10
Also published as: JP6725476B2

Abstract

To properly manage data acquired from a data source. SOLUTION: A server 110 and a data store apparatus 130, which manage data obtainable from predetermined data sources 102, include: a data store 140 that stores data from the data sources 102; a utilization calculation unit that calculates, on the basis of statistical information on the contents of the data, a utilization degree indicating how effective the data of the data sources 102 is for use in data analysis; and an action execution unit that, when an action condition including a condition on the calculated utilization degree is satisfied, executes on the data a predetermined processing operation corresponding to that action condition. The utilization calculation unit may calculate the utilization degree for the data of a data source in a predetermined period. SELECTED DRAWING: Figure 1

Description

The present invention relates to a data management system and the like that manage data obtainable from a data source.

There is a growing effort to accumulate data such as operation information from various devices and systems, video and audio captured by cameras, and readings collected from sensors mounted on people and devices, and to extract important information from that data through visualization and analysis.

However, sensor data and operation logs are generated frequently and in enormous volume. If all of this data is accumulated and retained, it becomes difficult to find the appropriate data among that volume when the data is to be used, and the cost of maintaining the data increases.

Not all such data is necessarily used for analysis, so methods of reducing data that is unlikely to be used should be considered.

One known approach to this problem is to reduce the amount of accumulated data by performing stream data processing or the like before the data is stored, selecting and aggregating the data at that stage. Another known approach is to retain and accumulate all data first, and then periodically move to a backup environment, or delete, data for which a certain period has elapsed since accumulation and data that has not been used for a certain period. For example, the technique described in Patent Document 1 is known as a technique for moving infrequently accessed data to a lower tier.

Patent Document 1: JP 2010-257094 A

With the approach of performing stream data processing before storage and selecting and aggregating the data to reduce the amount accumulated, necessary data may be discarded during selection or aggregation, so important data may be unavailable when the data is later used.

With the approach of moving or deleting data for which a certain period has elapsed since accumulation, or data that has not been used for a certain period, the difficulty is that the data needed for later use cannot always be judged appropriately from storage age or usage frequency alone. For example, when analyzing past data, data that has never been used may suddenly be needed, long after it was accumulated, in order to analyze a past situation. In such a case, the necessary data may already have been deleted, or it may be difficult to find where it is stored.

The present invention has been made in view of the above circumstances, and its object is to provide a technique capable of appropriately managing data acquired from a data source.

To achieve the above object, a data management system according to one aspect manages data obtainable from a predetermined data source, and includes: a storage unit that stores data from the data source; a utilization calculation unit that calculates, based on statistical information on the contents of the data, a utilization degree indicating how effective the data of the data source is for use in data analysis; and an action execution unit that, when an action condition including a condition on the calculated utilization degree is satisfied, executes on the data a predetermined processing operation corresponding to the action condition.

According to the present invention, data acquired from a data source can be appropriately managed.

FIG. 1 is an overall configuration diagram of a computer system according to a first embodiment.
FIG. 2 is a functional block diagram of a data management program according to the first embodiment.
FIG. 3 is a configuration diagram of a data source management table according to the first embodiment.
FIG. 4 is a configuration diagram of utilization degree data according to the first embodiment.
FIG. 5 is a configuration diagram of metadata according to the first embodiment.
FIG. 6 is a configuration diagram of catalog data according to the first embodiment.
FIG. 7 is a configuration diagram of an action definition table according to the first embodiment.
FIG. 8 is a flowchart of metadata management processing according to the first embodiment.
FIG. 9 is a flowchart of data acquisition processing according to the first embodiment.
FIG. 10 is a flowchart of utilization degree calculation processing according to the first embodiment.
FIG. 11 is a diagram explaining the utilization degree calculation according to the first embodiment.
FIG. 12 is a flowchart of action management processing according to the first embodiment.
FIG. 13 is a flowchart of action execution processing according to the first embodiment.
FIG. 14 is a flowchart of catalog management processing according to the first embodiment.
FIG. 15 is a diagram showing an example of a data source search screen according to the first embodiment.
FIG. 16 is a diagram showing an example of a catalog evaluation screen according to the first embodiment.
FIG. 17 is an overall configuration diagram of a computer system according to a second embodiment.
FIG. 18 is a diagram showing the contents stored in relation data according to the second embodiment.
FIG. 19 is a diagram explaining the relevance calculation according to the second embodiment.
FIG. 20 is an overall configuration diagram of a computer system according to a third embodiment.
FIG. 21 is a configuration diagram of a user management table according to the third embodiment.
FIG. 22 is a configuration diagram of metadata according to the third embodiment.

Several embodiments are described with reference to the drawings. The embodiments described below do not limit the invention as claimed, and not all of the elements described in the embodiments, nor all of their combinations, are necessarily essential to the solution provided by the invention.

In the following description, information may be described using the expression "AAA table", but the information may be expressed in any data structure. That is, to indicate that the information does not depend on a particular data structure, an "AAA table" may also be called "AAA information".

FIG. 1 is an overall configuration diagram of a computer system according to the first embodiment.

The computer system 1 includes one or more devices (also called assets) 101, a gateway 103, a server 110, a data store apparatus 130, and a backup data store 160. These components are connected by, for example, a wired or wireless network. The assets 101 may range from small items to large ones (such as construction machinery), and may include things called equipment, facilities, or devices. The gateway 103 and the backup data store 160 need not be included in the computer system 1. The server 110 and the data store apparatus 130 together constitute the data management system. In FIG. 1 the server 110 and the data store apparatus 130 are separate, but the present invention is not limited to this, and the two may be implemented on a single computer.

The asset 101 includes, for example, one or more data sources 102. A data source 102 may be a sensor that sequentially outputs time-series data, or a storage device that stores various operation logs for the asset 101.

The gateway 103 communicably connects the data sources 102 of the asset 101 to the server 110. For example, the gateway 103 may have a function of notifying the server 110 when new data is generated at a data source 102, and a function of responding to inquiries from the server 110 about whether new data has been generated at a data source 102.

The server 110 is, for example, a computer, and includes a CPU (Central Processing Unit) 111, a memory 120, a storage device 112, and a network adapter 113.

The CPU 111 executes various processes by running programs stored in the memory 120. The network adapter 113 connects the server 110 to the network so that it can communicate with other apparatuses over the network. The memory 120 is, for example, a RAM (Random Access Memory), and stores programs executed by the CPU 111 and various data used by the CPU 111. In this embodiment, the memory 120 stores a data management program 121 for managing data, a data source management table 122, and an action definition table 123, each of which is described in detail later. The storage device 112 is a non-transitory (nonvolatile) storage device such as an HDD (Hard Disk Drive) or an SSD (Solid State Drive), and stores programs executed by the CPU 111 and various information.

The data store apparatus 130 is, for example, a computer, and includes a CPU 131, a memory 132, a storage device 134, and a network adapter 135.

The CPU 131 executes various processes by running programs stored in the memory 132. The network adapter 135 connects the data store apparatus 130 to the network so that it can communicate with other apparatuses over the network. The memory 132 is, for example, a RAM, and stores programs executed by the CPU 131 and various data used by the CPU 131. In this embodiment, the memory 132 stores a data store control program 133 that controls the data store. The data store control program 133 is described later.

The storage device 134 is a non-transitory (nonvolatile) storage device such as an HDD or an SSD, and stores programs executed by the CPU 131 and various information. In this embodiment, the storage device 134 holds a data store 140 and management data 150. The data store 140 stores one or more pieces of data 141 acquired from one or more data sources 102, and catalogs 142, each of which groups one or more pieces of data 141. The management data 150 stores utilization degree data 151, metadata 152, and catalog data 153.

FIG. 2 is a functional block diagram of the data management program according to the first embodiment.

The data management program 121 contains programs that, when executed by the CPU 111, implement the following functional units: a data acquisition unit 201, a utilization calculation unit 202, a metadata management unit 203, an action management unit 204, an action execution unit 205, a catalog management unit 206 as an example of an evaluation value reception unit, and a data search unit 207 as an example of an input reception unit and a display control unit. The processing performed by each functional unit is described later.

Next, the data source management table 122 stored in the memory 120 of the server 110 is described.

FIG. 3 is a configuration diagram of the data source management table according to the first embodiment.

The data source management table 122 manages the data sources 102 of the one or more assets 101 and stores one entry per data source 102. Each entry includes a data source id column 301, a data division column 302, a data type/unit column 303, a corresponding object column 304, a target section column 305, a principal component analysis target data id column 306, and a calculation execution trigger column 307.

The data source id column 301 stores the id (identifier) of the data source corresponding to the entry (the data source id). The data source id may simply be a string of alphanumeric characters, but for ease of data utilization it may instead be information indicating the type or name of the data source. In this embodiment, for example, the data source id "Asset1:Sensor1" indicates that the data source is the sensor 102 whose id is "Sensor1", mounted on the asset 101 whose id is "Asset1".
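As a minimal sketch of the id convention described above, the following Python splits a data source id of the form "Asset1:Sensor1" into its asset and sensor parts. The names `DataSourceId` and `parse_data_source_id` are hypothetical, and the strict two-part format is an assumption; the text also allows ids that are plain alphanumeric strings.

```python
from typing import NamedTuple


class DataSourceId(NamedTuple):
    """Hypothetical container for the two parts of a data source id."""
    asset_id: str
    sensor_id: str


def parse_data_source_id(raw: str) -> DataSourceId:
    """Split an id of the form '<asset id>:<sensor id>' into its two parts."""
    asset_id, sep, sensor_id = raw.partition(":")
    if not sep or not asset_id or not sensor_id:
        raise ValueError(f"expected '<asset id>:<sensor id>', got {raw!r}")
    return DataSourceId(asset_id, sensor_id)


parsed = parse_data_source_id("Asset1:Sensor1")
print(parsed.asset_id, parsed.sensor_id)
```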

The data division column 302 stores the division (category) of the data source corresponding to the entry. Divisions include "measured value", indicating a value measured by a sensor or the like; "set value", indicating a constant value that is set for the asset 101 manually or by similar means and does not change over time; "character string", indicating a string; "label value", indicating a label; and "binary", indicating a binary value.

The data type/unit column 303 stores the data type and unit of the data of the data source corresponding to the entry. For example, if the data is temperature information in degrees Celsius, the column stores "'Temperature': '°C'". The corresponding object column 304 stores the identifier of the object, within the data store apparatus 130, that holds the data of the data source corresponding to the entry.

The target section column 305 stores the section (target section) over which the utilization degree calculation processing, described later, calculates the utilization degree of the data of the data source corresponding to the entry. The target section may be expressed, for example, in msec (milliseconds). If the target section is 3,600,000, for example, the utilization degree calculation processing is executed on each one-hour block of data. The contents of the target section column 305 make it possible to set an appropriate target section for each data source.
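The target section can be illustrated with a short sketch that groups time-stamped samples into sections of 3,600,000 ms. The function name `group_by_target_section` is hypothetical, and aligning sections to multiples of the section length is an assumption; the text does not say where a section starts.

```python
from collections import defaultdict

TARGET_SECTION_MS = 3_600_000  # one hour, as in the example above


def group_by_target_section(samples, section_ms=TARGET_SECTION_MS):
    """Group (timestamp_ms, value) pairs by the target section they fall in.

    Returns a dict mapping each section's start time (ms) to its values.
    Assumption: sections are aligned to multiples of section_ms.
    """
    if section_ms <= 0:
        raise ValueError("section_ms must be positive")
    sections = defaultdict(list)
    for timestamp_ms, value in samples:
        start = (timestamp_ms // section_ms) * section_ms
        sections[start].append(value)
    return dict(sections)


samples = [(0, 20.1), (1_800_000, 20.4), (3_600_000, 21.0), (7_199_999, 21.3)]
sections = group_by_target_section(samples)
print(sections)
```

Each resulting group would then be the input to one run of the utilization degree calculation.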

The principal component analysis target data id column 306 stores the id (reference data id) of the data source data that serves as the reference for calculating the factor loading in principal component analysis. This column is unnecessary when the utilization degree is not calculated by principal component analysis. The calculation execution trigger column 307 stores the trigger for executing the utilization degree calculation (the calculation execution trigger). For example, "continuous execution" is stored when the calculation is executed for every target section, and "execute daily at 00:00:00" is stored when the utilization degree calculation processing is executed at 0:00 every day. Taking the load on the server 110 and the data store apparatus 130 into account, the trigger may also be specified as a time when system resources are available, or as an instruction from the system administrator. If the data source corresponding to the entry does not need utilization degree calculation processing, a value such as "do not calculate" may be stored to indicate that it is excluded from calculation.
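One way to picture an entry of the data source management table is as a small record type. The sketch below is illustrative only: the field names, the object identifier "obj-0001", and the helper `needs_utilization_calc` are hypothetical, and the trigger strings are English renderings of the examples in the text.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class DataSourceEntry:
    """Hypothetical record mirroring columns 301 to 307."""
    data_source_id: str                   # column 301
    data_division: str                    # column 302
    data_type_unit: Dict[str, str]        # column 303
    corresponding_object: str             # column 304
    target_section_ms: int                # column 305
    pca_reference_data_id: Optional[str]  # column 306 (None if PCA is unused)
    calc_trigger: str                     # column 307


def needs_utilization_calc(entry: DataSourceEntry) -> bool:
    """Return False for entries marked as excluded from calculation."""
    return entry.calc_trigger != "do not calculate"


entry = DataSourceEntry(
    data_source_id="Asset1:Sensor1",
    data_division="measured value",
    data_type_unit={"Temperature": "°C"},
    corresponding_object="obj-0001",      # placeholder identifier
    target_section_ms=3_600_000,
    pca_reference_data_id=None,
    calc_trigger="continuous execution",
)
print(needs_utilization_calc(entry))
```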

Next, the utilization degree data 151 stored in the storage device 134 of the data store apparatus 130 is described.

FIG. 4 is a configuration diagram of the utilization degree data according to the first embodiment.

The storage device 134 holds utilization degree data 151 for each data source 102. FIG. 4 shows an example of the utilization degree data for a single data source, "Asset1:Sensor1".

The utilization degree data 151 includes one entry per target section, indicating the utilization degree for that section. Each entry includes an ID column 401, a target section column 402, an acquisition frequency column 403, a missing rate column 404, a fluctuation rate column 405, a standard deviation column 406, a factor loading column 407, an other index column 408, and a utilization degree column 409.

The ID column 401 stores a serial ID corresponding to the entry. The target section column 402 stores the start and end positions between which the data for the target section was sampled. In this embodiment, the start and end positions are time information (for example, year, month, day, hour, minute, and second).

Columns 403 to 408 store indexes (statistical information and the like) calculated from the contents (values) of the data sampled in the target section given in the target section column 402. The acquisition frequency column 403 stores the data acquisition frequency (for example, times/sec) of the data source corresponding to the entry. The missing rate column 404 stores the rate at which invalid values appear (the missing rate) in the data source corresponding to the entry. Invalid values may include, for example, absent data, a value indicating "no value" (that is, that the data could not be acquired), and a value outside the range the sensor can report. The fluctuation rate column 405 stores the fluctuation rate of the data values of the data source corresponding to the entry; how it is calculated is described later. The standard deviation column 406 stores the standard deviation of the data of the data source corresponding to the entry. The factor loading column 407 stores the factor loading with respect to the data identified by the reference data id in the data source management table 122 entry for this data source. The other index column 408 stores other indexes, such as the mean, maximum, and minimum of the data and the interval at which missing values appear (the missing interval). The utilization degree column 409 stores the utilization degree of the data in the target section corresponding to the entry. How the stored utilization degree is calculated is described later.
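Several of the indexes listed above can be sketched with the Python standard library. This is an illustration under stated assumptions, not the patent's calculation: `section_indexes` is a hypothetical name, missing samples are represented as `None`, all samples (valid or not) count toward the acquisition frequency, and the population standard deviation is used. The fluctuation rate, factor loading, and utilization degree are omitted because their calculation is described elsewhere in the document.

```python
import math
import statistics


def section_indexes(values, section_seconds, valid_range):
    """Compute per-section indexes from raw samples.

    values: list of floats, with None marking a sample that could not be acquired.
    valid_range: (low, high) the sensor can report; values outside are invalid.
    """
    low, high = valid_range
    valid = [
        v for v in values
        if v is not None and not math.isnan(v) and low <= v <= high
    ]
    n = len(values)
    return {
        "acquisition_frequency": n / section_seconds,          # times/sec
        "missing_rate": (n - len(valid)) / n if n else 0.0,
        "standard_deviation": statistics.pstdev(valid) if valid else 0.0,
        "mean": statistics.fmean(valid) if valid else None,
        "max": max(valid, default=None),
        "min": min(valid, default=None),
    }


# Four samples in four seconds: one missing, one outside the sensor's range.
indexes = section_indexes([20.0, 22.0, None, 500.0], 4.0, (-40.0, 125.0))
print(indexes)
```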

Next, the metadata 152 stored in the storage device 134 of the data store apparatus 130 is described.

FIG. 5 is a configuration diagram of the metadata according to the first embodiment.

The storage device 134 holds metadata 152 for each data source 102. FIG. 5 shows an example of the metadata for a single data source, "Asset1:Sensor1", written in JSON (JavaScript Object Notation; JavaScript is a registered trademark) format.

The metadata 152 describes the following items for the data of the corresponding data source: "data source id", "owner", "creation date", "update date", "last access date", "data type", "data unit", "utilization degree", and "corrected". Not all of these items are mandatory. Other items useful for data utilization may be added to the metadata 152, and new attributes may be added to the metadata 152 at any time.

"Data source id" indicates the data source id of the data source corresponding to the metadata 152. Specifically, it consists of an "asset id", which identifies the asset, and a "sensor id", which identifies the sensor. In the example of FIG. 5, the "asset id" is "Asset1" and the "sensor id" is "Sensor1".

"Owner" indicates the owner of the data of the data source corresponding to the metadata 152; in the example of FIG. 5, the data is owned by the user with owner id "44123". "Creation date" indicates the date on which the object holding the data was created, "update date" the date on which that object was last updated, and "last access date" the date on which that object was last accessed. "Data type" and "data unit" indicate the type and unit of the data, and are set to the values obtained from the data source management table 122. "Utilization degree" indicates the utilization degree of the data. "Corrected" indicates whether processing such as data interpolation has been performed on the data: it is set to "Yes" if such processing has been performed and to "No" otherwise.
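The metadata items above might be represented as follows. This is a sketch only: the English key names, the dates, and the utilization values are placeholders rather than the contents of FIG. 5, and only the asset id, sensor id, and owner id come from the text. The helper `with_utilization` is hypothetical.

```python
import copy
import json

# Key names and all values other than the ids and the owner are placeholders.
metadata = {
    "data_source_id": {"asset_id": "Asset1", "sensor_id": "Sensor1"},
    "owner": "44123",
    "created": "2017-11-01",
    "updated": "2017-11-01",
    "last_accessed": "2017-11-01",
    "data_type": "Temperature",
    "data_unit": "°C",
    "utilization": 0.0,
    "corrected": "No",
}


def with_utilization(meta, utilization, updated):
    """Return a copy of the metadata with a new utilization degree and update date."""
    new_meta = copy.deepcopy(meta)
    new_meta["utilization"] = utilization
    new_meta["updated"] = updated
    return new_meta


refreshed = with_utilization(metadata, 0.73, "2017-11-10")
print(json.dumps(refreshed, ensure_ascii=False, indent=2))
```

Returning a copy rather than mutating in place keeps the previous metadata available for comparison; whether the actual system does so is not stated.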

Next, the catalog data 153 stored in the storage device 134 of the data store apparatus 130 is described.

FIG. 6 is a configuration diagram of the catalog data according to the first embodiment.

The storage device 134 holds catalog data 153 for each catalog 142. FIG. 6 shows an example of the catalog data for a single catalog, "Catalog 1", written in JSON format.

The catalog data 153 describes the following items for the corresponding catalog: "catalog id", "creator", "creation date", "update date", "last access date", "evaluation", "data list", "creator role", and "description". Not all of these items are mandatory. Other items useful for data utilization may be added to the catalog data 153, and new attributes may be added to the catalog data 153 at any time.

"Catalog id" indicates the id of the catalog 142; in the example of FIG. 6, the "catalog id" is "Catalog1".

"Creator" indicates the creator of the catalog 142 corresponding to the catalog data 153; in the example of FIG. 6, the catalog was created by the user with id "3323". "Creation date" indicates the date on which the catalog 142 was created, "update date" the date on which it was last updated, and "last access date" the date on which it was last accessed. "Evaluation" indicates the result of users' evaluation of whether the catalog 142 is effective for data utilization. "Data list" indicates the list of data sources belonging to the catalog 142. The data list may also contain references to data other than data sources, such as files and objects; in FIG. 6, file names such as "filename1.aaa" are specified under "file". "Creator role" indicates the role of the creator of the catalog 142. Based on this role information, the catalog management unit 206 can automatically add points to the "evaluation" of a catalog created by an expert. "Description" gives a description of the catalog 142, which users can refer to when using the catalog.
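The automatic bonus for expert-created catalogs could look like the sketch below. The key names, the role label "expert", the starting evaluation, and the bonus of 1.0 are all assumptions made for illustration; the text says only that the catalog management unit 206 can add points based on the creator role.

```python
import copy

# Placeholder catalog data; only the ids and the file name come from the text.
catalog = {
    "catalog_id": "Catalog1",
    "creator": "3323",
    "evaluation": 3.0,
    "data_list": [
        {"data_source": "Asset1:Sensor1"},
        {"file": "filename1.aaa"},
    ],
    "creator_role": "expert",
    "description": "placeholder description",
}


def apply_creator_role_bonus(cat, bonus=1.0, expert_roles=("expert",)):
    """Return a copy of the catalog data with the evaluation raised for expert creators."""
    new_cat = copy.deepcopy(cat)
    if new_cat.get("creator_role") in expert_roles:
        new_cat["evaluation"] = new_cat.get("evaluation", 0.0) + bonus
    return new_cat


boosted = apply_creator_role_bonus(catalog)
print(boosted["evaluation"])
```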

Next, the action definition table 123 stored in the memory 120 of the server 110 will be described.

FIG. 7 is a configuration diagram of the action definition table according to the first embodiment.

The action definition table 123 manages the actions (processing operations) to be executed based on conditions (action conditions) that include a condition on the utilization degree, and stores one entry for each pair of a condition and an action. Each entry of the action definition table 123 includes an ID column 701, a name column 702, a condition column 703, an action content column 704, and a determination timing column 705.

The ID column 701 stores the ID of the action definition. The name column 702 stores the name of the action definition corresponding to the entry. The condition column 703 stores the condition (action condition) for executing the action of the action definition corresponding to the entry. In addition to a condition on the utilization degree, an action condition may include a condition on the statistical information of the data.

The action content column 704 stores the content of the action that is executed when the action condition in the condition column 703 of the entry is met.

The determination timing column 705 stores the timing at which the condition in the condition column 703 of the entry is evaluated. The determination timing may be, for example, daily (such as at 0:00 every day), monthly, or at the time of a data update. In consideration of the load on the server 110 and the data store apparatus 130, a time at which the system has spare resources may be specified as the determination timing, or the time at which the administrator of the system (the server 110 and the data store apparatus 130) issues an instruction may be specified as the determination timing.

In the first entry (row) of the action definition table 123, the action condition is that, for the data as a whole (all target sections), the last update date is at least one year ago and the utilization degree is 10 or less. The action is to move one year of data of the target data source to the archive data store. With this action definition, data can be moved to the archive in consideration of the utilization degree and not merely the last update date and time.

In the second entry (row) of the action definition table 123, the action condition is that, for the data of the latest target section, the utilization degree is a predetermined value (for example, 30) or less and the update frequency is a predetermined time (1 sec) or less, that is, the data is updated at intervals of 1 sec or less. The action is to keep the average value for each 1 sec of the data in this target section and move the data of this target section to the archive. With this action definition, data with a low utilization degree can be thinned out before being retained, which reduces the amount of data kept in the data store 140.

In the third entry (row) of the action definition table 123, the action condition is that the utilization degree is 50% or more, the loss rate is a predetermined value (5%) or less, and the value of a predetermined data source is a predetermined value or less. The action is to fill in each missing value with the average of the values immediately before and after it. With this action definition, missing values in data with a high utilization degree can be filled in appropriately. The condition that the value of the predetermined data source is the predetermined value or less need not be included in the action condition.
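The three entries described above can be sketched as a small in-memory table. This is only an illustration: the class `ActionDefinition`, the entry names, the timing strings, and the keys of the statistics mapping are all hypothetical, and the third entry omits the optional condition on the value of a predetermined data source.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class ActionDefinition:
    """Hypothetical in-memory form of one entry of the action definition table 123."""
    id: int                                            # ID column 701
    name: str                                          # name column 702 (names are made up)
    condition: Callable[[Mapping[str, float]], bool]   # condition column 703
    action: str                                        # action content column 704
    timing: str                                        # determination timing column 705 (assumed)


ACTION_TABLE = [
    ActionDefinition(
        1, "archive-stale",
        lambda s: s["days_since_update"] >= 365 and s["utilization"] <= 10,
        "move one year of data to the archive data store", "daily"),
    ActionDefinition(
        2, "downsample",
        lambda s: s["utilization"] <= 30 and s["update_interval_sec"] <= 1.0,
        "keep 1-second averages and move the section data to the archive", "daily"),
    ActionDefinition(
        3, "impute",
        lambda s: s["utilization"] >= 50 and s["loss_rate"] <= 0.05,
        "fill missing values with the mean of the neighboring values", "on data update"),
]


def matching_actions(stats: Mapping[str, float]) -> list:
    """Return the names of all definitions whose condition holds for stats."""
    return [d.name for d in ACTION_TABLE if d.condition(stats)]


stale = {"days_since_update": 400, "utilization": 8,
         "update_interval_sec": 60.0, "loss_rate": 0.0}
print(matching_actions(stale))
```

Encoding each condition as a predicate keeps the utilization threshold and the statistical thresholds in one place, which mirrors how the condition column combines both kinds of condition.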

Next, the processing operations of the computer system 10 according to the first embodiment will be described.

FIG. 8 is a flowchart of the metadata management process according to the first embodiment.

The metadata management process is executed by the metadata management unit 203. It may be performed periodically, for example.

The metadata management unit 203 acquires data source management information (step 2031). The data source management information may be obtained, for example, by importing it from an asset management system (such as EAM: Enterprise Asset Management) residing in an external system (not shown) or from a definition file, or by receiving it from a user via a GUI. The data source management information consists of the various items registered in the data source management table 122, for example the data source id, the data category (such as measured value or set value), the data type/unit, the target section required for calculating the utilization degree, the id of the reference data used when principal component analysis is performed, and the calculation execution trigger for the utilization degree calculation.

The metadata management unit 203 updates the data source management table 122 based on the acquired data source management information (step 2032).

Next, in the data store 140 of the data store apparatus 130, the metadata management unit 203 creates an object corresponding to the data source for which the data source management information was acquired in step 2031 (referred to as the corresponding data source in the description of this process), and stores the identification information of the created object in the data store 140 in the corresponding object column 304 of the entry for the corresponding data source in the data source management table 122 (step 2033). If an object corresponding to the data source already exists in the data store 140, no new object needs to be created.

The metadata management unit 203 then generates and stores the metadata 152 for the corresponding data source (step 2034). The generated metadata 152 is metadata such as that shown in FIG. 5. In terms of the contents of FIG. 5, "data source id", "data type", "data unit", and the like are set to the information obtained from the data source management table 122. "Creation date" is set to the time at which the object was newly created. "Update date" and "last access date" are updated based on information obtained when the data store 140 detects an update of, or an access to, the object. "Utilization degree" is set to a predetermined default value such as 0.5. "Corrected" is set to "No", indicating uncorrected, immediately after creation. Any other useful attributes may also be generated when this step is executed.
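As a minimal sketch of step 2034, the initial metadata record could be assembled as follows. The function name `build_metadata`, the dictionary keys, and the sample values are assumptions made for illustration; only the default utilization degree of 0.5 and the initial "No" for "corrected" come from the text.

```python
from datetime import datetime, timezone

DEFAULT_UTILIZATION = 0.5  # default value mentioned in the text


def build_metadata(source_row, created_at=None):
    """Build the initial metadata record for one data source (hypothetical layout)."""
    created = created_at or datetime.now(timezone.utc)
    return {
        # copied from the data source management table entry
        "data_source_id": source_row["data_source_id"],
        "data_type": source_row["data_type"],
        "data_unit": source_row["data_unit"],
        # time at which the object was newly created
        "created": created,
        # filled in later, when the data store detects an update or an access
        "updated": None,
        "last_accessed": None,
        "utilization": DEFAULT_UTILIZATION,
        "corrected": "No",  # uncorrected immediately after creation
    }


row = {"data_source_id": "Asset1:Sensor2", "data_type": "temperature", "data_unit": "degC"}
meta = build_metadata(row, created_at=datetime(2017, 11, 10, tzinfo=timezone.utc))
print(meta["utilization"], meta["corrected"])
```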

FIG. 9 is a flowchart of the data acquisition process according to the first embodiment.

The data acquisition process is executed by the data acquisition unit 201. It is executed, for example, when the generation of new data in a data source 102 is detected. The generation of new data in a data source 102 may be detected, for example, by receiving a notification from the Gateway 103 or by querying the Gateway 103. The data acquisition unit 201 may also execute multiple data acquisition processes in parallel in order to acquire data from multiple data sources 102.

The data acquisition unit 201 acquires data from a data source 102 of the Asset 101 (step 2011). Next, based on the id of the data source 102 from which the data was acquired (referred to as the corresponding data source in the description of this process) and the contents of the data source management table 122, the data acquisition unit 201 instructs that the object in the data store 140 corresponding to that data source be updated with the acquired data (step 2012). The data acquisition unit 201 then updates the metadata 152 of the corresponding data source, for example its "update date" (step 2013).

FIG. 10 is a flowchart of the utilization degree calculation process according to the first embodiment.

The utilization degree calculation process is executed by the utilization degree calculation unit 202. It is executed periodically, for example, for each data source that has an entry registered in the data source management table 122.

First, the utilization degree calculation unit 202 checks the execution condition of the utilization degree calculation (step 2021). Specifically, it checks the content of the calculation execution trigger column 307 of the entry in the data source management table 122 that corresponds to the data source being processed.

The content of the calculation execution trigger column 307 is a condition such as "continuous execution" or "execute every day at 00:00:00". In consideration of the load on the server 110 and the data store apparatus 130, it is also possible to specify that the calculation is triggered when system resources are available, or by an instruction from the system administrator. For example, when the trigger is the availability of system resources, the utilization degree calculation unit 202 may monitor information such as the resource utilization rate of the system and execute the utilization degree calculation (the processing from step 2023 onward) when one resource, or a combination of multiple resources, of the system is at or below a certain threshold. For example, the calculation may be executed when the utilization rate of the CPU 131 of the hardware running the data store 140 (in this example, the data store apparatus 130) is a predetermined value or less (for example, 30% or less). If the corresponding data source is excluded from the utilization degree calculation, that is, if the calculation execution trigger column 307 of the entry for the data source being processed indicates that no calculation is to be performed, the processing from step 2023 onward is not executed.
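The trigger check can be sketched as a single decision function. The trigger strings ("continuous", "daily", "idle", "admin", "none") and the function signature are hypothetical encodings of the calculation execution trigger column 307; the 30% CPU threshold is the example value from the text.

```python
from datetime import datetime

CPU_THRESHOLD = 0.30  # 30 %, the example threshold given in the text


def should_calculate(trigger, now, cpu_utilization, admin_requested=False):
    """Decide whether the utilization degree calculation should run for one data source."""
    if trigger == "none":          # data source excluded from the calculation
        return False
    if trigger == "continuous":    # "continuous execution"
        return True
    if trigger == "daily":         # "execute every day at 00:00:00"
        return (now.hour, now.minute, now.second) == (0, 0, 0)
    if trigger == "idle":          # run only when the CPU has spare capacity
        return cpu_utilization <= CPU_THRESHOLD
    if trigger == "admin":         # run on an administrator's instruction
        return admin_requested
    raise ValueError(f"unknown trigger: {trigger}")


print(should_calculate("idle", datetime(2019, 6, 6, 12, 0, 0), cpu_utilization=0.25))
```

A real scheduler would compare against a time window rather than an exact second for the daily trigger; the exact match is kept here only to stay close to the wording of the condition.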

Next, the utilization degree calculation unit 202 determines whether the execution condition checked in step 2021 is satisfied (step 2022). If the execution condition is not satisfied (step 2022: No), the process returns to step 2021 to continue checking the condition.

If the execution condition is satisfied (step 2022: Yes), the utilization degree calculation unit 202 acquires, from the data source management table 122, the target section of the corresponding data source and the identification information of the object in the data store 140 that corresponds to that data source, and then acquires the data of the corresponding data source in the acquired target section (hereinafter, target section data) (step 2023).

Next, the utilization degree calculation unit 202 calculates the utilization degree based on the acquired data and holds the calculated utilization degree and the related information as the utilization degree data 151 (step 2024). Here, the related information is, for example, the information set in columns 401 to 408 of the utilization degree data 151.

The calculation of the utilization degree by the utilization degree calculation unit 202 is described in detail below. In the following description, the target section data is assumed to be time series data (a sequence of pairs of a time and a value), and the utilization degree data 151 is assumed to have the content shown in FIG. 4.

The utilization degree calculation unit 202 calculates the acquisition frequency, the loss rate, and the change rate by the following equations (1), (2), and (3).

Acquisition frequency [times/sec] = 1 / average time [s] between one time and the next time in the target section data ... (1)
Loss rate = number of invalid values in the target section data / total number of data rows ... (2)
Change rate = difference between the values at successive times in the data rows of the target section data / total number of data rows ... (3)

The utilization degree calculation unit 202 also calculates the standard deviation, average value, maximum value, minimum value, loss interval, and factor loading for the values in the data rows of the target section data.

The loss interval is the average of the differences between a time that has an invalid value and the next time that has an invalid value. The factor loading is the factor loading of the corresponding data source obtained when principal component analysis is performed on the data whose id is stored in the principal component analysis target data id column 306 of the data source management table 122 for the corresponding data source, with the data sources of the same Asset (the device whose data source id is Asset1) as the variables.
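Equations (1) to (3) and the loss interval can be sketched for a small time series as follows. Several points are assumptions: invalid values are represented as `None`, the numerator of equation (3) is read as the sum of absolute differences between consecutive valid values, and the section is assumed to contain at least two rows. The function and key names are hypothetical.

```python
from statistics import mean


def section_stats(rows):
    """Compute per-section statistics.

    rows: list of (timestamp_sec, value) pairs ordered by time;
          value is None when the reading is invalid (assumed encoding).
    """
    times = [t for t, _ in rows]
    valid = [v for _, v in rows if v is not None]
    gaps = [b - a for a, b in zip(times, times[1:])]
    bad_times = [t for t, v in rows if v is None]
    bad_gaps = [b - a for a, b in zip(bad_times, bad_times[1:])]
    total_change = sum(abs(b - a) for a, b in zip(valid, valid[1:]))
    return {
        "acquisition_frequency": 1.0 / mean(gaps),         # equation (1)
        "loss_rate": len(bad_times) / len(rows),           # equation (2)
        "change_rate": total_change / len(rows),           # equation (3), assumed reading
        "loss_interval": mean(bad_gaps) if bad_gaps else None,
    }


sample = [(0, 1.0), (2, 1.5), (4, None), (6, 1.0), (8, None), (10, 2.0)]
print(section_stats(sample))
```

The standard deviation, average, maximum, and minimum mentioned in the text could be added with `statistics.pstdev`, `mean`, `max`, and `min` over the valid values; the factor loading requires a principal component analysis and is outside the scope of this sketch.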

The utilization degree calculation unit 202 calculates the utilization degree using at least one of the items of related information described above. The utilization degree may be calculated by a method that the user selects from among multiple calculation methods, or by a method that the user defines.

For example, for data sources that are not set values (those whose data category column 302 in the data source management table 122 is not "set value"), when a data source whose value changes little is to be given a low utilization degree and a data source whose value changes a great deal is to be given a high utilization degree, the utilization degree may be obtained by equation (4).

Utilization degree = (α × change rate / β × acquisition frequency + γ) × standard deviation ... (4)
Here, α, β, and γ are preset constants.

To make data sources with a low loss rate easier to select (that is, to raise their utilization degree), the utilization degree may be calculated by multiplying the right side of equation (4) by (1 - loss rate).
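Equation (4) and the optional (1 - loss rate) factor can be sketched as below. The grouping of terms in equation (4) is not unambiguous as written; this sketch assumes the reading (α × change rate / (β × acquisition frequency) + γ) × standard deviation, and the function name and parameter names are hypothetical.

```python
def utilization_degree(change_rate, acquisition_frequency, std_dev,
                       alpha=1.0, beta=1.0, gamma=1.0, loss_rate=None):
    """Equation (4) under one assumed grouping of its terms.

    Assumed reading: (alpha * change / (beta * frequency) + gamma) * std_dev.
    When loss_rate is given, the result is scaled by (1 - loss_rate).
    """
    score = (alpha * change_rate / (beta * acquisition_frequency) + gamma) * std_dev
    if loss_rate is not None:
        score *= (1.0 - loss_rate)  # favors data sources with few invalid values
    return score


# Illustrative inputs only; a higher loss rate lowers the score.
print(utilization_degree(0.2, 0.5, 0.35))
print(utilization_degree(0.2, 0.5, 0.35, loss_rate=0.1))
```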

After calculating the related information and the utilization degree, the utilization degree calculation unit 202 adds or updates the information on the target section, the related information, and the utilization degree in the utilization degree data 151 of the corresponding data source.

Next, the utilization degree calculation unit 202 updates the utilization degree in the metadata 152 of the corresponding data source based on the utilization degrees stored in the utilization degree data 151 of that data source (step 2025). In this embodiment, the utilization degree in the metadata 152 is, for example, the average of the utilization degrees of all sections of the corresponding data source. The utilization degree of the latest section may be used as the utilization degree in the metadata 152 instead.
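The two ways of deriving the metadata value in step 2025 can be sketched as follows; the function name and the `mode` strings are hypothetical.

```python
from statistics import mean


def metadata_utilization(section_scores, mode="mean"):
    """Collapse per-section utilization degrees into one metadata value.

    section_scores: utilization degrees per target section, oldest first.
    mode: "mean" averages all sections; "latest" takes the newest section.
    """
    if not section_scores:
        raise ValueError("no sections have been scored yet")
    if mode == "mean":
        return mean(section_scores)
    if mode == "latest":
        return section_scores[-1]
    raise ValueError(f"unknown mode: {mode}")


scores = [10.0, 20.0, 60.0]
print(metadata_utilization(scores), metadata_utilization(scores, mode="latest"))
```

Averaging smooths out a single unusual section, whereas taking the latest section reacts faster when a data source recently became more or less informative.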

Next, a specific example of the calculation of the utilization degree by the utilization degree calculation unit 202 will be described.

FIG. 11 is a diagram for explaining the utilization degree calculation according to the first embodiment.

In the example of FIG. 11, the change rate is calculated by equation (3) above and the utilization degree is calculated by equation (4). The constants α, β, and γ in equation (4) are each set to 1.

FIG. 11 shows an example in which the utilization degree is calculated for the time series data of the data source 1102 (data source Asset1:Sensor2) and the data source 1104 (data source Asset3:Sensor5). In this example, the time series data is a sequence of rows, each consisting of a time and the value at that time (for example, a sensor measurement).

When the related information and the utilization degree are calculated for the data source 1102, as shown in the calculation result 1103, the standard deviation is 0.35, the acquisition frequency is 0.5/sec, the change rate is 0.2, and the utilization degree is 0.75.

When the related information and the utilization degree are calculated for the data source 1104, as shown in the calculation result 1105, the standard deviation is 31, the acquisition frequency is 0.5/sec, the change rate is 20.7, and the utilization degree is 72.3.

Comparing the utilization degrees of the data source 1102 and the data source 1104, the data source 1104 has the higher utilization degree. This indicates that the data source 1104 is the more effective of the two when used for data analysis.
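As a worked check of the FIG. 11 figures, the reported inputs can be substituted into two possible groupings of equation (4), with α, β, and γ set to 1. Both groupings are assumptions about how the equation is meant to be read, and the function names are hypothetical. With these inputs, the additive grouping yields 0.75 and about 72.4, which appears to match the reported 0.75 and 72.3 to within rounding, while the multiplicative grouping yields 0.49 and about 1314.4. Under either grouping, the data source 1104 scores higher than the data source 1102.

```python
def grouping_product(change, freq, std, a=1.0, b=1.0, g=1.0):
    """Assumed reading: (a * change / (b * freq) + g) * std."""
    return (a * change / (b * freq) + g) * std


def grouping_sum(change, freq, std, a=1.0, b=1.0, g=1.0):
    """Assumed reading: a * change / (b * freq) + g * std."""
    return a * change / (b * freq) + g * std


# Inputs and reported utilization degrees as stated for FIG. 11.
SOURCES = {
    "Asset1:Sensor2": {"change": 0.2, "freq": 0.5, "std": 0.35, "reported": 0.75},
    "Asset3:Sensor5": {"change": 20.7, "freq": 0.5, "std": 31.0, "reported": 72.3},
}

for name, s in SOURCES.items():
    args = (s["change"], s["freq"], s["std"])
    print(name,
          "product:", round(grouping_product(*args), 2),
          "sum:", round(grouping_sum(*args), 2),
          "reported:", s["reported"])
```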

Although the data of the data source is described as time series data in the example of FIG. 11, the present invention is not limited to this; the data need not be time series data as long as it consists of key and value pairs. In that case the acquisition frequency cannot be calculated, but for data whose key values match, equivalent information can be obtained by calculating the difference between key values. The target section may be specified as a range of key values. For example, if the key values are sequential numbers, the target section may be specified as a range of those numbers.

FIG. 12 is a flowchart of the action management process according to the first embodiment.

The action management process is executed by the action management unit 204. It may be performed periodically, for example.

The action management unit 204 acquires action definition information (step 2041). The information acquired as action definition information is, for example, the information held in the action definition table 123 shown in FIG. 7 (name, condition, action content, determination timing, and so on). The action definition information may be acquired, for example, by the action management unit 204 reading a predetermined definition file, or from user input through a UI (User Interface) that it provides. If the action definition table 123 has been registered in advance, the information need not be acquired.

Next, the action management unit 204 updates the action definition table 123 based on the action definition information acquired in step 2041 (step 2042).

FIG. 13 is a flowchart of the action execution process according to the first embodiment.

The action execution process is executed by the action execution unit 205. It may be performed periodically, for example.

The action execution unit 205 checks the determination timing of the action definitions (step 2051). Here, the action execution unit 205 acquires the determination timing of each action stored in the determination timing column 705 of the action definition table 123.

Next, the action execution unit 205 determines whether the determination timing acquired in step 2051 has been reached (step 2052). If entries for multiple actions are registered in the action definition table 123, the determination in step 2052 is made for each action. If execution of an action is disabled (for example, if "invalid" is set in the determination timing column 705 of the entry), the action execution unit 205 does not perform steps 2051 and 2052 for that action.

If the determination in step 2052 finds that the determination timing has not been reached (step 2052: No), the action execution unit 205 returns the process to step 2051.

If the determination timing has been reached (step 2052: Yes), the action execution unit 205 checks the execution condition of the action definition whose determination timing has been reached (step 2053). Here, the action execution unit 205 acquires the setting in the condition column 703 of the action definition table 123.

Next, the action execution unit 205 determines whether the condition acquired in step 2053 is satisfied (step 2054). For this determination, the action execution unit 205 refers to the contents of the metadata 152, the utilization degree data 151, and the catalog data 153 of each data source, as well as the internal status of the server 110 and the data store apparatus 130 (such as system resource usage). For conditions on the acquired data of a data source itself, for example when a condition such as "the latest value of Asset1:Sensor3 is 30 or more" is recorded as the condition of an action, the action execution unit 205 makes the determination by referring to the latest value of that data source. When the acquired condition includes a qualifier such as "latest target section", the action execution unit 205 uses the information for the specified target section in the utilization degree data 151 of the corresponding data source to evaluate the condition.

If the determination in step 2054 finds that the execution condition is not satisfied (step 2054: No), the action execution unit 205 ends the process without executing the action. If the execution condition is satisfied (step 2054: Yes), the action execution unit 205 executes the action set in the action content column 704 of that action definition (step 2055). When executing an action, the action execution unit 205 may, for example, call an API of an external system.
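One pass through steps 2051 to 2055 can be sketched as a loop over the action definitions. The dictionary layout, the "disabled" timing marker, and the callables `is_due` and `execute` are hypothetical stand-ins for the determination timing check, the condition check, and the action itself.

```python
def run_actions(definitions, is_due, stats, execute):
    """Evaluate each action definition once and execute those that qualify.

    definitions: iterable of dicts with keys "timing", "condition", "action"
    is_due:      callable(timing) -> bool, the timing check of step 2052
    stats:       values against which the conditions are evaluated (step 2054)
    execute:     callable(action), invoked for qualifying actions (step 2055)
    """
    executed = []
    for d in definitions:
        if d["timing"] == "disabled":    # disabled actions are not checked at all
            continue
        if not is_due(d["timing"]):      # step 2052: No
            continue
        if not d["condition"](stats):    # step 2054: No
            continue
        execute(d["action"])             # step 2055
        executed.append(d["action"])
    return executed


definitions = [
    {"timing": "daily", "condition": lambda s: s["utilization"] <= 10, "action": "archive"},
    {"timing": "disabled", "condition": lambda s: True, "action": "never"},
    {"timing": "monthly", "condition": lambda s: True, "action": "later"},
]
log = []
print(run_actions(definitions, lambda t: t == "daily", {"utilization": 5}, log.append))
```

Passing `execute` as a callable leaves room for the action to be a local operation or a call to an external system's API, as the text allows.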

As described above, in the action execution process, the action corresponding to a condition is executed when the action condition, which includes a condition on the utilization degree, is satisfied. Data sources can therefore be managed appropriately according to their utilization degree.

FIG. 14 is a flowchart of the catalog management process according to the first embodiment.

The catalog management process is executed by the catalog management unit 206. It may be performed periodically, for example.

The catalog management unit 206 acquires a catalog definition (step 2061). The catalog definition may be acquired, for example, by reading it from a catalog definition file or from user input through a UI. The information acquired as the catalog definition includes, for example, the "catalog id", the data sources and files in the "data list", the "description", the "creator", and the "creator role" contained in the catalog data 153 shown in FIG. 6. For the "creator", a function of the data management program 121 works together with a user management function such as a directory service (not shown) to obtain the id of the user who created the catalog. For the "creator role", information on the user's role held in the directory service is obtained. "Creation date", "update date", and "last access date" are obtained as the times at which the catalog is created, updated, and used, respectively.

Next, based on the catalog definition acquired in step 2061, the catalog management unit 206 adds a new catalog to the catalog data 153 or updates an existing catalog (step 2062). When a catalog is newly added, the catalog management unit 206 sets the "evaluation" in the catalog data 153 to a default value (for example, the midpoint 3 of a range from a minimum of 1 to a maximum of 5).

次いで、カタログ管理部２０６は、評価補正値を計算する（ステップ２０６３）。ここで、カタログ管理部２０６は、評価補正値を、例えば、該当カタログのカタログデータ１５３に基づき算出する。例えば、「管理者ロール」に対する補正値の対応表をカタログ管理部２０６が保持し、カタログ管理部２０６がその対応表に基づいて、「データサイエンティスト」であれば補正値を＋１などとする。 Next, the catalog management unit 206 calculates an evaluation correction value (step 2063). Here, the catalog management unit 206 calculates the evaluation correction value based on, for example, the catalog data 153 of the corresponding catalog. For example, the catalog management unit 206 holds a correspondence table of correction values for each “administrator role” and, based on that table, sets the correction value to +1 or the like when the role is “data scientist”.

次いで、カタログ管理部２０６は、該当カタログの「評価」と、ステップ２０６３で算出した評価補正値に基づき、該当カタログに属するデータソース群のメタデータ１５２に保持された活用度を更新する（ステップ２０６４）。本実施例では、カタログ管理部２０６は、カタログデータ１５３の「評価」の値に評価補正値を加算し、この結果を評価のデフォルト値（例えば、３）で割ったものを、該当カタログに属するデータソースのメタデータ１５２の「活用度」の値に掛け合わせたものを、メタデータ１５２における新たな「活用度」として更新する。これにより、データソースのメタデータ１５２の活用度を、そのデータソースが属するカタログの評価値を反映された活用度に更新することができる。 Next, the catalog management unit 206 updates the utilization degree held in the metadata 152 of each data source belonging to the corresponding catalog, based on the “evaluation” of the catalog and the evaluation correction value calculated in step 2063 (step 2064). In the present embodiment, the catalog management unit 206 adds the evaluation correction value to the “evaluation” value of the catalog data 153, divides the sum by the default evaluation value (for example, 3), multiplies the “utilization degree” value in the metadata 152 of each data source belonging to the catalog by this quotient, and stores the product as the new “utilization degree” in the metadata 152. In this way, the utilization degree in the metadata 152 of a data source is updated to a value that reflects the evaluation of the catalog to which the data source belongs.
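The correction in steps 2063 and 2064 can be illustrated with a minimal Python sketch. The function and table names are hypothetical; only the "data scientist" correction of +1, the default evaluation of 3, and the formula (evaluation + correction) / default × utilization come from the text above. Treating roles missing from the table as a correction of 0 is an assumption.

```python
# Hypothetical correspondence table of correction values per role (step 2063).
# Only the "data scientist" -> +1 entry is taken from the description.
ROLE_CORRECTION = {"data scientist": 1.0}


def evaluation_correction(role: str) -> float:
    """Return the evaluation correction value for a role (0 if unlisted: an assumption)."""
    return ROLE_CORRECTION.get(role, 0.0)


def corrected_utilization(utilization: float, evaluation: float,
                          correction: float,
                          default_evaluation: float = 3.0) -> float:
    """Step 2064: scale a data source's utilization by its catalog's evaluation."""
    if default_evaluation <= 0:
        raise ValueError("default_evaluation must be positive")
    return utilization * (evaluation + correction) / default_evaluation


# A catalog rated 4 whose role is "data scientist": 60 * (4 + 1) / 3
print(corrected_utilization(60.0, 4.0, evaluation_correction("data scientist")))
```

With the default evaluation of 3 and no correction, the factor is 1 and the utilization degree is left unchanged, which is why a newly added catalog does not alter its data sources' utilization degrees.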

図１５は、実施例１に係るデータソース検索画面の一例を示す図である。 FIG. 15 is a diagram illustrating an example of a data source search screen according to the first embodiment.

データソース検索画面１５０１は、データ検索部２０７によって提供されるユーザ向けのデータソース検索ＵＩである。データソース検索画面１５０１は、ブラウザやクライアントアプリケーションまたはモバイル・タブレット向けアプリケーションなどで表示される。データ検索部２０７は、ＷＥＢサーバやアプリケーションサーバとして稼動する。 The data source search screen 1501 is a data source search UI for users provided by the data search unit 207. The data source search screen 1501 is displayed by a browser, a client application, a mobile tablet application, or the like. The data search unit 207 operates as a web server or an application server.

データソース検索画面１５０１は、検索キー入力領域１５０２と、検索ボタン１５０３と、詳細検索オプションボタン１５０４と、候補表示ボックス１５０５と、閉じるボタン１５０６とを含む。 The data source search screen 1501 includes a search key input area 1502, a search button 1503, a detail search option button 1504, a candidate display box 1505, and a close button 1506.

検索キー入力領域１５０２は、データソースを検索するためのキーワードが入力可能な領域である。検索ボタン１５０３は、ユーザが検索を指示するためのボタンであり、検索ボタン１５０３が押下されると、検索キー入力領域１５０２に入力されたキーワードに基づいてデータソースの検索が行われ、検索結果（候補のデータソース）が候補表示ボックス１５０５に表示される。詳細検索オプションボタン１５０４は、押下されると、検索における詳細条件を選択するためのオプションが表示される。 The search key input area 1502 is an area in which keywords for searching for data sources can be entered. The search button 1503 is a button with which the user instructs a search. When the search button 1503 is pressed, data sources are searched based on the keyword entered in the search key input area 1502, and the search results (candidate data sources) are displayed in the candidate display box 1505. When the detail search option button 1504 is pressed, options for selecting detailed search conditions are displayed.

候補表示ボックス１５０５は、検索結果を表示する領域である。本実施形態では、候補表示ボックス１５０５には、例えば、候補となるデータソースのデータソースｉｄと、このデータソースに関連する情報（例えば、タグ）と、活用度と、詳細が表示される。関連する情報については、このデータソースのメタデータ１５２から取得することができる。本実施形態では、データ検索部２０７は、複数の候補のデータソースを表示する場合には、例えば、活用度により降順となるようにソートして表示させている。候補表示ボックス１５０５の詳細が選択されると、対応するデータソースのより詳細な情報が含まれている、このデータソースを取得するための画面が表示される。 The candidate display box 1505 is an area for displaying a search result. In the present embodiment, the candidate display box 1505 displays, for example, a data source id of a candidate data source, information (for example, a tag) related to the data source, the utilization degree, and details. Relevant information can be obtained from the metadata 152 of this data source. In the present embodiment, when displaying a plurality of candidate data sources, for example, the data search unit 207 sorts and displays the data sources in descending order according to the degree of utilization. When the details of the candidate display box 1505 are selected, a screen for acquiring this data source is displayed, which contains more detailed information of the corresponding data source.
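The ordering of candidates in the candidate display box 1505 can be sketched as follows. This is an illustrative Python sketch, not the data search unit 207 itself: the `Candidate` class, the case-insensitive substring matching against ids and tags, and the sample values are all assumptions; only the descending sort by utilization degree comes from the description.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Candidate:
    """One row of the candidate display box (fields mirror the description)."""
    data_source_id: str
    tags: List[str] = field(default_factory=list)
    utilization: float = 0.0


def search(sources: List[Candidate], keyword: str) -> List[Candidate]:
    """Return matching data sources sorted by utilization degree, highest first."""
    key = keyword.lower()
    hits = [s for s in sources
            if key in s.data_source_id.lower()
            or any(key in t.lower() for t in s.tags)]
    return sorted(hits, key=lambda s: s.utilization, reverse=True)


# Hypothetical sample data.
sources = [
    Candidate("Asset1:Sensor1", ["temperature"], 40.0),
    Candidate("Asset1:Sensor2", ["temperature"], 85.0),
    Candidate("Asset2:Camera1", ["video"], 99.0),
]
for hit in search(sources, "temperature"):
    print(hit.data_source_id, hit.utilization)
```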

閉じるボタン１５０６は、データソース検索画面１５０１を閉じるためのボタンであり、閉じるボタン１５０６が押下されると、データ検索部２０７は、データソース検索画面１５０１を閉じる。 The close button 1506 is a button for closing the data source search screen 1501. When the close button 1506 is pressed, the data search unit 207 closes the data source search screen 1501.

図１６は、実施例１に係るカタログ評価画面の一例を示す図である。 FIG. 16 is a diagram illustrating an example of a catalog evaluation screen according to the first embodiment.

カタログ評価画面１６０１は、カタログ管理部２０６により表示される、カタログを利用したユーザに対して評価の入力を要求する画面である。カタログ評価画面１６０１は、ブラウザやクライアントアプリケーションまたはモバイル・タブレット向けアプリケーションなどで表示される。なお、図１５に示したデータソースの検索と同様に、カタログをキーワード検索してカタログに関する情報を表示させるカタログ検索画面（図示せず）が用意されており、カタログ管理部２０６は、このカタログ検索画面でカタログが選択された場合に、カタログ評価画面１６０１を表示してもよく、或いは、使用したデータソースがカタログに属している場合に、そのカタログについてのカタログ評価画面１６０１を表示してもよい。 The catalog evaluation screen 1601 is a screen displayed by the catalog management unit 206 that asks a user who has used a catalog to enter an evaluation. The catalog evaluation screen 1601 is displayed in a browser, a client application, an application for mobile devices or tablets, or the like. As with the data source search shown in FIG. 15, a catalog search screen (not shown) is provided for searching catalogs by keyword and displaying information about them. The catalog management unit 206 may display the catalog evaluation screen 1601 when a catalog is selected on this catalog search screen, or, when a data source that has been used belongs to a catalog, may display the catalog evaluation screen 1601 for that catalog.

カタログ評価画面１６０１は、カタログ情報表示領域１６０２と、データ内容ボックス１６０３と、評価設定領域１６０４と、終了ボタン１６０５とを含む。 Catalog evaluation screen 1601 includes a catalog information display area 1602, a data content box 1603, an evaluation setting area 1604, and an end button 1605.

カタログ情報表示領域１６０２には、ユーザが利用したカタログの情報が表示される。カタログの情報は、このカタログのカタログデータ１５３の内容に基づいて表示される。データ内容ボックス１６０３には、このカタログに属するデータソースの一覧が表示される。データソースの一覧には、例えば、各データソースのデータソースｉｄと、このデータソースに関連する情報（例えば、タグ）と、活用度と、詳細が表示される。データソースに関連する情報については、このデータソースのメタデータ１５２から取得することができる。 The catalog information display area 1602 displays information of the catalog used by the user. Catalog information is displayed based on the contents of catalog data 153 of this catalog. A data content box 1603 displays a list of data sources belonging to this catalog. The data source list displays, for example, the data source id of each data source, information (for example, a tag) related to the data source, the degree of utilization, and details. Information related to the data source can be obtained from the metadata 152 of this data source.

評価設定領域１６０４は、このカタログの評価を設定するための領域であり、例えば、ユーザが選択可能な５つの星形のボタンが表示されている。ユーザは、カタログの内容に応じて、選択する星形ボタンの数を変えることにより、カタログを５段階評価することができる。終了ボタン１６０５は、カタログの評価を終了するためのボタンであり、終了ボタン１６０５が押下されると、カタログ管理部２０６は、評価設定領域１６０４において評価した結果に基づいて、カタログデータ１５３の評価を更新する。 The evaluation setting area 1604 is an area for setting the evaluation of this catalog and displays, for example, five star-shaped buttons that the user can select. By changing the number of star buttons selected according to the contents of the catalog, the user can rate the catalog on a five-level scale. The end button 1605 is a button for ending the evaluation of the catalog. When the end button 1605 is pressed, the catalog management unit 206 updates the evaluation in the catalog data 153 based on the rating entered in the evaluation setting area 1604.

このカタログ評価画面１６０１によると、ユーザは、カタログを容易に評価することができる。また、カタログを評価することにより、データの活用度を適切に評価することができる。 According to the catalog evaluation screen 1601, the user can easily evaluate the catalog. Also, by evaluating the catalog, it is possible to properly evaluate the utilization of data.

次に、実施例２に係る計算機システムについて説明する。なお、実施例２の説明においては、実施例１に係る計算機システムと異なる点を中心に説明する。 Next, a computer system according to the second embodiment will be described. In the description of the second embodiment, differences from the computer system according to the first embodiment will be mainly described.

実施例２に係る計算機システム１０ａは、データストア装置１３０にデータソース間の関係を管理する関係データ１５４を新たに保持し、これを活用することで、関連性の高いデータソースの特定と、それに基づくデータ管理を実現するようにしたものである。これにより、例えば、データの内容が一致する複数のデータソースを特定することができる。このため、そのなかのいずれかのデータを選択的に蓄積するようにし、残りのデータはアーカイブストレージなどに移動するといった運用が可能となり、重複したデータによる無駄な記憶領域の使用を低減することができる。 The computer system 10a according to the second embodiment additionally holds, in the data store apparatus 130, relationship data 154 for managing the relationships between data sources, and uses this data to identify highly related data sources and to manage data on that basis. This makes it possible, for example, to identify multiple data sources whose data contents match. The system can then be operated so that the data of one of those data sources is selectively retained while the remaining data is moved to archive storage or the like, which reduces the storage space wasted on duplicate data.

図１７は、実施例２に係る計算機システムの全体構成図である。なお、実施例１に係る計算機システムと同様な構成については同一の符号を付している。 FIG. 17 is an entire configuration diagram of a computer system according to a second embodiment. The same components as those of the computer system according to the first embodiment are denoted by the same reference numerals.

実施例２に係る計算機システム１０ａにおいては、データストア装置１３０の管理データ１５０は、更に、関係データ１５４を記憶する。 In the computer system 10 a according to the second embodiment, the management data 150 of the data storage device 130 further stores relationship data 154.

図１８は、実施例２に係る関係データに格納された内容を示す図である。 FIG. 18 is a diagram showing the contents stored in the related data according to the second embodiment.

関係データ１５４は、一般的にグラフ構造データベースなどを利用して管理され、このデータベースがサポートするデータ形式の情報であるが、同図では、関係データ１５４の情報が示す内容を、理解を容易にするためにグラフ構造そのものとして図示している。すなわち、同図におけるグラフ構造に対応する内容が関係データ１５４に格納されていることとなる。 The relationship data 154 is generally managed using a graph database or the like and is held in a data format supported by that database. In the figure, however, the contents represented by the relationship data 154 are drawn as the graph structure itself for ease of understanding. In other words, the contents corresponding to the graph structure in the figure are what is stored in the relationship data 154.

グラフのノード１８０１，１８０２，１８０３は、それぞれデータソースｉｄを保持している。これらのノード間をつなぐエッジ１８１１，１８１２，１８１３に対して、それぞれのノード間の関連度を示す重み１８２１，１８２２，１８２３が対応付けられている。このノード間の関連度は、本実施形態では、例えば、１．０に近ければ、そのノード間の関連度が高いことを示している。ノード間の関連度の算出については後述する。 The nodes 1801, 1802 and 1803 of the graph respectively hold data source ids. With respect to edges 1811, 1812 and 1813 connecting these nodes, weights 1821, 1822 and 1823 indicating the degree of association between the nodes are associated. The degree of association between nodes indicates that the degree of association between the nodes is high, for example, if it is close to 1.0 in this embodiment. The calculation of the degree of association between nodes will be described later.
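As a rough illustration of the graph structure described above, the following Python sketch stores data source ids as nodes and the degree of association as an undirected edge weight. The class name, its methods, the third sensor id, and the sample weights are hypothetical; a real deployment would, per the description, more likely use a graph database.

```python
class RelationGraph:
    """Undirected weighted graph: nodes are data source ids, weights are degrees of association."""

    def __init__(self):
        self._weights = {}  # frozenset({id_a, id_b}) -> weight

    def set_association(self, a: str, b: str, weight: float) -> None:
        if a == b:
            raise ValueError("an edge must connect two different data sources")
        self._weights[frozenset((a, b))] = weight

    def association(self, a: str, b: str) -> float:
        """Return the stored weight, or 0.0 when no edge exists (an assumption)."""
        return self._weights.get(frozenset((a, b)), 0.0)

    def related(self, source_id: str, threshold: float):
        """List neighbours whose degree of association is at least `threshold`."""
        result = []
        for pair, weight in self._weights.items():
            if source_id in pair and weight >= threshold:
                (other,) = pair - {source_id}
                result.append(other)
        return sorted(result)


graph = RelationGraph()
graph.set_association("Asset1:Sensor1", "Asset1:Sensor2", 1.0)  # sample weights
graph.set_association("Asset1:Sensor1", "Asset1:Sensor3", 0.5)
print(graph.related("Asset1:Sensor1", 0.9))
```

Keying edges by a `frozenset` of the two ids makes lookups independent of argument order, which matches the undirected edges of the figure.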

図１９は、実施例２に係る関連度計算を説明する図である。 FIG. 19 is a diagram for explaining calculation of the degree of association according to the second embodiment.

この関連度計算は、活用度計算部２０２が、図１０に示す活用度算出処理のステップ２０２４において、追加の処理として実行するものである。図１９は、データソース１９０１（データソースAsset1:Sensor1）と、データソース１９０３（データソースAsset1:Sensor2）との時系列データに対して関連度を算出した例となっている。本例では、時系列データとして、時刻と、その時刻における値（例えば、センサーの測定値）との列のデータとしている。 The degree-of-association calculation is executed by the degree-of-use calculation unit 202 as an additional process in step 2024 of the degree-of-use calculation process shown in FIG. FIG. 19 is an example in which the degree of association is calculated for time-series data of the data source 1901 (data source Asset1: Sensor1) and the data source 1903 (data source Asset1: Sensor2). In this example, as time series data, data of a row of time and a value at that time (for example, a measurement value of a sensor) is used.

活用度計算部２０２は、活用度の計算における関連情報として、標準偏差、変動率などを算出し、さらに活用度を算出して活用度データ１５１に追加する。これに加えて本実施例では、活用度計算部２０２は、活用度データ１５１に保持された情報に基づいて、２つのデータソースの関連度を算出する。具体的には、活用度計算部２０２は、活用度データ１５１から、標準偏差、変動率などを、比較のための「指標」として取得し、関連度を式（５）により計算する。 The utilization degree calculation unit 202 calculates a standard deviation, a fluctuation rate, and the like as related information in the calculation of the utilization degree, and further calculates the utilization degree and adds it to the utilization degree data 151. In addition to this, in the present embodiment, the utilization calculation unit 202 calculates the degree of association between the two data sources based on the information held in the utilization data 151. Specifically, the utilization degree calculation unit 202 acquires a standard deviation, a fluctuation rate, and the like from the utilization degree data 151 as an “index” for comparison, and calculates the degree of association using Expression (5).

関連度＝Σ各指標の重み×各指標の一致有無／比較する指標の総数・・・（５） Degree of association = Σ (weight of each index × match or mismatch of each index) / (total number of indices compared) ... (5)

ここで、式（５）の「各指標の重み」は、指標の重要度に基づいて予め設定してもよいし、ユーザが定義するようにしてもよい。なお、関連度を算出する方法はこれに限られない。 Here, the “weight of each index” of the equation (5) may be set in advance based on the importance of the index, or may be defined by the user. The method of calculating the degree of association is not limited to this.

図１９に示すデータソース１９０１と、データソース１９０３とに対する指標１９０２，１９０４は、標準偏差が３１となり、変動率が２０．７となり、それぞれの指標が一致している。 The indices 1902 and 1904 for the data source 1901 and the data source 1903 shown in FIG. 19 have a standard deviation of 31, a variation rate of 20.7, and the respective indices match.

このとき、各指標の重みを１とした場合には、式（５）により算出される関連度は、１．０となる。なお、関連度を算出するための指標としては、活用度も利用可能である。使用できる他の指標としては、図４に示す活用度データ１５１のカラム４０３〜４０９の情報があり、さらに関連度の算出に特化すれば、ハッシュ値を特定タイミングと対象区間とについて算出して使用することができる。 At this time, when the weight of each index is 1, the degree of association calculated by equation (5) is 1.0. The degree of utilization can also be used as an index for calculating the degree of association. As other indexes that can be used, there is information in columns 403 to 409 of utilization level data 151 shown in FIG. 4, and if specialized for calculation of the degree of association, the hash value is calculated for the specific timing and the target section. It can be used.
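Expression (5) and the worked example of FIG. 19 can be reproduced with a short Python sketch. The function name and dictionary keys are hypothetical. Two points are assumptions not stated in the description: a match contributes 1 and a mismatch 0, and numeric indices are considered to match when they are equal within a small floating-point tolerance.

```python
import math


def association_degree(indices_a, indices_b, weights=None, rel_tol=1e-9):
    """Expression (5): sum(weight * match) / number of indices compared.

    `match` is 1.0 when the two values agree (within `rel_tol`) and 0.0 otherwise.
    Indices missing from `weights` default to a weight of 1.0.
    """
    weights = weights or {}
    compared = sorted(set(indices_a) & set(indices_b))
    if not compared:
        return 0.0
    total = 0.0
    for name in compared:
        matched = math.isclose(indices_a[name], indices_b[name], rel_tol=rel_tol)
        total += weights.get(name, 1.0) * (1.0 if matched else 0.0)
    return total / len(compared)


# Values from the FIG. 19 example: both indices match and every weight is 1.
sensor1 = {"standard_deviation": 31.0, "variation_rate": 20.7}
sensor2 = {"standard_deviation": 31.0, "variation_rate": 20.7}
print(association_degree(sensor1, sensor2))  # 1.0
```

If only the standard deviations matched, the same call would return 0.5, since one of the two compared indices would contribute nothing to the sum.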

例えば、アクション定義テーブル１２３のアクション内容を、関連度が高い２つ以上のデータソースの組み合わせに対しては、１つのデータソースのオブジェクトをデータストア１４０に残し、他のデータソースのオブジェクトを、外部のアーカイブやバックアップ用ストレージ１６０に移動するように設定することで、このアクションが実行されるとデータ量削減など効率的なデータ管理を実現することができる。 For example, the action content in the action definition table 123 can be set so that, for a combination of two or more data sources with a high degree of association, the objects of one data source are left in the data store 140 and the objects of the other data sources are moved to an external archive or to the backup storage 160. When this action is executed, efficient data management, such as a reduction in data volume, can be achieved.
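One way to plan such an action is sketched below in Python. The function name, the threshold of 0.9, the rule of keeping the lexicographically smallest id in each group, and the grouping by union-find are all assumptions made for illustration; the description only states that one data source is kept and the others are moved.

```python
def plan_archive(pair_weights, threshold=0.9):
    """Group data sources linked by a degree of association >= threshold.

    Returns {kept_id: [ids to move to archive]} for each group.
    The kept id is the smallest id in its group (an arbitrary choice).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), weight in pair_weights.items():
        if weight >= threshold:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                # Always keep the smaller id as the root of the merged group.
                parent[max(root_a, root_b)] = min(root_a, root_b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), []).append(node)
    return {root: sorted(m for m in members if m != root)
            for root, members in groups.items()}


# Hypothetical degrees of association between pairs of data sources.
pairs = {
    ("Asset1:Sensor1", "Asset1:Sensor2"): 1.0,
    ("Asset1:Sensor1", "Asset1:Sensor3"): 0.5,
}
print(plan_archive(pairs))
```

Data sources that appear only in low-association pairs are left out of the plan entirely, so they stay in the data store untouched.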

次に、実施例３に係る計算機システム１０ｂについて説明する。なお、実施例３の説明においては、実施例１に係る計算機システム１０と異なる点を中心に説明する。 Next, a computer system 10b according to a third embodiment will be described. In the description of the third embodiment, differences from the computer system 10 according to the first embodiment will be mainly described.

実施例３に係る計算機システム１０ｂは、ユーザおよびユーザが所属するグループ毎に、活用度の計算方法を変更可能とする実施例である。これにより、データソースの検索、およびデータ管理を実施する場合に、ユーザやグループのそれぞれの活用度を用いることができ、ユーザやグループに則したデータソースを検索することができる。 The computer system 10b according to the third embodiment is an embodiment in which the method of calculating the degree of utilization can be changed for each user and the group to which the user belongs. Thus, when performing search of data sources and data management, it is possible to use the degree of utilization of each user or group, and it is possible to search data sources according to the user or group.

図２０は、実施例３に係る計算機システムの全体構成図である。なお、実施例１に係る計算機システムと同様な構成については同一の符号を付している。 FIG. 20 is an entire configuration diagram of a computer system according to a third embodiment. The same components as those of the computer system according to the first embodiment are denoted by the same reference numerals.

実施例３に係る計算機システム１０ｂにおいては、サーバ１１０のメモリ１２０は、更に、ユーザ管理テーブル１２４を記憶する。メモリ１２０は、計算方法記憶部の一例である。また、計算機システム１０ｂでは、各グループごとに、各データソースごとの活用度データ１５１を記憶する。また、計算機システム１０ｂは、メタデータ１５２に代えてメタデータ１５２ａ（図２２参照）を記憶する。 In the computer system 10b according to the third embodiment, the memory 120 of the server 110 further stores a user management table 124. The memory 120 is an example of a calculation method storage unit. In addition, in the computer system 10b, utilization degree data 151 for each data source is stored for each group. Also, the computer system 10 b stores metadata 152 a (see FIG. 22) instead of the metadata 152.

図２１は、実施例３に係るユーザ管理テーブルの構成図である。 FIG. 21 is a configuration diagram of a user management table according to the third embodiment.

ユーザ管理テーブル１２４は、ユーザの情報を管理するテーブルであり、各ユーザに対応するエントリを格納する。ユーザ管理テーブル１２４のエントリは、ユーザｉｄカラム２１０１と、Ｎａｍｅカラム２１０２と、ロールカラム２１０３と、グループｉｄカラム２１１１と、活用度計算方法カラム２１１２と、説明カラム２１１３とを含む。 The user management table 124 is a table for managing user information and stores an entry for each user. An entry of the user management table 124 includes a user id column 2101, a Name column 2102, a role column 2103, a group id column 2111, a utilization calculation method column 2112, and a description column 2113.

ユーザｉｄカラム２１０１には、エントリに対応するユーザを識別するユーザｉｄが格納される。Ｎａｍｅカラム２１０２には、エントリに対応するユーザの氏名等が格納される。ロールカラム２１０３には、エントリに対応するユーザのロール（役割）が格納される。ロールカラム２１０３の内容は、実施例１の図６に示すカタログデータ１５３の「作成者ロール」に追記するための記述として利用できる。 The user id column 2101 stores a user id that identifies the user corresponding to the entry. The Name column 2102 stores the name and the like of the user corresponding to the entry. The role column 2103 stores the role of the user corresponding to the entry. The contents of the role column 2103 can be used as the description to be added to the “creator role” of the catalog data 153 shown in FIG. 6 of the first embodiment.

グループｉｄカラム２１１１には、エントリに対応するユーザが属するグループのｉｄ（グループｉｄ）が格納される。グループｉｄとしては、例えば、数値による識別子と、グループの記述とを組み合わせたものとしており、例えば、数値の識別子が「001」、グループの記述が「データサイエンスチームA」の場合には、グループｉｄは、「001:データサイエンスチームA」としている。なお、グループの識別ができれば、グループｉｄは、数値の識別子と、記述とのいずれかの情報のみでもよい。活用度計算方法カラム２１１２には、エントリに対応するユーザが属するグループにおける、活用度の計算方法の情報（計算方法情報）、例えば、活用度の計算式が格納されている。説明カラム２１１３には、エントリに対応するユーザが属するグループにおける活用度の計算方法に関する説明が格納される。 The group id column 2111 stores the id (group id) of the group to which the user corresponding to the entry belongs. The group id is, for example, a combination of a numeric identifier and a description of the group. For example, when the numeric identifier is "001" and the group description is "data science team A", the group id is "001: Data Science Team A". In addition, as long as the group can be identified, the group id may be only information of either a numerical identifier or a description. The utilization degree calculation method column 2112 stores information (calculation method information) of the utilization degree calculation method in the group to which the user corresponding to the entry belongs, for example, a calculation formula of the utilization degree. The description column 2113 stores a description of a method of calculating the degree of utilization of the group to which the user corresponding to the entry belongs.

なお、ユーザ管理テーブル１２４のすべてのカラムは必ずしも必須ではなく、例えば、Ｎａｍｅカラム２１０２、ロールカラム２１０３、及び説明カラム２１１３については、理解を促すために本実施例の説明として例示しているものであって必ずしも必要ではない。 Note that not all columns of the user management table 124 are essential. For example, the Name column 2102, the role column 2103, and the description column 2113 are shown in this embodiment only to aid understanding and are not necessarily required.

ユーザ管理テーブル１２４は、例えば、データ管理プログラム１２１が提供するユーザ向けＵＩにおける、定義ファイルなどからの取り込み操作や、ユーザからの入力操作等に従って、作成及び変更されてもよい。また、ユーザｉｄカラム２１０１、Ｎａｍｅカラム２１０２、ロールカラム２１０３、及びグループｉｄカラム２１１１に設定される情報は、外部のディレクトリサービス等から取得してもよい。 The user management table 124 may be created and modified, for example, through an import from a definition file or the like, or through user input, in the user-facing UI provided by the data management program 121. The information set in the user id column 2101, the Name column 2102, the role column 2103, and the group id column 2111 may be acquired from an external directory service or the like.

本実施例に係る活用度計算部２０２による活用度計算処理は、実施例１に係る活用度計算部２０２による図１０に示す活用度計算処理と以下の点が異なる。実施例３に係る活用度計算部２０２は、ユーザ管理テーブル１２４の活用度計算方法カラム２１１２に記載された活用度の計算方法に基づき、活用度の計算を実施する。活用度計算部２０２は、図１０に示す活用度計算処理のステップ２０２４において、ユーザ管理テーブル１２４に記載された活用度計算方法カラム２１１２の活用度計算方法の一部または全部の方法にて活用度を算出する。算出したすべての計算方法の活用度は、エントリに対応するグループのグループｉｄに対応する活用度データ１５１に格納する。 The utilization degree calculation process performed by the utilization degree calculation unit 202 in this embodiment differs from the process of the first embodiment shown in FIG. 10 in the following respects. The utilization degree calculation unit 202 according to the third embodiment calculates the utilization degree according to the calculation method described in the utilization calculation method column 2112 of the user management table 124. In step 2024 of the utilization degree calculation process shown in FIG. 10, the utilization degree calculation unit 202 calculates the utilization degree using some or all of the calculation methods listed in the utilization calculation method column 2112 of the user management table 124. The utilization degrees calculated by each of these methods are stored in the utilization degree data 151 corresponding to the group id of the group associated with the entry.

また、活用度計算部２０２は、図１０に示す活用度計算処理のステップ２０２５において、グループごとの活用度（例えば、そのグループにおける平均の活用度）をグループｉｄとともに、データソースのメタデータ１５２ａに追加・更新する。なお、活用度を一部の計算方法のみにより算出するか否かについては、例えば、ユーザ管理テーブル１２４などに活用度の計算要否を示すデータを追加しておき、そのデータに基づいて判断するようにしてもよく、また、一部の計算方法のみを行う対象とするデータソースのｉｄの指定を予め受け付けておき、対象のデータソースｉｄに基づいて判断するようにしてもよい。 In addition, in step 2025 of the utilization degree calculation process shown in FIG. 10, the utilization degree calculation unit 202 adds or updates the utilization degree for each group (for example, the average utilization degree within that group), together with the group id, in the metadata 152a of the data source. Whether the utilization degree is calculated by only some of the calculation methods may be determined, for example, from data indicating whether calculation is required, added beforehand to the user management table 124 or the like. Alternatively, the ids of the data sources to which only some of the calculation methods apply may be accepted in advance, and the determination may be made based on those data source ids.

なお、ユーザ管理テーブル１２４の活用度計算方法カラム２１１２において、活用度計算方法に、例えば、「変動率」、「取得頻度」、「標準偏差」などの関連情報が記述されている場合には、活用度計算部２０２は、図１０に示す活用度計算処理のステップ２０２４において、活用度計算方法を実行する前に、活用度計算方法に使用する各関連情報を計算する。 In the utilization degree calculation method column 2112 of the user management table 124, when related information such as “variation rate”, “acquisition frequency”, and “standard deviation” is described in the utilization degree calculation method, for example, The utilization calculation unit 202 calculates each piece of related information to be used for the utilization calculation method before executing the utilization calculation method in step 2024 of the utilization calculation process shown in FIG.
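The per-group calculation can be sketched in Python as a mapping from group id to a calculation method applied to precomputed related information. Everything concrete here is hypothetical: the two formulas are invented for illustration (the description does not give the contents of column 2112), and the methods are modeled as Python callables rather than stored formula text.

```python
# Hypothetical calculation methods keyed by group id. The formulas are
# illustrative only and are not taken from the description.
CALCULATION_METHODS = {
    "001": lambda info: info["acquisition_frequency"] * info["standard_deviation"],
    "002": lambda info: 100.0 * (1.0 - info["defect_rate"]),
}


def utilization_by_group(related_info, methods=CALCULATION_METHODS, groups=None):
    """Apply each group's calculation method to one data source's related information.

    `groups` optionally restricts the calculation to some of the methods,
    mirroring the "some or all of the calculation methods" wording.
    """
    selected = methods if groups is None else {g: methods[g] for g in groups}
    return {group_id: method(related_info) for group_id, method in selected.items()}


# Related information computed beforehand for one data source (sample values).
info = {"acquisition_frequency": 2.0, "standard_deviation": 31.0, "defect_rate": 0.25}
print(utilization_by_group(info))
print(utilization_by_group(info, groups=["002"]))
```

The related information is passed in already computed, reflecting the order described above: each piece of related information is calculated first, and the group's calculation method is executed afterwards.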

図２２は、実施例３に係るメタデータの構成図である。 FIG. 22 is a configuration diagram of metadata according to the third embodiment.

メタデータ１５２ａは、実施例１に係るメタデータ１５２とは、活用度として、グループｉｄと、そのグループｉｄのグループについての活用度との組が１つ以上含まれている点が異なっている。 The metadata 152a differs from the metadata 152 according to the first embodiment in that one or more pairs of a group id and a utilization degree of the group id are included as the utilization degree.

図２２に示すメタデータ１５２ａでは、グループｉｄ「００１」における活用度が「７５」であり、グループｉｄ「００２」における活用度が「９９」であり、グループｉｄ「０４１」における活用度が「１２．９」であることが記述されている。 In the metadata 152a shown in FIG. 22, the utilization in the group id "001" is "75", the utilization in the group id "002" is "99", and the utilization in the group id "041" is "12" .9 "is described.

本実施例３において、アクション定義テーブル１２３のエントリに対して、そのエントリを実行するグループのグループｉｄと対応付けて管理するようにし、アクション実行部２０５が、図１３に示すアクション実行処理のステップ２０５４において、対応するグループｉｄに対応する活用度を、メタデータ１５２から取得して、条件判定に利用するようにしてもよい。 In the third embodiment, the entry of the action definition table 123 is managed in association with the group id of the group executing the entry, and the action execution unit 205 executes step 2054 of the action execution process shown in FIG. In the above, the utilization degree corresponding to the corresponding group id may be acquired from the metadata 152 and used for condition determination.

また、データ検索部２０７は、データ管理プログラム１２１を利用しているユーザに応じて、データソース検索画面１５０１の候補ボックス１５０５に表示するデータソースの活用度を変えるようにしてもよい。具体的には、データ検索部２０７は、データ管理プログラム１２１を利用しているユーザの所属するグループｉｄをユーザ管理テーブル１２４から取得し、取得したグループｉｄの活用度をメタデータ１５２ａから取得して、候補ボックス１５０５のデータソースに対応させて表示させてもよい。これにより、利用しているユーザの属するグループに対応する活用度を適切に表示させることができる。 The data search unit 207 may also change the utilization degree displayed for each data source in the candidate display box 1505 of the data source search screen 1501 according to the user who is using the data management program 121. Specifically, the data search unit 207 may acquire from the user management table 124 the group id of the group to which the current user belongs, acquire the utilization degree for that group id from the metadata 152a, and display it alongside the corresponding data source in the candidate display box 1505. This allows the utilization degree that applies to the current user's group to be displayed appropriately.
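The lookup described above can be illustrated with a small Python sketch. The dictionary layouts, the user ids, and the data source id are hypothetical. The utilization values 75, 99, and 12.9 for groups 001, 002, and 041 are the ones shown for the metadata 152a in FIG. 22, and the "identifier:description" form of the group id follows the user management table description. Returning `None` for unknown users or groups is an assumption.

```python
# Hypothetical in-memory stand-ins for the user management table 124
# and for the metadata 152a of one data source.
USER_TABLE = {
    "user01": "001:Data Science Team A",
    "user02": "041",
}
METADATA = {
    "Asset1:Sensor1": {"utilization": {"001": 75, "002": 99, "041": 12.9}},
}


def utilization_for_user(user_id, data_source_id,
                         users=USER_TABLE, metadata=METADATA):
    """Return the utilization degree that applies to the user's group, or None."""
    group_id = users.get(user_id)
    if group_id is None:
        return None
    # A group id may be "identifier:description" or the identifier alone.
    group_key = group_id.split(":", 1)[0]
    per_group = metadata.get(data_source_id, {}).get("utilization", {})
    return per_group.get(group_key)


print(utilization_for_user("user01", "Asset1:Sensor1"))
print(utilization_for_user("user02", "Asset1:Sensor1"))
```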

なお、本発明は、上述の実施例に限定されるものではなく、本発明の趣旨を逸脱しない範囲で、適宜変形して実施することが可能である。 The present invention is not limited to the above-described embodiment, and can be appropriately modified and implemented without departing from the spirit of the present invention.

例えば、上記実施例では、サーバ１１０と、データストア装置１３０とを別の計算機としていたが、本発明はこれに限られず、例えば、サーバ１１０と、データストア装置１３０とを１つの計算機で構成するようにしてもよい。 For example, although the server 110 and the data store apparatus 130 are separate computers in the above embodiment, the present invention is not limited to this. For example, the server 110 and the data store apparatus 130 are configured by one computer. You may do so.

また、上記実施例において、ＣＰＵが行っていた処理の一部又は全部を、専用のハードウェア回路で行うようにしてもよい。例えば、ＣＰＵがプログラムを実行することにより構成される機能部（２０１〜２０７等）の少なくともいずれか１つを専用のハードウェア回路で構成してもよい。また、上記実施形態におけるプログラムは、プログラムソースからインストールされてよい。プログラムソースは、プログラム配布サーバ又は記憶メディア（例えば不揮発性の可搬型の記憶メディア）であってもよい。 In the above embodiments, part or all of the processing performed by the CPU may instead be performed by a dedicated hardware circuit. For example, at least one of the functional units (201 to 207, etc.) that are realized by the CPU executing a program may be implemented as a dedicated hardware circuit. The program in the above embodiments may be installed from a program source. The program source may be a program distribution server or a storage medium (for example, a non-volatile portable storage medium).

１０…計算機システム、１０１…装置、１０２…データソース、１１０…サーバ、１３０…データストア装置、２０２…活用度計算部、２０５…アクション実行部、２０７…データ検索部 10: computer system, 101: device, 102: data source, 110: server, 130: data storage device, 202: utilization degree calculation unit, 205: action execution unit, 207: data search unit

Claims

A data management system for managing data obtainable from a predetermined data source, comprising:
a storage unit that stores data from the data source;
a utilization degree calculation unit that calculates, based on statistical information about the content of the data, a utilization degree indicating a degree of effectiveness of the data of the data source for use in data analysis; and
an action execution unit that, when an action condition including a condition on the calculated utilization degree is satisfied, executes on the data a predetermined processing operation corresponding to the action condition.

The data management system according to claim 1, wherein the utilization degree calculation unit calculates the utilization degree for data in a predetermined section of the data source.

The data management system according to claim 1 or 2, wherein the action condition includes a condition that the utilization degree is equal to or less than a predetermined value, and
the predetermined processing operation corresponding to the condition is an operation of storing the data in an archive.

The data management system according to any one of claims 1 to 3, wherein the statistical information is at least one of: a defect rate, which is the rate at which invalid values are included among the values at a plurality of time points contained in the data; a variation rate relating to fluctuations in the values at a plurality of time points contained in the data; and a standard deviation of the values at a plurality of time points contained in the data.

The data management system according to claim 4, wherein the utilization degree calculation unit calculates the utilization degree based on an acquisition frequency of values at a plurality of time points included in the data, the defect rate, and the standard deviation.

The data management system according to any one of claims 1 to 5, wherein the action condition includes a condition regarding the statistical information.

The data management system according to any one of claims 1 to 6, further comprising a catalog management unit that manages a plurality of related data sources as a catalog and manages an evaluation value for the catalog,
wherein the catalog management unit corrects the utilization degree of the data of the data sources belonging to the catalog based on the evaluation value for the catalog.

The data management system according to claim 7, further comprising an evaluation value receiving unit that receives specification of an evaluation value regarding the catalog.

The data management system according to any one of claims 1 to 8, further comprising a calculation method storage unit that stores calculation method information including a calculation method for calculating the utilization degree for at least one of a user and a group of a plurality of users,
wherein the utilization degree calculation unit calculates the utilization degree by the calculation method corresponding to the user or group that has requested the calculation of the utilization degree.

The data management system according to claim 1, further comprising:
an input receiving unit that receives an input of a search condition relating to a data source for which the utilization degree is to be displayed;
a data search unit that searches for a data source matching the search condition; and
a display control unit that displays, for the data source found by the data search unit, the utilization degree calculated for that data source.

A data management method by a data management system for managing data obtainable from a predetermined data source, comprising:
storing data from the data source;
calculating, based on statistical information about the content of the data, a utilization degree indicating a degree of effectiveness of the data of the data source for use in data analysis; and
executing on the data, when an action condition including a condition on the calculated utilization degree is satisfied, a predetermined processing operation corresponding to the action condition.

A data management program to be executed by a computer constituting a data management system that manages data obtainable from a predetermined data source, the data management program causing the computer to function as:
a utilization degree calculation unit that calculates, based on statistical information about the content of the data, a utilization degree indicating a degree of effectiveness of the data of the data source for use in data analysis; and
an action execution unit that, when an action condition including a condition on the calculated utilization degree is satisfied, executes on the data a predetermined processing operation corresponding to the action condition.