CN116205396A - Data panoramic monitoring method and system based on data center - Google Patents

Publication number
CN116205396A
Authority
CN
China
Prior art keywords
data
task
metadata
application
component
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211580140.1A
Other languages
Chinese (zh)
Inventor
彭晨辉
李贤慧
王威
王鸣一
张超
陈南明
富鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHINA REALTIME DATABASE CO LTD
NARI Group Corp
Original Assignee
CHINA REALTIME DATABASE CO LTD
NARI Group Corp

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/06 Electricity, gas or water supply

Abstract

The invention discloses a panoramic data monitoring method and system based on a data middle platform. The method comprises the following steps: collecting the basic data needed to monitor the data middle platform by periodically extracting meta-bin (metadata warehouse) data and calling component interface data; processing the collected data into unified data model tables according to the actual monitoring requirements of the data middle platform and the specified data model design specification, and storing them in standardized form; performing logical processing and analysis on the standardized data, storing the resulting detail tables and result summary tables in the data middle platform database, constructing data lineage links among tables, tasks and applications, and raising alarms for abnormal conditions; and visually displaying the analysis results, grouped by functional module. The invention makes it possible to quickly grasp the data storage, computing and application support capabilities of the data middle platform, and helps improve its operation and maintenance level.

Description

Data panoramic monitoring method and system based on data center
Technical Field
The invention relates to data monitoring and management, and in particular to a panoramic data monitoring method and system based on a data middle platform.
Background
As informatization and digitalization of the power grid industry deepen, the volume of business data keeps growing and the logical relationships between business interactions become more complex, which poses increasing challenges for data management practitioners in the industry. To cope with this increasingly complex data management work, power grid companies are building data middle platforms across the board. After several years of gradual construction, power grid companies have preliminarily made their data "searchable, available and controllable" through the data middle platform, effectively improving data management capability and working efficiency. However, data is synchronized and migrated between the layers of the platform and passes through associated mappings, so data links have become longer and more complex, which makes data operation and maintenance of the data middle platform more difficult.
The prior art has the following defects: given the current state of construction and the results achieved so far, data operation and maintenance personnel cannot quickly and comprehensively grasp the data storage and running conditions of the data middle platform. As a result, their workload is high and their efficiency low, which in turn restricts the platform's ability to respond quickly to business needs. Meanwhile, the credibility of the middle platform data declines over time, which hinders putting that data to practical use.
Disclosure of Invention
Object of the invention: the invention provides a panoramic data monitoring method and system based on a data middle platform, which raise the operation and maintenance level and the business support capability of the data middle platform and improve the data management efficiency of the power grid.
Technical solution: a panoramic data monitoring method based on a data middle platform comprises the following steps:
collecting the basic data required for monitoring the data middle platform by periodically extracting meta-bin data and calling component interface data, wherein extracting meta-bin data comprises: configuring a data source connection and a destination connection in the data integration component of the data middle platform, and extracting the meta-bin data opened by the data middle platform into source-layer base tables for storage through table and field mappings; and calling component interface data comprises: obtaining real-time data of designated components by calling the API interfaces opened by the data middle platform, and storing the data by category in the corresponding tables of the data middle platform database, wherein a designated component is a component that performs full data collection and synchronizes incremental data into the middle platform;
processing the collected data into unified data model tables according to the actual monitoring requirements of the data middle platform and the specified data model design specification, and storing them in standardized form;
performing association analysis on the collected data and/or the standardized data according to the actual monitoring requirements of the data middle platform, forming detail tables and result summary tables from the processing results and storing them in the data middle platform database, constructing data lineage links among tables, tasks and applications, and raising alarms for abnormal conditions;
and visually displaying the results of the data analysis, grouped by functional module.
Preferably, the meta-bin data opened by the data middle platform includes asset metadata, task metadata, application metadata and API service metadata, wherein:
extracting asset meta-bin data comprises: according to the actual business requirements of asset monitoring on the data middle platform, scheduling the asset meta-bin collection task to run at a fixed scheduled time, and, from the collected asset meta-bin data, organizing and recording the table names, table IDs, table capacities, table numbers, field numbers, table types, project spaces, creation times and modification times of the asset data;
extracting task meta-bin data comprises: according to the actual running conditions and monitoring requirements of data middle platform tasks, scheduling the task meta-bin collection task to run at a fixed scheduled time, and, from the collected task meta-bin data, organizing and recording the task names, task IDs, task states, job creation times, task execution times, last update times, numbers of records read and numbers of records written of the data integration tasks;
extracting task meta-bin data further comprises: according to the actual running conditions and monitoring requirements of data middle platform tasks, scheduling the task meta-bin collection task to run at a fixed scheduled time, and, from the collected task meta-bin data, organizing and recording the task names, task IDs, task states, node names, workflow states, job types, start times, end times and scheduling types of the data development tasks;
extracting application meta-bin data comprises: according to the actual business scenarios and monitoring requirements of data middle platform applications, scheduling the application meta-bin collection task to run at a fixed scheduled time, and, from the collected application-scenario meta-bin data, organizing and recording the application names, application descriptions, application creators, workspace names, application types, registration times and latest update times of the application data;
extracting API service metadata comprises: according to the actual publication and invocation conditions and monitoring requirements of the API services of the data middle platform, scheduling the API service meta-bin collection task to run at a fixed scheduled time, and, from the collected API service meta-bin data, organizing and recording the API names, API descriptions, API paths, API types, API protocols, request types, creation times and update times of the API services.
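The "table and field mapping" used to land meta-bin rows in a source-layer base table can be pictured with a minimal sketch. It is written in Python purely for illustration; every source column name below (tbl_name, size_bytes and so on) is a hypothetical placeholder, since the text does not give the actual meta-bin schema.

```python
from typing import Any

# Hypothetical mapping: meta-bin source column -> source-layer base table column.
# None of these source names come from the patent; they only illustrate the idea.
ASSET_FIELD_MAP = {
    "tbl_name": "table_name",
    "tbl_id": "table_id",
    "size_bytes": "table_capacity",
    "col_cnt": "field_count",
    "proj": "project_space",
    "ctime": "create_time",
    "mtime": "modify_time",
}


def map_row(row: dict[str, Any], field_map: dict[str, str]) -> dict[str, Any]:
    """Rename mapped columns. Unmapped source columns are dropped and
    mapped columns that are missing from the row become None."""
    return {dst: row.get(src) for src, dst in field_map.items()}


def extract(rows: list[dict[str, Any]], field_map: dict[str, str]) -> list[dict[str, Any]]:
    """Apply the field mapping to every extracted meta-bin row."""
    return [map_row(row, field_map) for row in rows]
```

A scheduled extraction job would call extract(...) on each batch and write the result to the base table; the write itself is omitted because the storage engine is not specified here.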
Preferably, the APIs of the designated components are called by a Java program, and the designated components include CDM, DLF, DRS and MRS, wherein:
calling CDM component interface data comprises: setting the Java scheduled task to minute-level frequency according to the usage and monitoring requirements of the CDM component, obtaining the relevant data by calling the CDM API, and organizing and recording the project space name, cluster name, task state and last run time of the CDM component;
calling DLF component interface data comprises: setting the Java scheduled task to minute-level frequency according to the usage and monitoring requirements of the DLF component, obtaining the relevant data by calling the DLF API, and organizing and recording the project space name, task name, node name, task state, task type, node type and last run time of the DLF component;
calling DRS component interface data comprises: setting the Java scheduled task to minute-level frequency according to the usage and monitoring requirements of the DRS component, obtaining the relevant data by calling the DRS API, and organizing and recording the project space name, cluster name, task state, task type and last run time of the DRS component;
calling MRS component interface data comprises: setting the Java scheduled task to minute-level frequency according to the usage and actual monitoring requirements of the MRS component, obtaining the relevant data by calling the MRS API, and organizing and recording the CPU utilization, memory utilization, disk utilization, IO read rate and IO write rate of the MRS component.
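One polling cycle for a component can be sketched as follows. The patent describes a Java scheduled task; this Python version is only an illustration, and the payload keys ("cpu", "mem", "disk", "io_read", "io_write") are assumptions rather than the real MRS response format. The API call is injected as a function so that no real endpoint or client library is implied.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable


@dataclass
class MrsSample:
    """One row of the (hypothetical) MRS monitoring table."""
    collected_at: datetime
    cpu_pct: float
    mem_pct: float
    disk_pct: float
    io_read_rate: float
    io_write_rate: float


def parse_mrs(payload: dict, now: datetime) -> MrsSample:
    # Payload keys are assumed for illustration; a real response will differ.
    return MrsSample(
        collected_at=now,
        cpu_pct=float(payload["cpu"]),
        mem_pct=float(payload["mem"]),
        disk_pct=float(payload["disk"]),
        io_read_rate=float(payload["io_read"]),
        io_write_rate=float(payload["io_write"]),
    )


def collect_once(fetch: Callable[[], dict], table: list, now: datetime) -> MrsSample:
    """Call the component API once, normalize the payload, append it to the table."""
    sample = parse_mrs(fetch(), now)
    table.append(sample)
    return sample
```

A scheduler would invoke collect_once at the configured minute-level frequency, with one fetch function and one table per component.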
Preferably, metadata extraction and component interface calls run at most once every 5 minutes and at least once every day, that is, the collection interval ranges from 5 minutes to 1 day.
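The frequency bounds amount to clamping a requested collection interval into the range from 5 minutes to 1 day. A minimal Python sketch, with hypothetical constant and function names:

```python
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)  # most frequent allowed: once every 5 minutes
MAX_INTERVAL = timedelta(days=1)     # least frequent allowed: once every day


def clamp_interval(requested: timedelta) -> timedelta:
    """Force a requested collection interval into [5 minutes, 1 day]."""
    return max(MIN_INTERVAL, min(requested, MAX_INTERVAL))
```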
Preferably, processing the collected data into unified data model tables and storing them in standardized form comprises classifying the collected data into the following data model domains: a data acquisition domain, a data storage domain, a data processing domain, a data application domain and a data guarantee domain, wherein:
for the data acquisition domain, the data integration job meta-bin tables are logically processed according to the usage classification of the meta-bin data to form the data acquisition domain model tables;
for the data storage domain, the table information and field information meta-bin tables are logically processed according to the usage classification of the meta-bin data to form the data storage domain model tables;
for the data processing domain, the data development task and data development job meta-bin tables are logically processed according to the usage classification of the meta-bin data to form the data processing domain model tables;
for the data application domain, the data service and data application meta-bin tables are logically processed according to the usage classification of the meta-bin data to form the data application domain model tables;
and for the data guarantee domain, the meta-bin tables of the designated components are logically processed according to the usage classification of the meta-bin data to form the data guarantee domain model tables.
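The usage-based classification above can be read as a routing table from meta-bin tables to the five domains. The following Python sketch assumes hypothetical table names; only the five domain names come from the text.

```python
# Hypothetical meta-bin table names mapped to the five model domains.
DOMAIN_OF = {
    "data_integration_job": "data_acquisition",
    "table_info": "data_storage",
    "field_info": "data_storage",
    "dev_task": "data_processing",
    "dev_job": "data_processing",
    "data_service": "data_application",
    "data_application": "data_application",
    "component_metrics": "data_guarantee",
}


def group_by_domain(tables: list[str]) -> dict[str, list[str]]:
    """Group meta-bin tables by the domain whose model tables they feed."""
    grouped: dict[str, list[str]] = {}
    for name in tables:
        if name not in DOMAIN_OF:
            # Fail loudly rather than silently dropping an unclassified table.
            raise KeyError(f"no domain configured for meta-bin table {name!r}")
        grouped.setdefault(DOMAIN_OF[name], []).append(name)
    return grouped
```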
Preferably, performing association analysis on the collected data and/or the standardized data comprises:
performing association analysis on the collected and/or standardized asset data according to the asset monitoring requirements of the data middle platform, to form summary results for the total number of connected systems, total number of tables, total table capacity, total number of table fields, number of empty tables, and number of tables and total capacity at each layer, together with the details of each kind of asset data;
performing association analysis on the collected and/or standardized task data according to the task monitoring requirements of the data middle platform, to form summary results such as the total number of tasks, number of successful tasks, number of failed tasks, number of tasks in execution and number of unsuccessful tasks, together with the details of each kind of task data;
performing association analysis on the collected and/or standardized application data according to the application monitoring requirements of the data middle platform, to form summary results such as the total number of applications, number of successful applications, number of failed applications, number of primary applications and number of secondary applications, together with the details of each kind of application data;
performing association analysis on the collected data of the designated components according to the component monitoring requirements of the data middle platform, to form statistics and trend analysis of key resource usage: CPU utilization, memory utilization, disk utilization, IO read rate and IO write rate;
and performing association analysis on the collected and/or standardized service data according to the service monitoring requirements of the data middle platform, to form summary results for the total number of calls, monthly number of calls, daily call success rate, daily call latency, total number of publications, monthly number of publications, daily number of publications and daily release rate, together with the details of each kind of application data.
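As an illustration of how a result summary table might be derived from detail records, the sketch below aggregates task states. The state labels ("success", "failed", "running") and the success-rate definition (successes over finished tasks) are assumptions made for the example, not definitions from the patent.

```python
from collections import Counter


def summarize_tasks(tasks: list[dict]) -> dict:
    """Build one summary row from task detail rows that carry a 'state' field."""
    counts = Counter(t["state"] for t in tasks)
    finished = counts["success"] + counts["failed"]
    return {
        "total": len(tasks),
        "success": counts["success"],
        "failed": counts["failed"],
        "running": counts["running"],
        # Assumed definition: success rate over finished tasks only.
        "success_rate": counts["success"] / finished if finished else None,
    }
```

The detail rows themselves would be stored unchanged in the detail table, and the dictionary returned here corresponds to one row of the summary table.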
Preferably, constructing the data lineage links among tables, tasks and applications comprises:
according to the collected and standardized task meta-bin data, parsing the SQL scripts contained in the tasks with a Java program to obtain the upstream and downstream relationships of the tables in each task node and the corresponding schema names, and tracing the lineage of the tables forward from each source table and backward from each target table to form the lineage links between tables and tasks;
and, according to the associations between tasks and applications and between tables and applications, obtaining the complete lineage links among tables, tasks and applications on the basis of the table-task lineage by matching key fields such as project space, schema name, table name and task name.
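The lineage construction can be sketched as follows. The patent performs the SQL parsing in Java; this Python sketch swaps in a deliberately naive regular-expression parser that only recognizes INSERT INTO / INSERT OVERWRITE targets and FROM / JOIN sources, so it is an assumption-laden illustration and not a real SQL parser. The task names, table names and application mapping are hypothetical.

```python
import re
from collections import defaultdict, deque

# Naive patterns: adequate for simple INSERT ... SELECT scripts only.
TARGET_RE = re.compile(r"insert\s+(?:into|overwrite)\s+(?:table\s+)?([\w.]+)", re.I)
SOURCE_RE = re.compile(r"(?:from|join)\s+([\w.]+)", re.I)


def parse_task(sql: str) -> tuple[set, set]:
    """Return (source tables, target tables) as schema-qualified lowercase names."""
    targets = {t.lower() for t in TARGET_RE.findall(sql)}
    sources = {s.lower() for s in SOURCE_RE.findall(sql)} - targets
    return sources, targets


def build_lineage(task_sql: dict) -> tuple[dict, dict]:
    """Build downstream and upstream edge maps: table -> {(task, other table)}."""
    down, up = defaultdict(set), defaultdict(set)
    for task, sql in task_sql.items():
        sources, targets = parse_task(sql)
        for src in sources:
            for dst in targets:
                down[src].add((task, dst))
                up[dst].add((task, src))
    return down, up


def trace(start: str, edges: dict) -> list[tuple]:
    """Breadth-first walk returning (table, task, next table) lineage links."""
    seen, queue, links = {start}, deque([start]), []
    while queue:
        table = queue.popleft()
        for task, nxt in sorted(edges.get(table, ())):
            links.append((table, task, nxt))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return links


def affected_apps(start: str, down: dict, app_tables: dict) -> list[str]:
    """Applications that read the start table or any table downstream of it."""
    tables = {start} | {nxt for _, _, nxt in trace(start, down)}
    return sorted(app for app, used in app_tables.items() if tables & set(used))
```

Calling trace with the downstream map follows a source table forward; calling it with the upstream map follows a target table backward. Matching on project space is left out to keep the sketch short.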
Preferably, raising alarms for abnormal conditions includes asset abnormality alarms, task running early warnings and task abnormality alarms, wherein:
an asset abnormality alarm means that, according to the asset monitoring requirements of the data middle platform and on the basis of the standardized asset model data, thresholds are set for monitoring table comment maintenance and empty tables, and when the middle platform data exceeds a threshold, the persons responsible are promptly notified of the abnormal condition;
a task running early warning means that, according to the task monitoring requirements of the data middle platform and on the basis of the collected real-time task running data and the standardized task model data, an expected end time is set for each key running task, and when a task has not completed within the expected time, the persons responsible are notified to investigate, so that the data analysis of the business is not affected;
a task abnormality alarm means that, according to the lineage monitoring requirements of the data middle platform and on the basis of the collected real-time task state data, the states of the tasks on each lineage link and of the associated tables and applications are updated in real time, abnormal lineage links are displayed distinctly, and the abnormal task information and the affected applications are notified to the persons responsible.
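Two of the alarm rules lend themselves to a short sketch: the empty-table threshold and the expected-end-time warning. The record fields (row_count, expected_end, state) and the threshold semantics are assumptions for illustration; notification delivery is left out.

```python
from datetime import datetime


def empty_table_alarm(tables: list[dict], max_empty: int) -> list[str]:
    """Return the empty tables if their count exceeds the threshold, else nothing."""
    empty = [t["name"] for t in tables if t["row_count"] == 0]
    return empty if len(empty) > max_empty else []


def overdue_tasks(tasks: list[dict], now: datetime) -> list[str]:
    """Key tasks that have not succeeded although their expected end time has passed."""
    return [
        t["name"]
        for t in tasks
        if t["state"] != "success" and now > t["expected_end"]
    ]
```

Either function returning a non-empty list would trigger a notification to the persons responsible for the listed tables or tasks.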
Preferably, the main functional modules of the application display part include an asset monitoring module, a task monitoring module, a component monitoring module, an application monitoring module and a service monitoring module; the asset monitoring module includes asset overview, asset details, table details, empty table details and table lineage sub-modules; the task monitoring module includes task overview, task details and task lineage sub-modules; the component monitoring module includes CDM component, MRS component, DLF component and DRS component sub-modules; the application monitoring module includes application overview, application details and application lineage sub-modules; and the service monitoring module includes service overview, service publication details and service call details sub-modules.
The invention also provides a panoramic data monitoring system based on a data middle platform, which comprises:
a data extraction module for collecting the basic data required for monitoring the data middle platform by periodically extracting meta-bin data and calling component interface data, comprising: a metadata extraction unit configured to set up a data source connection and a destination connection in the data integration component of the data middle platform and to extract the meta-bin data opened by the data middle platform into source-layer base tables for storage through table and field mappings; and a component data calling unit configured to obtain real-time data of designated components by calling the API interfaces opened by the data middle platform and to store the data by category in the corresponding tables of the data middle platform database, wherein a designated component is a component that performs full data collection and synchronizes incremental data into the middle platform;
a standardization module for processing the collected data into unified data model tables according to the actual monitoring requirements of the data middle platform and the specified data model design specification, and storing them in standardized form;
a data analysis module for performing association analysis on the collected data and/or the standardized data according to the actual monitoring requirements of the data middle platform, forming detail tables and result summary tables from the processing results and storing them in the data middle platform database, constructing data lineage links among tables, tasks and applications, and raising alarms for abnormal conditions;
and an application display module for visually displaying the results of the data analysis, grouped by functional module.
Beneficial effects: by implementing the invention, the data asset storage, scheduled task execution, component operation, service invocation and application support of the data middle platform can be comprehensively grasped. In addition, data abnormalities can be discovered promptly and the relevant operation and maintenance personnel notified through alarm channels such as in-site messages and SMS notifications. The data lineage links also help those personnel quickly locate the source of an abnormality, resolve it, and assess its scope of impact in time, effectively avoiding business risks caused by data abnormalities. The invention thus makes it possible to quickly grasp the data storage, computing and application support capabilities of the data middle platform, improves its operation and maintenance level, improves the accuracy, timeliness and integrity of its data, increases the practical value of the data, strengthens its business support capability, and assists the digital transformation of power companies.
Drawings
FIG. 1 is a flow chart of the panoramic data monitoring method of the present invention;
FIG. 2 is an example of a data lineage relationship of the present invention.
Detailed Description
The technical scheme of the invention is further described below with reference to the accompanying drawings.
As shown in FIG. 1, the invention provides a panoramic data monitoring method based on a data middle platform, which comprises the following steps:
Step (1), data acquisition: a data source connection and a destination connection are configured in the data integration component of the data middle platform; the meta-bin data opened by the middle platform, such as asset information, task information, application information and API service information, is extracted into source-layer base tables for storage through table and field mappings; and the data extraction frequency is set through the task configuration page.
The panoramic monitoring of the invention mainly collects, analyzes and monitors the operation log information of the platform. In one embodiment, a data middle platform used in a power grid system is monitored. The log information on the running state of the data middle platform is consolidated on the cloud platform into structured basic information tables that are easy to access. These tables cover the tables connected to the middle platform (asset class), running tasks and historical instances (acquisition class and processing class), data service invocation and publication (service class), component operation (data operation class) and similar information. These structured basic information tables are colloquially referred to as meta-bins. As described in step (1), this metadata is extracted into the source-layer base tables through the data integration component using table and field mappings.
The frequency of data extraction can be set on the task configuration page, for example once a day. Different collection times can also be set for different types of data as needed. By way of example, asset, task, application and API service data are collected as follows:
according to the actual business and usage of the assets on the data middle platform, the asset meta-bin collection task is scheduled to run at a fixed scheduled time, and, from the collected asset meta-bin data, the table names, table IDs, table capacities, table numbers, field numbers, table types, project spaces, creation times and modification times of the asset data are organized and recorded;
according to the actual business and usage of the tasks on the data middle platform, the task meta-bin collection task is scheduled to run at a fixed scheduled time, and, from the collected task meta-bin data, the task names, task IDs, task states, job creation times, task execution times, last update times, numbers of records read and numbers of records written of the data integration tasks are organized and recorded;
according to the actual business and usage of the tasks on the data middle platform, the task meta-bin collection task is scheduled to run at a fixed scheduled time, and, from the collected task meta-bin data, the task names, task IDs, task states, node names, workflow states, job types, start times, end times and scheduling types of the data development tasks are organized and recorded;
according to the actual business and usage of the application scenarios on the data middle platform, the application meta-bin collection task is scheduled to run at a fixed scheduled time, and, from the collected application-scenario meta-bin data, the application names, application descriptions, application creators, workspace names, application types, registration times and latest update times of the application data are organized and recorded;
according to the actual business and usage of the API services on the data middle platform, the API service meta-bin collection task is scheduled to run at a fixed scheduled time, and, from the collected API service meta-bin data, the API names, API descriptions, API paths, API types, API protocols, request types, creation times and update times of the API services are organized and recorded.
Real-time data is obtained by a Java program that calls the API interfaces of key components of the data middle platform, such as Cloud Data Migration (CDM), Data Lake Factory (DLF), Data Replication Service (DRS) and MapReduce Service (MRS), and the data is stored by category in the corresponding tables of the data middle platform RDS database. Full data collection and the synchronization of incremental data into the middle platform are mainly carried out by these components. If one of them fails, the quality of the entire middle platform's data suffers, and so does the accuracy of data analysis in upper-layer applications; the key components therefore include at least these four. It should be understood that this does not limit the invention: in data middle platform software from other vendors, the corresponding components that perform full data collection and synchronize incremental data into the middle platform can likewise be designated as key components.
The API call frequency can be configured freely; for example, the shortest call interval is 5 minutes and the longest is 1 day. In addition, an on-demand real-time call triggered by a click can be provided according to monitoring requirements. In the embodiment of the invention, the API interface data of the key components CDM, DLF, DRS and MRS is called as follows:
according to the usage and actual business of the CDM component, the Java scheduled task is set to minute-level frequency, the relevant data is obtained by calling the CDM API, and the project space name, cluster name, task state and last run time of the CDM component are organized and recorded;
according to the usage and actual business of the DLF component, the Java scheduled task is set to minute-level frequency, the relevant data is obtained by calling the DLF API, and the project space name, task name, node name, task state, task type, node type and last run time of the DLF component are organized and recorded;
according to the usage and actual business of the DRS component, the Java scheduled task is set to minute-level frequency, the relevant data is obtained by calling the DRS API, and the project space name, cluster name, task state, task type and last run time of the DRS component are organized and recorded;
according to the usage and actual monitoring requirements of the MRS component, the Java scheduled task is set to minute-level frequency, the relevant data is obtained by calling the MRS API, and the CPU utilization, memory utilization, disk utilization, IO read rate and IO write rate of the MRS component are organized and recorded.
Step (2), data standardization: according to the actual monitoring requirements of the data middle platform and the specified design specification, the collected meta-bin data and the real-time component data are logically converted into unified data model tables and stored in standardized form.
When the invention is applied to a power grid system, data standardization must follow the State Grid Corporation enterprise common information model (SG-CIM). Under the SG-CIM model, business system data can be divided into several subject domains, each of which can be further divided into sub-domains. In the invention, the obtained metadata is classified and organized according to the types of the middle platform components and the usage of the metadata; the entities, entity relationships, attributes and granularity of the metadata are analyzed; and five secondary data model subject domains are formed: the data acquisition domain, data storage domain, data processing domain, data application domain and data guarantee domain. The classification is as follows:
according to the usage classification of the meta-bin data, the data integration job meta-bin tables are logically processed to form the data acquisition domain model tables;
according to the usage classification of the meta-bin data, the table information and field information meta-bin tables are logically processed to form the data storage domain model tables;
according to the usage classification of the meta-bin data, the data development task and data development job meta-bin tables are logically processed to form the data processing domain model tables;
according to the usage classification of the meta-bin data, the data service and data application meta-bin tables are logically processed to form the data application domain model tables;
and according to the usage classification of the meta-bin data, the meta-bin tables of the DRS component, the MRS component and similar components are logically processed to form the data guarantee domain model tables.
Logical processing in the invention means: establishing, according to the purpose classification of the metadata bin data, second-level subject domain model tables that meet the SG-CIM design specification; identifying in the metadata bin data tables the data whose fields match the fields of a second-level subject domain model table; and extracting those data and storing them in the corresponding model table.
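The field-matching step of the logical processing can be illustrated with a short sketch. It assumes, purely for illustration, that a metadata bin row is available as a map from field name to value and that a model table is described by its list of column names; `DomainModelMapper` and `project` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DomainModelMapper {
    /**
     * Copies only the fields that the target model table declares,
     * in the column order of the model table. Fields of the source
     * row that the model table does not declare are dropped.
     */
    static Map<String, Object> project(Map<String, Object> sourceRow, List<String> modelColumns) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String column : modelColumns) {
            if (sourceRow.containsKey(column)) {
                out.put(column, sourceRow.get(column));
            }
        }
        return out;
    }
}
```

A model column with no counterpart in the source row is simply left out in this sketch; a real model table would need a rule for such columns (a default value or NULL).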
Step (3), data analysis: according to the actual monitoring requirements of the data center, association analysis is performed on the collected and standardized data; the processing results are formed into detail tables and result summary tables, stored in the RDS database of the data center, and provided to the front-end application of the data center for display.
In the data analysis stage, association analysis means that data items of the expected categories are obtained from the collected data and/or the standardized data according to the monitoring requirements. Each expected category may contain finer-grained sub-items. The obtained items and sub-items are counted and de-duplicated, and their proportions and change trends can be calculated along different dimensions.
In the embodiment of the invention, the data analysis mainly covers asset, task, application, component, and service data, as follows:
Association analysis and related operations are performed on the collected and standardized asset data according to the data center asset monitoring requirements, producing summary results such as the total number of connected systems, the total number of tables, the total table capacity, the total number of table fields, the number of empty tables, and the total number of tables and total capacity at each level, together with detail records for each type of asset data. The asset data are also classified and counted at a finer grain by level, system, and so on, and the asset statistics are analyzed over time to display the change trend of each fine-grained category.
Association analysis and related operations are performed on the collected and standardized task data according to the data center task monitoring requirements, producing summary results such as the total number of tasks, the number of successful tasks, the number of failed tasks, the number of tasks in execution, and the number of unsuccessful tasks, together with detail records for each type of task data.
Association analysis and related operations are performed on the collected and standardized application data according to the data center application monitoring requirements, producing summary results such as the total number of applications, the number of successful applications, the number of failed applications, the number of primary applications, and the number of secondary applications, together with detail records for each type of application data.
Association analysis and related operations are performed on the collected CDM component data and MRS component data according to the data center component monitoring requirements, producing statistics and trend analysis of key resource usage such as CPU utilization, memory utilization, disk utilization, IO read rate, and IO write rate.
Association analysis and related operations are performed on the collected and standardized service data according to the data center service monitoring requirements, producing the total number of calls, monthly number of calls, daily call success rate, daily call latency, total number of releases, monthly number of releases, daily release rate, and so on, together with detail records for each type of application data.
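As one concrete instance of the counting, de-duplication, and proportion steps above, the sketch below summarizes task records by state. The state names, the `TaskRun` record, and the rule that the last record per task ID wins are assumptions made for the example, not details given in the text.

```java
import java.util.EnumMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical task states; the real component may use other values. */
enum TaskState { SUCCESS, FAILED, RUNNING, OTHER }

/** One collected task record (hypothetical shape). */
record TaskRun(String taskId, TaskState state) {}

final class TaskSummary {
    /** De-duplicates by task ID (last record wins) and counts tasks per state. */
    static Map<TaskState, Integer> countByState(List<TaskRun> runs) {
        Map<String, TaskState> latest = new LinkedHashMap<>();
        for (TaskRun r : runs) {
            latest.put(r.taskId(), r.state());
        }
        Map<TaskState, Integer> counts = new EnumMap<>(TaskState.class);
        for (TaskState s : TaskState.values()) {
            counts.put(s, 0); // every state appears in the summary, even with zero tasks
        }
        for (TaskState s : latest.values()) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }

    /** Share of one state among all de-duplicated tasks; 0 when there are no tasks. */
    static double share(Map<TaskState, Integer> counts, TaskState state) {
        int total = counts.values().stream().mapToInt(Integer::intValue).sum();
        return total == 0 ? 0.0 : (double) counts.get(state) / total;
    }
}
```

The same pattern would apply to the asset, application, and service summaries, with a different key for de-duplication and different categories to count.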
As a preferred embodiment, the data analysis further comprises constructing data lineage (blood-relationship) links according to the actual monitoring requirements of the data center. The construction method is as follows:
The data center completes the data processing link and flow through data access (ODS layer), data cleaning and standardization (shared layer, DWD), and construction of the data warehouse (DWS) and the data mart layer (DWM). Finally, it either wraps DWM-layer data as data service interfaces or provides DWM data tables for upper-layer applications to access, integrate, or develop interfaces against. The task data in the metadata bin store the SQL of the processing tasks at each layer, including source table and target table information. Based on the collected and standardized task metadata bin data, a Java program parses the SQL scripts contained in each task to obtain the upstream and downstream table relationships in each task node and the corresponding table schema names. Taking a source table as the target table of other tasks, and a target table as the source table of other tasks, the relationships are traced backward and forward respectively to form the lineage links between tables and tasks. At the same time, the task states in the lineage links are updated in real time from the real-time task state data returned by the component interfaces.
Then, according to the associations between tasks and applications and between tables and applications, and on the basis of the lineage between tables and tasks, key fields such as project space, schema name, table name, and task name are matched to obtain the complete lineage links among tables, tasks, and applications.
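The extraction of source and target tables from a task's SQL script can be sketched as below. The text does not say how the Java program parses SQL, so this example substitutes two naive regular expressions for a real SQL parser; `SqlLineageExtractor` and `TableEdges` are hypothetical names, and constructs such as common table expressions or comments would be misread by this simplification.

```java
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Source and target tables found in one SQL script (hypothetical type). */
record TableEdges(Set<String> sources, Set<String> targets) {}

final class SqlLineageExtractor {
    // Table written by the statement: INSERT INTO x / INSERT OVERWRITE TABLE x.
    private static final Pattern TARGET = Pattern.compile(
        "\\binsert\\s+(?:into|overwrite)\\s+(?:table\\s+)?([\\w.]+)",
        Pattern.CASE_INSENSITIVE);
    // Tables read by the statement: FROM x / JOIN x.
    private static final Pattern SOURCE = Pattern.compile(
        "\\b(?:from|join)\\s+([\\w.]+)",
        Pattern.CASE_INSENSITIVE);

    /** Returns lower-cased schema.table names in order of first appearance. */
    static TableEdges extract(String sql) {
        return new TableEdges(collect(SOURCE, sql), collect(TARGET, sql));
    }

    private static Set<String> collect(Pattern pattern, String sql) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = pattern.matcher(sql);
        while (m.find()) {
            names.add(m.group(1).toLowerCase(Locale.ROOT));
        }
        return names;
    }
}
```

Each extracted pair gives one upstream-to-downstream step; joining the steps of all tasks on schema name and table name yields the table-task links that the matching of key fields then extends to applications.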
FIG. 2 shows a simple example of a lineage relationship, constructed as follows:
1) Source-layer table 1 and source-layer table 2 are the source tables and model table 1 is the target table; the association between them is obtained by parsing the execution script of task 0;
2) Model table 1 is the source table and analysis layer tables 1 and 2 are the target tables; the association between them is obtained by parsing the execution script of task 1;
3) Analysis layer tables 1 and 2 are associated with application 1 through task 1 (the application has a workspace in which task 1 is created; task 1 contains analysis layer tables 1 and 2 as target tables, so both tables belong to the application);
4) Analysis layer table 2 is the source table and analysis layer table 3 is the target table; the association between them is obtained by parsing the execution script of task 3;
5) Analysis layer table 3 is associated with application 3 through a task in whose execution script analysis layer table 3 appears as a source table;
6) Model table 1 and application 2 are associated through account authorization (in the database, model table 1 is granted to the account to which application 2 belongs).
A source table is called upstream and a target table downstream. With the lineage graph, when the data of a table become abnormal, the downstream tables, tasks, and applications can be notified promptly so that business risks are avoided in time.
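Finding everything downstream of an abnormal table amounts to a graph traversal. The sketch below encodes the six relationships of the FIG. 2 example as directed edges and walks them breadth-first; the node names and the `LineageGraph` class are illustrative assumptions, and tasks are modeled as nodes between their source and target tables.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LineageGraph {
    // Adjacency list: node -> nodes directly downstream of it.
    private final Map<String, List<String>> downstream = new LinkedHashMap<>();

    /** Adds a directed edge from an upstream node to a downstream node. */
    LineageGraph edge(String from, String to) {
        downstream.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        return this;
    }

    /** Breadth-first walk returning every node reachable downstream of start. */
    List<String> impactedBy(String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(start);
        while (!queue.isEmpty()) {
            for (String next : downstream.getOrDefault(queue.poll(), List.of())) {
                if (!next.equals(start) && seen.add(next)) {
                    queue.add(next);
                }
            }
        }
        return new ArrayList<>(seen);
    }

    /** The FIG. 2 example, with hypothetical node names. */
    static LineageGraph figure2() {
        return new LineageGraph()
            .edge("source_table_1", "task_0").edge("source_table_2", "task_0")
            .edge("task_0", "model_table_1")
            .edge("model_table_1", "task_1")
            .edge("task_1", "analysis_table_1").edge("task_1", "analysis_table_2")
            .edge("analysis_table_1", "application_1").edge("analysis_table_2", "application_1")
            .edge("analysis_table_2", "task_3").edge("task_3", "analysis_table_3")
            .edge("analysis_table_3", "application_3")
            .edge("model_table_1", "application_2"); // account authorization
    }
}
```

With this graph, an anomaly in analysis layer table 2 reaches application 1, task 3, analysis layer table 3, and application 3, while application 2 is unaffected because it depends only on model table 1.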
As a preferred embodiment, the data analysis further comprises anomaly alarms according to the actual monitoring requirements of the data center.
The anomaly alarms mainly comprise asset anomaly alarms, task running early warnings, and task anomaly alarms, as follows:
Thresholds are set on the standardized asset model data according to the data center asset monitoring requirements, such as the maintenance status of table comments and the presence of empty tables. When data center data exceed a threshold, the relevant responsible persons are promptly notified of the anomaly by SMS and in-system message.
According to the data center task monitoring requirements, an expected end time is set for key running tasks based on the collected real-time task running data and the standardized task model data. When a task has not completed by the expected time, the relevant responsible persons are promptly notified by SMS and in-system message to investigate, so that business data analysis is not affected.
According to the data center lineage monitoring requirements, the states of the tasks on each lineage link and of their associated tables and applications are updated in real time from the collected real-time task state data, abnormal lineage links are displayed distinctly, and the abnormal task information and the affected applications are sent to the relevant responsible persons by in-system message and SMS.
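The task running early warning reduces to comparing the current time with each key task's expected end time. A minimal sketch follows; the `KeyTask` record and the convention that a null `actualEnd` marks an unfinished task are assumptions, and the notification itself (SMS, in-system message) is left out.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** A key task with its expected end time; actualEnd is null while the task is unfinished. */
record KeyTask(String name, Instant expectedEnd, Instant actualEnd) {}

final class TaskDeadlineCheck {
    /** Returns the names of tasks that are still unfinished after their expected end time. */
    static List<String> overdue(List<KeyTask> tasks, Instant now) {
        List<String> late = new ArrayList<>();
        for (KeyTask t : tasks) {
            boolean unfinished = t.actualEnd() == null;
            if (unfinished && now.isAfter(t.expectedEnd())) {
                late.add(t.name());
            }
        }
        return late;
    }
}
```

Passing `now` as a parameter rather than reading the clock inside the method keeps the check deterministic, so the same logic can be run from a scheduled task or from a test.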
Step (4), application display: according to the actual monitoring requirements of the data center, the results generated by the data analysis are displayed visually, grouped by functional module.
According to an embodiment of the invention, the main functional modules of the application display part include: an asset monitoring module, a task monitoring module, a component monitoring module, an application monitoring module, and a service monitoring module.
According to the data center asset monitoring requirements, the asset monitoring module includes sub-modules such as asset overview, asset details, table details, empty-table details, and table lineage.
According to the data center task monitoring requirements, the task monitoring module includes sub-modules such as task overview, task details, and task lineage.
According to the data center component monitoring requirements, the component monitoring module includes sub-modules such as the CDM component and the MRS component.
According to the data center application monitoring requirements, the application monitoring module includes sub-modules such as application overview, application details, and application lineage.
According to the data center service monitoring requirements, the service monitoring module includes sub-modules such as service overview, service release details, and service call details.
In a preferred embodiment, the application display further includes sub-modules such as data query and anomaly alarm, according to the actual monitoring requirements of the data center.
The invention also provides a data panoramic monitoring system based on the data center, which comprises:
a data extraction module, which collects the basic data required for data center monitoring by periodically extracting metadata bin data and calling component interface data, and which comprises: a metadata extraction unit, configured to set up the data source connection and destination connection in the data integration component of the data center and to extract the open metadata bin data of the data center into source-layer base tables for storage through table and field mappings; and a component data calling unit, configured to obtain real-time data of designated components by calling the open API of the data center and to store the data by category in the corresponding tables of the data center database, wherein a designated component is a component through which full data collection and incremental data synchronization into the data center are performed;
a standardization module, which processes the collected data into unified data model tables according to the actual monitoring requirements of the data center and the specified data model design specification, and stores them in standardized form;
a data analysis module, which performs association analysis on the collected data and/or the standardized data according to the actual monitoring requirements of the data center, forms the processing results into detail tables and result summary tables stored in the data center database, constructs the lineage links among tables, tasks, and applications, and raises alarms for abnormal conditions;
and an application display module, which visually displays the results generated by the data analysis, grouped by functional module.
It should be understood that the data panoramic monitoring system based on the data center in the embodiment of the present invention may implement all the technical solutions in the above method embodiments, and the functions of each functional module may be specifically implemented according to the methods in the above method embodiments, and the specific implementation process may refer to the relevant descriptions in the above embodiments, which are not repeated herein.
Compared with existing data monitoring methods and systems, the data panoramic monitoring method and system based on the data center provide a comprehensive view of data asset storage, scheduled task execution, component operation, service calls, and application support in the data center. Abnormally running tasks and components can be found in time and problems located and resolved quickly, which raises the operation and maintenance level of the data center and improves the accuracy, timeliness, and integrity of its data, and thereby the practical value of the data. The method also helps data center operation and maintenance staff grasp the scope of an anomaly's impact in time, so that business risks are avoided, the business support capability of the data center is improved, and the company's digital transformation is supported.

Claims (10)

1. A data panoramic monitoring method based on a data center, characterized by comprising the following steps:
collecting the basic data required for data center monitoring by periodically extracting metadata bin data and calling component interface data, wherein extracting metadata bin data comprises: setting up the data source connection and destination connection in the data integration component of the data center, and extracting the open metadata bin data of the data center into source-layer base tables for storage through table and field mappings; and calling component interface data comprises: obtaining real-time data of designated components by calling the open API of the data center, and storing the data by category in the corresponding tables of the data center database, wherein a designated component is a component through which full data collection and incremental data synchronization into the data center are performed;
processing the collected data into unified data model tables according to the actual monitoring requirements of the data center and the specified data model design specification, and storing them in standardized form;
performing association analysis on the collected data and/or the standardized data according to the actual monitoring requirements of the data center, forming the processing results into detail tables and result summary tables stored in the data center database, constructing the lineage (blood-relationship) links among tables, tasks, and applications, and raising alarms for abnormal conditions;
and visually displaying the results generated by the data analysis, grouped by functional module.
2. The method of claim 1, wherein the open metadata bin data of the data center comprise asset metadata, task metadata, application metadata, and API service metadata, wherein
the extraction of asset metadata bin data comprises: according to the actual business requirements of data center asset monitoring, scheduling the asset metadata bin data collection task to execute on the hour, and, by obtaining the asset metadata bin data, organizing and recording the table name, table ID, table capacity, number of tables, number of fields, table type, project space, creation time, and modification time of the asset data;
the extraction of task metadata bin data comprises: according to the actual running condition and monitoring requirements of data center tasks, scheduling the task metadata bin data collection task to execute on the hour, and, by obtaining the task metadata bin data, organizing and recording the task name, task ID, task state, job creation time, task execution time, last update time, number of records read, and number of records written for data integration tasks;
the extraction of task metadata bin data further comprises: according to the actual running condition and monitoring requirements of data center tasks, scheduling the task metadata bin data collection task to execute on the hour, and, by obtaining the task metadata bin data, organizing and recording the task name, task ID, task state, node name, workflow state, job type, start time, end time, and scheduling type for data development tasks;
the extraction of application metadata bin data comprises: according to the actual business scenarios and monitoring requirements of data center applications, scheduling the application metadata bin data collection task to execute on the hour, and, by obtaining the application scenario metadata bin data, organizing and recording the application name, application description, application creator, workspace name, application type, registration time, and last update time of the application data;
the extraction of API service metadata bin data comprises: according to the actual release and call conditions and monitoring requirements of data center API services, scheduling the API service metadata bin data collection task to execute on the hour, and, by obtaining the API service metadata bin data, organizing and recording the API name, API description, API path, API type, API protocol, request type, creation time, and update time of the API services.
3. The method of claim 1, wherein the API of each designated component is called by a Java program, the designated components comprising CDM, DLF, DRS, and MRS, wherein
the calling of CDM component interface data comprises: according to the usage of the CDM component and the monitoring requirements, setting a Java scheduled task to minute-level frequency, obtaining the related data by calling the CDM API, organizing the data, and recording the project space name, cluster name, task state, and last run time of the CDM component;
the calling of DLF component interface data comprises: according to the usage of the DLF component and the monitoring requirements, setting a Java scheduled task to minute-level frequency, obtaining the related data by calling the DLF API, organizing the data, and recording the project space name, task name, node name, task state, task type, node type, and last run time of the DLF component;
the calling of DRS component interface data comprises: according to the usage of the DRS component and the monitoring requirements, setting a Java scheduled task to minute-level frequency, obtaining the related data by calling the DRS API, organizing the data, and recording the project space name, cluster name, task state, task type, and last run time of the DRS component;
the calling of MRS component interface data comprises: according to the usage of the MRS component and the actual monitoring requirements, setting a Java scheduled task to minute-level frequency, obtaining the related data by calling the MRS API, organizing the data, and recording the CPU utilization, memory utilization, disk utilization, IO read rate, and IO write rate of the MRS component.
4. The method of claim 1, wherein the frequency of metadata bin data extraction and of component interface data calls is set in the range from once every 5 minutes to once per day.
5. The method of claim 1, wherein processing the collected data into unified data model tables for standardized storage comprises classifying the collected data into the following data model domains: a data acquisition domain, a data storage domain, a data processing domain, a data application domain, and a data guarantee domain, wherein
for the data acquisition domain, the data integration job metadata bin data tables are logically processed according to the purpose classification of the metadata bin data to form the data acquisition domain model tables;
for the data storage domain, the table information and field information metadata bin data tables are logically processed according to the purpose classification of the metadata bin data to form the data storage domain model tables;
for the data processing domain, the data development task and data development job metadata bin data tables are logically processed according to the purpose classification of the metadata bin data to form the data processing domain model tables;
for the data application domain, the data service and data application metadata bin data tables are logically processed according to the purpose classification of the metadata bin data to form the data application domain model tables;
and for the data guarantee domain, the metadata bin data tables of the designated components are logically processed according to the purpose classification of the metadata bin data to form the respective data guarantee domain model tables.
6. The method of claim 1, wherein performing association analysis on the collected data and/or the standardized data comprises:
performing association analysis on the collected and/or standardized asset data according to the data center asset monitoring requirements, to form summary results comprising the total number of connected systems, the total number of tables, the total table capacity, the total number of table fields, the number of empty tables, and the total number of tables and total capacity at each level, together with detail records for each type of asset data;
performing association analysis on the collected and/or standardized task data according to the data center task monitoring requirements, to form summary results such as the total number of tasks, the number of successful tasks, the number of failed tasks, the number of tasks in execution, and the number of unsuccessful tasks, together with detail records for each type of task data;
performing association analysis on the collected and/or standardized application data according to the data center application monitoring requirements, to form summary results such as the total number of applications, the number of successful applications, the number of failed applications, the number of primary applications, and the number of secondary applications, together with detail records for each type of application data;
performing association analysis on the collected designated component data according to the data center component monitoring requirements, to form statistics and trend analysis of key resource usage comprising CPU utilization, memory utilization, disk utilization, IO read rate, and IO write rate;
and performing association analysis on the collected and/or standardized service data according to the data center service monitoring requirements, to form summary results comprising the total number of calls, monthly number of calls, daily call success rate, daily call latency, total number of releases, monthly number of releases, daily number of releases, and daily release rate, together with detail records for each type of application data.
7. The method of claim 1, wherein constructing the lineage links among tables, tasks, and applications comprises:
based on the collected and standardized task metadata bin data, parsing the SQL scripts contained in each task with a Java program to obtain the upstream and downstream table relationships in each task node and the corresponding table schema names, and, taking a source table as the target table of other tasks and a target table as the source table of other tasks, tracing the relationships backward and forward respectively to form the lineage links between tables and tasks;
and, according to the associations between tasks and applications and between tables and applications, and on the basis of the lineage between tables and tasks, matching key fields such as project space, schema name, table name, and task name to obtain the complete lineage links among tables, tasks, and applications.
8. The method of claim 7, wherein raising alarms for abnormal conditions comprises asset anomaly alarms, task running early warnings, and task anomaly alarms, wherein:
the asset anomaly alarm means that, according to the data center asset monitoring requirements, thresholds are set on the standardized asset model data for the maintenance status of table comments and for empty tables, and when data center data exceed a threshold, the relevant responsible persons are promptly notified of the anomaly;
the task running early warning means that, according to the data center task monitoring requirements, an expected end time is set for key running tasks based on the collected real-time task running data and the standardized task model data, and when a task has not completed by the expected time, the relevant responsible persons are notified of the anomaly to investigate, so that business data analysis is not affected;
the task anomaly alarm means that, according to the data center lineage monitoring requirements, the states of the tasks on each lineage link and of their associated tables and applications are updated in real time from the collected real-time task state data, abnormal lineage links are displayed distinctly, and the abnormal task information and the affected applications are sent to the relevant responsible persons.
9. The method of claim 1, wherein the main functional modules of the application display part include: an asset monitoring module, a task monitoring module, a component monitoring module, an application monitoring module, and a service monitoring module; the asset monitoring module includes asset overview, asset details, table details, empty-table details, and table lineage sub-modules; the task monitoring module includes task overview, task details, and task lineage sub-modules; the component monitoring module includes CDM component, MRS component, DLF component, and DRS component sub-modules; the application monitoring module includes application overview, application details, and application lineage sub-modules; the service monitoring module includes service overview, service release details, and service call details sub-modules.
10. A data panoramic monitoring system based on a data center, comprising:
a data extraction module, which collects the basic data required for data center monitoring by periodically extracting metadata bin data and calling component interface data, and which comprises: a metadata extraction unit, configured to set up the data source connection and destination connection in the data integration component of the data center and to extract the open metadata bin data of the data center into source-layer base tables for storage through table and field mappings; and a component data calling unit, configured to obtain real-time data of designated components by calling the open API of the data center and to store the data by category in the corresponding tables of the data center database, wherein a designated component is a component through which full data collection and incremental data synchronization into the data center are performed;
a standardization module, which processes the collected data into unified data model tables according to the actual monitoring requirements of the data center and the specified data model design specification, and stores them in standardized form;
a data analysis module, which performs association analysis on the collected data and/or the standardized data according to the actual monitoring requirements of the data center, forms the processing results into detail tables and result summary tables stored in the data center database, constructs the lineage links among tables, tasks, and applications, and raises alarms for abnormal conditions;
and an application display module, which visually displays the results generated by the data analysis, grouped by functional module.
CN202211580140.1A 2022-12-09 2022-12-09 Data panoramic monitoring method and system based on data center Pending CN116205396A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211580140.1A CN116205396A (en) 2022-12-09 2022-12-09 Data panoramic monitoring method and system based on data center

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211580140.1A CN116205396A (en) 2022-12-09 2022-12-09 Data panoramic monitoring method and system based on data center

Publications (1)

Publication Number Publication Date
CN116205396A true CN116205396A (en) 2023-06-02

Family

ID=86518176

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211580140.1A Pending CN116205396A (en) 2022-12-09 2022-12-09 Data panoramic monitoring method and system based on data center

Country Status (1)

Country Link
CN (1) CN116205396A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116820892A (en) * 2023-07-14 2023-09-29 佛山众陶联供应链服务有限公司 Data processing monitoring system for a plurality of bins
CN116910130A (en) * 2023-09-08 2023-10-20 中国长江电力股份有限公司 Construction method of industrial data middle platform frame for hydropower equipment
CN117056172A (en) * 2023-10-12 2023-11-14 江苏鑫业智慧技术有限公司 Data integration method and system for system integration middle station

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116820892A (en) * 2023-07-14 2023-09-29 佛山众陶联供应链服务有限公司 Data processing monitoring system for a plurality of bins
CN116910130A (en) * 2023-09-08 2023-10-20 中国长江电力股份有限公司 Construction method of industrial data middle platform frame for hydropower equipment
CN116910130B (en) * 2023-09-08 2023-12-26 中国长江电力股份有限公司 Construction method of industrial data middle platform frame for hydropower equipment
CN117056172A (en) * 2023-10-12 2023-11-14 江苏鑫业智慧技术有限公司 Data integration method and system for system integration middle station
CN117056172B (en) * 2023-10-12 2023-12-19 江苏鑫业智慧技术有限公司 Data integration method and system for system integration middle station

Similar Documents

Publication Publication Date Title
CN110765337B (en) Service providing method based on internet big data
CN107886238B (en) Business process management system and method based on mass data analysis
CN116205396A (en) Data panoramic monitoring method and system based on data center
CN107315776B (en) Data management system based on cloud computing
US8671084B2 (en) Updating a data warehouse schema based on changes in an observation model
CN112396404A (en) Data center system
CN104036365A (en) Method for constructing enterprise-level data service platform
CN111984709A (en) Visual big data middle station-resource calling and algorithm
CN112527774A (en) Data center building method and system and storage medium
CN113468159A (en) Data application full-link management and control method and system
CN114880405A (en) Data lake-based data processing method and system
CN112883001A (en) Data processing method, device and medium based on marketing and distribution through data visualization platform
CN106777265B (en) Service data processing method and device
CN116662441A (en) Distributed data blood margin construction and display method
CN109522349B (en) Cross-type data calculation and sharing method, system and equipment
CN113986656B (en) Power grid data safety monitoring system based on data center platform
CN115936394A (en) Business event management system and equipment for industrial system
CN115658658A (en) Batch processing-based data sharing method and device for enterprise data middleboxes and storage medium
CN115664785A (en) Big data platform data desensitization system
CN114840519A (en) Data labeling method, equipment and storage medium
CN109033116B (en) Information data reflux system and method based on data ancestry
CN113407530A (en) Permission data recovery method, management device and storage medium
CN114925045B (en) PaaS platform for big data integration and management
CN112825165A (en) Project quality management method and device
CN111435466A (en) Integrated machine room operation and maintenance management system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination