CN117435577A - Big data supervision method - Google Patents

Big data supervision method

Info

Publication number
CN117435577A
CN117435577A (application CN202311398844.1A)
Authority
CN
China
Prior art keywords
data
supervision
monitoring
metadata
change
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311398844.1A
Other languages
Chinese (zh)
Inventor
李洪海
林涛
张华荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Datacom Corp ltd
Original Assignee
China Datacom Corp ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Datacom Corp ltd filed Critical China Datacom Corp ltd
Priority to CN202311398844.1A priority Critical patent/CN117435577A/en
Publication of CN117435577A publication Critical patent/CN117435577A/en
Pending legal-status Critical Current

Classifications

    • G06F16/21 — Design, administration or maintenance of databases
    • G06F16/2365 — Ensuring data consistency and integrity
    • G06F16/252 — Integrating or interfacing between a database management system and a front-end application
    • G06Q10/103 — Workflow collaboration or project management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computer Security & Cryptography (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a big data supervision method comprising the following steps: acquiring the target data to be supervised; performing governance supervision on the target data, including periodically auditing the metadata and monitoring metadata changes, and periodically auditing the master data and monitoring master data changes; supervising data quality, including audit checks on the data content and metadata, and governance supervision of BIM data; supervising the availability of data services, including monitoring service availability in real time with a monitoring tool and notifying the relevant personnel when a data service becomes unavailable; and visually displaying the governance supervision content of the target data using a three-screen linkage technique. The invention enables data visualization together with real-time supervision and response.

Description

Big data supervision method
Technical Field
The invention relates to the technical fields of data security and cloud computing, in particular to a big data supervision method.
Background
The technical background of big data supervision arises from the combined effects of the explosive growth of data, distributed computing and storage, cloud computing, machine learning, data privacy compliance, data security technology, and data management tooling. In this context, big data supervision aims to supervise, manage, and protect large-scale data so as to ensure its legal compliance, safety, and reliability, and to address data quality, privacy, and security challenges with advanced techniques and policies.
Although big data supervision methods and techniques continue to develop, several drawbacks remain that can negatively affect data governance and data security. Specifically, the prior art has the following disadvantages:
1. Privacy problems: as data volumes grow, data privacy becomes more complex. Despite privacy protection regulations, privacy remains vulnerable, especially when individuals can be re-identified from anonymized data. Current privacy techniques may not be sufficient to address this challenge effectively.
2. Complexity and fragmentation: big data supervision spans multiple data sources, systems, and organizations, which leads to complexity and fragmentation. The lack of uniform regulatory standards and methodologies causes difficulties and inconsistencies in data governance.
3. Data quality problems: big data frequently suffers from incomplete, inaccurate, and duplicate records. This can lead supervisory authorities to erroneous decisions, and correcting these problems can be very difficult.
4. Security vulnerabilities: large-scale data storage and processing systems are attractive attack targets, and data leakage and security vulnerabilities are a persistent threat. Despite various security measures, hard-to-discover vulnerabilities remain.
5. Data governance challenges: data governance involves data classification, data documentation, access control, and related tasks, all of which become more complex in big data environments. Ensuring the compliance and quality of data requires more automation and intelligent support.
6. Lack of supervisory skills: supervisory authorities and organizations may lack sufficient skills and expertise, including expertise in data science, data security, and compliance, to manage big data effectively.
7. Supervisory lag: supervisory authorities often need time to adapt to new technologies and new threats. As a result, supervision lags behind technical development and may fail to address new challenges in a timely manner.
Disclosure of Invention
The object of the present invention is to solve at least one of the above technical drawbacks.
To that end, the invention provides a big data supervision method that realizes data visualization and real-time supervision and response.
In order to achieve the above object, an embodiment of the present invention provides a big data supervision method, including the following steps:
step S1, acquiring monitored target data;
step S2, performing governance supervision on the target data, which comprises:
periodically auditing the metadata and monitoring metadata changes;
periodically auditing the master data and monitoring master data changes;
supervising data quality, including: performing audit checks on the data content and metadata; performing governance supervision on BIM data;
supervising the availability of data services, including: monitoring the availability of the data services in real time with a monitoring tool, and notifying the relevant personnel when a data service is unavailable;
and step S3, visually displaying the governance supervision content of the target data using a three-screen linkage technique.
Further, in step S1, the supervised target data is acquired in the following ways:
(1) Establishing a data access channel, acquiring data from a plurality of data sources through the channel, and extracting, transforming, and loading the data to obtain the target data;
(2) Acquiring real-time data through an application programming interface (API) or a data stream;
(3) Periodically importing data in batches through a data import and batch processing mechanism.
Further, in said step S2,
the periodic audit of the metadata comprises the following steps:
(1) Creating an audit plan, and setting audit frequency and target according to audit requirements;
(2) Making audit standards, and setting standards and quality requirements of metadata;
(3) Comparing the metadata with the audit standard via an automation script;
(4) Generating an audit report of metadata, and recording discovered problems and inconsistencies;
the monitoring of metadata changes includes:
(1) Storing the metadata in a version-controlled database to record each change to the metadata;
(2) Forming a log of metadata changes;
(3) Establishing an approval process for metadata changes;
(4) Monitoring metadata changes with an automation script, and triggering an alarm or generating a change report when the metadata changes;
(5) Notifying the relevant personnel promptly through the alarm and notification function when a metadata change is detected.
Further, in said step S2,
the periodic audit of the master data comprises the following steps:
(1) Performing quality assessment on the master data through an automated script, including checking the accuracy, integrity, consistency, and timeliness of the data;
(2) Comparing the master data with the data standards and business rules to determine whether inconsistent, erroneous, or missing data exists;
(3) Comparing the master data with applicable regulations, policies, and industry standards, and checking whether any data fails to meet compliance requirements;
(4) Generating a master data audit report;
the monitoring of master data changes includes:
(1) Storing the master data in a version-controlled database and recording each change to the master data;
(2) Setting up an approval flow to ensure that each master data change is reviewed and approved by the corresponding responsible person;
(3) Recording a master data change log;
(4) Setting up a monitoring mechanism, namely configuring the monitoring scope, frequency, time, and alarm conditions through a custom monitoring settings panel;
(5) Checking, through an automation script, the master data change records for compliance and for unauthorized changes;
(6) Notifying the relevant personnel promptly through the alarm and notification function when a master data change is detected.
Further, the quality assessment comprises: (1) data accuracy check: checking whether the data meets the accuracy standard through configured conditional statements and computing the accuracy index of the data;
(2) Data integrity check: checking, through a configured script, whether each record in the data set contains all required fields, whether any field contains null or invalid values, and whether the data types are correct;
(3) Data consistency check: checking, through a configured script, whether the description of the data matches the actual phenomenon or situation, and judging, by comparing related data series, whether two data values fall within an acceptable error range;
(4) Data timeliness check: for time-sensitive data in the data set, checking through a configured script whether the update time of the data meets the requirement and whether the date field falls within the specified time period.
Further, in step S2, performing audit checks on the data content and metadata includes:
(1) Configuring and managing the attributes of the objects to be checked;
(2) Configuring auditing rules, which are either custom-configured rule parameters or checking rules for each dimension of data quality generated automatically by the system;
(3) Configuring a policy, creating a policy task, and executing the configured policy once configuration is complete;
(4) Configuring a data quality evaluation system according to the service requirements;
(5) Evaluating the data quality to identify existing data quality problems, and performing cause analysis on the identified problems;
(6) Automatically generating a data quality analysis report for the run result of each analysis task according to the type and number of quality problems.
Further, in step S2, performing governance supervision on the BIM data includes:
(1) Establishing BIM data standards and specifications;
(2) Using BIM data quality inspection tools to perform quality inspection on the BIM model and the data, and inspecting the integrity, accuracy and consistency of the data;
(3) Using a BIM data security checking tool to check the security of BIM data;
(4) Providing correction advice for the discovered problems associated with the BIM data;
(5) Automatically recording the BIM data processing supervision process;
(6) And generating a BIM data processing supervision report.
Further, in step S2, supervising the availability of the data services includes:
(1) Defining monitoring indicators for data service availability;
(2) Setting up a supervision dashboard and defining availability thresholds, which are determined according to business requirements;
(3) Configuring alarm rules so that an alarm is triggered when an indicator of a data service crosses a preset availability threshold;
(4) Setting up notification channels so that the relevant personnel are notified when an alarm is triggered;
(5) Using the data analysis function of the monitoring tool to identify trends and the causes of problems, and generating a monitoring report.
Further, the data analysis function of the monitoring tool identifies trends and the causes of problems as follows:
the monitoring tool analyzes the target data with built-in analysis scripts, performing trend analysis, anomaly detection, statistical analysis, and pattern recognition by tracking data changes, querying logs, or analyzing performance data, so as to determine and locate the root causes of existing problems, and outputs a monitoring report after the analysis is complete.
Further, in step S3, visually displaying the governance supervision content of the target data using the three-screen linkage technique includes:
(1) Building an integrated digital platform and connecting its three-display visual terminals with the data supervision platform to realize three-screen linked visual display;
(2) Creating a large-screen data supervision topic presentation, whose content comprises: a data supervision overview, data governance supervision visualization, data security supervision visualization, and integrated digital platform supervision visualization;
(3) Creating a medium-screen data supervision indicator presentation, whose content comprises: a data supervision indicator overview, data governance supervision indicators, and data security supervision indicators;
(4) Creating a small-screen data supervision dashboard presentation, whose content comprises: data supervision items, a data governance supervision dashboard, a data security supervision dashboard, and an integrated digital platform supervision dashboard.
The big data supervision method provided by the embodiments of the invention has the following beneficial effects:
1. Data visualization: powerful data visualization tools and dashboards are provided that enable users to grasp data states and trends intuitively through charts and graphics.
2. Real-time supervision and response: real-time supervision and response are supported, so that problems can be detected and acted upon immediately rather than only through periodic batch supervision.
3. Stronger data integration capability: data pipelines and a data lake are constructed to store and manage the various data sources centrally. The data lake makes it easier to integrate multiple data sources, including structured and unstructured data, enabling more comprehensive data governance and management.
4. Intelligence and automation: each workflow is made intelligent and automatic through automation scripts with configurable parameters, which can automatically identify and resolve common problems and reduce the need for manual intervention.
5. Stronger performance and scalability: a distributed architecture spreads the components and modules across multiple servers or nodes, greatly improving horizontal scalability and availability. A load balancer balances the workload across servers, while database performance is optimized through query optimization, index optimization, partitioned tables, data compression, and so on, improving data retrieval and storage efficiency and allowing larger data volumes and more complex workloads to be handled.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the invention will become apparent and may be better understood from the following description of embodiments taken in conjunction with the accompanying drawings in which:
FIG. 1 is a flow chart of a big data supervision method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a big data supervision method according to an embodiment of the invention;
fig. 3 is a schematic diagram of a big data supervision method according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
The invention provides a big data supervision method that conforms to the technical specifications and system architecture requirements of a hall-wide integrated digital platform. The overall architecture is "five horizontal, three vertical": the five horizontal layers are, from top to bottom, the user interaction layer, the application layer, the support layer, the data layer, and the infrastructure layer; the three vertical systems are the standards and specification system, the security assurance system, and the operation management system, as shown in figure 2.
The infrastructure layer includes: government cloud, government external network and virtual private network.
The data layer comprises: the integrated data platform database, the supervision tool and security service tool databases, and the filled-in supervision data.
The support layer includes: intelligent service gateway, unified identity authentication, visual support (large, medium and small screen visual adaptation), intelligent analysis support (model analysis, data mining).
The application layer comprises: data supervision auxiliary management and data supervision work effect display. Data supervision auxiliary management includes: data governance supervision, data security supervision, integrated digital platform supervision, data analysis and result display, data supervision indicator system management, data supervision intelligent analysis, data supervision result display, data system management, role management, user management, authority management, and supervision information filling. Data supervision work effect display includes: large-screen data supervision topics (data supervision overview, data governance supervision visualization, data security supervision visualization, integrated platform supervision visualization), medium-screen data supervision topics (data supervision overview, data governance supervision indicators, data security supervision indicators, integrated platform supervision indicators), and small-screen data supervision topics (data supervision items, data governance supervision dashboards, data security supervision dashboards, integrated platform supervision dashboards).
The user interaction layer comprises: hall leadership (integrated platform three-screen linked visualization) and the supervision team (supervision PC applications, integrated platform three-screen linked visualization).
As shown in fig. 1 and fig. 3, the big data supervision method according to the embodiment of the present invention includes the following steps:
step S1, acquiring the supervised target data.
Specifically, the step may acquire the monitored target data in the following manner:
(1) And establishing a data access channel, acquiring data from a plurality of data sources through the data access channel, and extracting, converting and loading the data to obtain target data.
And establishing a data access channel, and acquiring data from different data sources, including a database, a data warehouse, a data lake, an application program log, an external data source and the like. The data access channel uses ETL tools to effect extraction, conversion and loading of data.
Extracting: the required data is first extracted from various data sources. The extracted data is formatted in a predetermined manner to meet the requirements of the subsequent conversion and loading process.
Conversion: then, based on the extracted data, necessary conversion operations are performed. Including data cleansing, data normalization, data mapping, data merging, etc., to ensure accuracy and consistency of the data for use in the data warehouse.
Loading: and finally, loading the converted data into a target data warehouse.
(2) The real-time data is acquired using an application program interface API or a data stream.
That is, for the case of real-time data, the platform uses an API (application program interface) or a data stream to acquire the data.
(3) And data is imported in batches periodically by adopting a data importing and batch processing mode.
In the invention, data import and batch processing are provided, and data can be periodically imported into a supervision platform in batch through file uploading and FTP transmission.
Step S2, performing governance supervision on the target data.
1. The method for periodically auditing the metadata comprises the following steps:
(1) An audit plan is created, and audit frequency and targets are set according to audit requirements.
(2) Audit criteria are formulated, metadata criteria and quality requirements are set, including field naming rules, data types, data quality criteria, and the like.
(3) The metadata is compared with the predefined standards by an automation script, checking for content that does not meet the standards, such as misspelled field names and inconsistent data types.
(4) A detailed metadata audit report is generated, recording the discovered problems and inconsistencies. The report includes a description of the problem, location, impact, and suggested solutions.
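An automation script of the kind described in steps (2)–(4) might compare metadata against predefined standards and collect an audit report. A minimal sketch follows; the naming rule and the allowed type set are assumed examples, not the standard defined by the method.

```python
import re

# Hypothetical audit standard: field names must be lower_snake_case and
# data types must come from an allowed set.
NAME_RULE = re.compile(r"^[a-z][a-z0-9_]*$")
ALLOWED_TYPES = {"string", "integer", "float", "date"}

def audit_metadata(fields):
    """Compare each metadata field against the standard and record problems."""
    problems = []
    for field in fields:
        if not NAME_RULE.match(field["name"]):
            problems.append({"field": field["name"],
                             "issue": "field name violates naming rule"})
        if field["type"] not in ALLOWED_TYPES:
            problems.append({"field": field["name"],
                             "issue": f"unsupported data type {field['type']!r}"})
    return {"checked": len(fields), "problems": problems}

report = audit_metadata([
    {"name": "user_id", "type": "integer"},
    {"name": "UserName", "type": "string"},   # violates naming rule
    {"name": "created", "type": "datetime"},  # inconsistent data type
])
```

The returned structure maps directly onto the audit report of step (4): each problem records the affected field and a description of the inconsistency.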
Monitoring for changes in metadata, comprising the steps of:
(1) Version control is performed: the metadata is stored in a version control database to record each change to the metadata. Recording content includes who, when, why, and how the metadata was changed.
(2) A log of metadata changes is formed, including information of the type of change, the timestamp, the user performing the change, and the like.
(3) And establishing an approval process of the metadata change, and ensuring that each change is approved and approved. The approval process includes multiple stages, adjusted according to the type and importance of the change.
(4) Changes to metadata are monitored using an automation script. An alarm is triggered or a change report is generated when metadata changes.
(5) By means of the alarm and notification function, the relevant person is notified immediately when the metadata is changed. For example, by email, text message, or other communication means.
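The version-control and alerting steps above could be sketched as follows. The notification channel is simulated by collecting messages in a list standing in for email or SMS, and all class and field names are illustrative assumptions.

```python
class MetadataStore:
    """Version-controlled metadata store: every change is recorded with
    who/when/why, and an alert is emitted whenever metadata changes."""

    def __init__(self, notify):
        self.versions = []    # full history of metadata snapshots
        self.log = []         # change log: who, when, why
        self.notify = notify  # callback standing in for email/SMS alerts

    def change(self, metadata, user, timestamp, reason):
        self.versions.append(metadata)
        self.log.append({"user": user, "time": timestamp, "reason": reason,
                         "version": len(self.versions)})
        self.notify(f"metadata changed by {user}: {reason}")

alerts = []
store = MetadataStore(alerts.append)
store.change({"field": "user_id", "type": "integer"},
             user="admin", timestamp="2023-10-25T10:00", reason="initial load")
store.change({"field": "user_id", "type": "string"},
             user="etl_job", timestamp="2023-10-26T09:30", reason="type fix")
```

An approval workflow, as in step (3), would gate the `change` call behind a review; it is omitted here to keep the sketch short.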
2. Periodically auditing the master data and monitoring master data changes. Auditing the master data refers to periodic inspection and evaluation of the master data to ensure its quality, accuracy, and compliance. Master data change monitoring refers to tracking and recording modifications, additions, and deletions to the master data to ensure that changes are legitimate and compliant and can be traced back to their source.
The periodic audit of the master data comprises the following steps:
(1) Quality evaluation of the master data is performed through automated scripts, covering the accuracy, integrity, consistency, and timeliness of the data.
1) Data accuracy check: configured conditional statements check whether the data meets specific accuracy criteria, and configured functions compute the accuracy index of the data.
2) Data integrity check: a configured script checks whether each record in the data set contains all required fields, whether any field contains null or invalid values, and whether the data types are correct.
3) Data consistency check: a configured script checks whether the description of the data matches the actual phenomenon or situation, and comparisons between related data series judge whether two data values fall within an acceptable error range.
4) Data timeliness check: for time-sensitive data in the data set, a configured script checks whether the update time of the data meets the requirement and whether the date field falls within the specified time period.
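The four checks above can be sketched as a single configured script. The field names, value range, error tolerance, and freshness window below are assumptions chosen for the example, not values fixed by the method.

```python
from datetime import date

REQUIRED_FIELDS = {"id", "amount", "updated"}   # hypothetical schema

def quality_report(records, reference=None, tolerance=0.01,
                   newest_allowed=date(2023, 12, 31)):
    """Run accuracy, integrity, consistency, and timeliness checks and
    return a list of (record index, check name) issues."""
    issues = []
    for i, r in enumerate(records):
        # Integrity: all required fields present and no null values.
        if not REQUIRED_FIELDS <= r.keys() or any(v is None for v in r.values()):
            issues.append((i, "integrity"))
            continue
        # Accuracy: conditional statement on the value range.
        if not (0 <= r["amount"] <= 1_000_000):
            issues.append((i, "accuracy"))
        # Consistency: value agrees with a reference series within tolerance.
        if reference is not None and abs(r["amount"] - reference[i]) > tolerance:
            issues.append((i, "consistency"))
        # Timeliness: date field within the accepted period.
        if r["updated"] > newest_allowed:
            issues.append((i, "timeliness"))
    return issues

records = [
    {"id": 1, "amount": 100.0, "updated": date(2023, 10, 1)},   # clean
    {"id": 2, "amount": -5.0, "updated": date(2023, 10, 2)},    # accuracy issue
    {"id": 3, "amount": None, "updated": date(2023, 10, 3)},    # integrity issue
]
issues = quality_report(records, reference=[100.0, -5.0, 0.0])
```

The issue list then feeds the audit report of step (4), where each problem is described together with its location.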
(2) The master data is compared with the data criteria and business rules to determine if there is inconsistent, erroneous, or missing data.
(3) The master data is compared with applicable regulations, policies and industry standards to check whether there is data that does not meet compliance requirements, particularly in the case of sensitive information.
(4) A master data audit report is generated including descriptions, solutions and suggestions of audit problems.
Monitoring changes to the master data comprises the following steps:
(1) Version control is performed: the master data is stored in a version-controlled database, and each change to the master data is recorded, including the time, type, and executor of the change.
(2) An approval flow is set up to ensure that each master data change is reviewed and approved by the corresponding responsible person, and only authorized personnel can perform change operations.
(3) A detailed log of master data changes is recorded, including the purpose, cause, and effect of each change, in order to track the change history.
(4) A monitoring mechanism is set up: the monitoring scope, frequency, time, alarm conditions, and so on are configured through a custom monitoring settings panel.
(5) Master data change records are reviewed by an automation script to check whether each change is compliant and whether any unauthorized changes exist.
(6) Through the alarm and notification function, the relevant personnel are notified promptly when the master data changes.
3. Supervising data quality, specifically performing audit checks on the data content and metadata, including check object configuration, auditing rule configuration, policy configuration and execution, evaluation configuration and execution, an evaluation engine, quality analysis, and quality reporting.
(1) Check object attribute configuration management
The check object configuration function configures and manages the attributes of the objects to be checked, and supports setting and managing field-level data quality information such as whether the primary key of the checked data is non-null, format specifications, and length limits.
(2) Auditing rule configuration
The auditing rule information is configured either as custom rule parameters or as checking rules for each dimension of data quality generated automatically by the system, with functions for rule checking, import, export, and analysis.
(3) Policy configuration and enforcement
The user may configure the policy object and related execution information, such as detailed parameters, the data verification source, and the execution time, based on the rules, thereby creating a policy task. After configuration is complete, the configured policy is executed; it can be restarted or forcibly terminated, and policy state details such as succeeded, failed, pending, and executed are displayed.
(4) Evaluation configuration and execution
Through flexible online configuration of evaluation indicators, the content of the indicators is customized according to service requirements, enriching and refining the data quality evaluation system and effectively supporting multi-dimensional evaluation and analysis of data quality.
(5) Data quality evaluation analysis
The data quality is evaluated with the configured evaluation method, identifying data quality problems including errors, omissions, and inconsistencies. Cause analysis is then performed on the identified problems, determining their origin by examining the data source and the data acquisition, transmission, and input processes.
(6) Data quality analysis report
The data quality report presents the results of spot-check data quality analysis. For the run result of each analysis task, a data quality analysis report is generated automatically according to the type and number of quality problems.
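Generating the analysis report from the type and number of quality problems might look like the following sketch; the problem categories mirror the checks described earlier, while the task name and plain-text report format are assumptions for illustration.

```python
from collections import Counter

def quality_analysis_report(task_name, problems):
    """Summarize a task's quality problems by type and count and
    render a short text report."""
    counts = Counter(kind for _, kind in problems)
    lines = [f"Data quality report for task {task_name!r}",
             f"total problems: {len(problems)}"]
    for kind, n in sorted(counts.items()):
        lines.append(f"  {kind}: {n}")
    return "\n".join(lines)

# Problems as (record index, problem type) pairs from an analysis task run.
report = quality_analysis_report(
    "daily_audit",
    [(1, "accuracy"), (2, "integrity"), (5, "integrity")],
)
```

A production report would add descriptions, locations, and suggested fixes per problem, as the audit steps describe.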
4. Processing and supervising BIM data
(1) BIM data standards and specifications, including model standards, data formats, naming conventions, and data quality standards, are formulated to ensure consistency and comparability.
(2) A BIM data quality inspection tool is used to check the quality of the BIM model and its data, verifying the integrity, accuracy, and consistency of the data.
(3) A BIM data security checking tool is used to check the security of BIM data, covering security requirements such as access control, data encryption, and data leakage prevention.
(4) Correction suggestions are provided for discovered problems associated with BIM data, such as data quality issues, security vulnerabilities, and compliance problems, for correction by the relevant personnel.
(5) The BIM data processing supervision process is automatically recorded, including the results of data quality inspection, security checks, and compliance checks, and the problem resolution history.
(6) A detailed supervision report is generated summarizing the results of the supervision activities and the resolution of the problem.
5. Data service availability supervision
Supervising the availability of data services includes: using monitoring tools to monitor the availability of data services in real time and notifying the relevant personnel when a data service becomes unavailable.
(1) Monitoring indicators of data service availability are defined. The key metrics to be monitored are determined, including the response time, availability, throughput, and error rate of the data service.
(2) A supervision dashboard is set up and an availability threshold is defined. The threshold may be determined based on business requirements; for example, if an SLA requires the data service to be available 99% of the time, the availability threshold can be set to 99%.
(3) Alarm rules are configured so that an alarm is triggered when an indicator of the data service exceeds its predetermined threshold.
(4) A notification channel is provided to notify the relevant personnel when an alarm is triggered, for example by email or SMS.
(5) Trends and root causes of problems are identified through the data analysis function of the monitoring tool, and a monitoring report is generated to track the performance and availability of the data service.
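Steps (1)–(4) above can be sketched as a simple threshold check; the 99% figure follows the SLA example in step (2), while the metric names and the `notify` callback are assumptions:

```python
# Illustrative availability check against an SLA threshold.
AVAILABILITY_THRESHOLD = 0.99  # e.g. SLA: available 99% of the time
MAX_ERROR_RATE = 0.01

def evaluate_metrics(metrics, notify):
    """Trigger an alarm via `notify` when an indicator crosses its threshold."""
    alarms = []
    if metrics["availability"] < AVAILABILITY_THRESHOLD:
        alarms.append(f"availability {metrics['availability']:.2%} below SLA")
    if metrics["error_rate"] > MAX_ERROR_RATE:
        alarms.append(f"error rate {metrics['error_rate']:.2%} above limit")
    for message in alarms:  # notification channel: email, SMS, ...
        notify(message)
    return alarms
```

In a real deployment `notify` would fan out to the configured email or SMS channels; here it is any callable.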
In an embodiment of the present invention, identifying trends and the causes of problems through the data analysis function of the monitoring tool includes:
the monitoring tool uses built-in analysis scripts to analyze the target data, performing trend analysis, anomaly detection, statistical analysis, and pattern recognition by tracking data changes, querying logs, or analyzing performance data, to help determine existing problems and locate their root causes.
After the monitoring tool completes the data analysis, its built-in reporting tool outputs a monitoring report detailing the various performance indicators, such as response time, throughput, and error rate, presented in chart or table form for ease of understanding. The monitoring report includes a performance indicator trend report, showing through graphs how each indicator changes over time. When a performance problem or abnormal event is analyzed, the report provides detailed information about the problem, including its type, time of occurrence, and possible cause.
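The built-in analysis scripts are not specified in the source; as one illustrative stand-in for the anomaly-detection step, a simple z-score filter over response-time samples:

```python
import statistics

def detect_anomalies(samples, z_threshold=3.0):
    """Flag samples that deviate strongly from the mean.

    A simple statistical stand-in for a monitoring tool's anomaly
    detection; returns (index, value) pairs for flagged samples.
    """
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)  # sample standard deviation
    if stdev == 0:
        return []
    return [(i, x) for i, x in enumerate(samples)
            if abs(x - mean) / stdev > z_threshold]
```

A lower threshold flags more points; production tools typically combine such filters with seasonality models and log correlation.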
Step S3, adopting a three-screen linkage technology to visually display the governance and supervision content of the target data.
(1) Constructing an integrated digital platform
The integrated digital platform is constructed by connecting the three-display visual terminals and the data supervision platform together to realize three-screen linked visual display.
Front-end development uses HTML, CSS, and JavaScript to build the visual interface that displays data and charts on the three-screen terminals. Various types of charts and visualization elements are created using data visualization libraries, including D3.js and Plotly.
Back-end development uses Python or Java to build back-end applications that handle data acquisition and processing and provide APIs. The front-end application communicates and exchanges data with the data supervision platform through a GraphQL API. Asynchronous communication and an event-driven architecture are implemented using a message queuing system (Apache Kafka).
Three-screen linkage technology: WebSocket is used to synchronize the three screens in real time, ensuring that the information displayed on the three screens is consistent; at the same time, a distributed application architecture is adopted so that the different screens can communicate and share data.
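The broadcast logic that keeps the three screens consistent can be simulated in memory as below; a real deployment would push these updates over WebSocket connections, so this hub class is only an illustrative stand-in:

```python
class ScreenSyncHub:
    """In-memory stand-in for the hub that keeps the large, middle,
    and small screens showing consistent supervision data."""

    def __init__(self, screens=("large", "middle", "small")):
        # One state dict per screen, all kept identical by publish().
        self.state = {name: {} for name in screens}

    def publish(self, update):
        """Broadcast an update so every screen shows the same information."""
        for screen_state in self.state.values():
            screen_state.update(update)

    def consistent(self):
        """True if all screens currently display identical state."""
        views = list(self.state.values())
        return all(v == views[0] for v in views)
```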
(2) Creating large screen data supervision service topics
According to the three-screen linkage interface specification, the large screen of the integrated digital platform is connected to realize visual display of the data supervision service topics on the large screen. The displayed content includes: the data supervision overview, the data governance supervision visualization, the data security supervision visualization, and the integrated digital platform supervision visualization.
(3) Creating mid-screen data supervision index displays
According to the three-screen linkage interface specification, the middle screen of the integrated digital platform is connected to display the data supervision indexes on the middle screen. The displayed content includes: the data supervision index overview, the data governance supervision indexes, and the data security supervision indexes.
(4) Creating small screen data supervision service boards
According to the three-screen linkage interface specification, the small screen of the integrated digital platform is connected to realize data supervision and data viewing on the mobile terminal. The displayed content includes: data supervision matters, the data governance supervision board, the data security supervision board, and the integrated digital platform supervision board.
The big data supervision method provided by the embodiment of the invention has the following beneficial effects:
1. Data visualization: powerful data visualization tools and dashboards are provided, enabling users to intuitively understand data states and trends through charts and graphs.
2. Real-time supervision and response: real-time supervision and response are supported, so that problems can be detected and measures taken immediately, rather than only through periodic batch supervision.
3. Stronger data integration capability: data pipelines and a data lake are constructed to centrally store and manage the various data sources. The data lake more easily integrates multiple data sources, including structured and unstructured data, enabling more comprehensive data governance and management.
4. Intelligence and automation: each process is made intelligent and automated through automation scripts with configurable parameters, which can automatically identify and resolve common problems, reducing the need for manual intervention.
5. Stronger performance and extensibility: a distributed architecture is adopted, with components and modules spread across multiple servers or nodes, greatly improving horizontal scalability and availability. The workload is balanced across servers by a load balancer, and database performance is optimized through query optimization, index optimization, partitioned tables, data compression, and the like, improving data retrieval and storage efficiency so that larger-scale data and more complex workloads can be handled.
In the description of the present specification, a description referring to terms "one embodiment," "some embodiments," "examples," "specific examples," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the present invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiments or examples. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present invention have been shown and described above, it will be understood that the above embodiments are illustrative and not to be construed as limiting the invention, and that changes, modifications, substitutions, and variations may be made to the above embodiments by those skilled in the art without departing from the spirit and principles of the invention. The scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A big data supervision method, characterized by comprising the following steps:
step S1, acquiring monitored target data;
step S2, performing treatment supervision on the target data, wherein the step comprises the following steps:
periodically auditing the metadata and monitoring the change of the metadata;
periodically auditing and monitoring the change of the main data;
policing data quality, including: data auditing detection is carried out on the data content and the metadata;
processing and supervising BIM data;
policing availability of data services, including: monitoring the availability of the data service in real time by using a monitoring tool, and notifying relevant personnel when the data service is not available;
and S3, adopting a three-screen linkage technology to visually display the management and supervision contents of the target data.
2. The big data supervision method according to claim 1, wherein in the step S1, the supervised target data is acquired in the following manner:
(1) Establishing a data access channel, acquiring data from a plurality of data sources through the data access channel, and extracting, converting and loading the data to obtain target data;
(2) Acquiring real-time data by adopting an application program interface API or a data stream;
(3) And data is imported in batches periodically by adopting a data importing and batch processing mode.
3. The big data supervision method according to claim 1, wherein in the step S2,
the periodic audit of the metadata comprises the following steps:
(1) Creating an audit plan, and setting audit frequency and target according to audit requirements;
(2) Making audit standards, and setting standards and quality requirements of metadata;
(3) Comparing the metadata with the audit standard via an automation script;
(4) Generating an audit report of metadata, and recording discovered problems and inconsistencies;
the monitoring of the change of the metadata includes:
(1) Storing the metadata in a version control database to record each change of the metadata;
(2) Forming a metadata change log;
(3) Establishing an approval process of metadata change;
(4) Monitoring the change of the metadata by adopting an automatic script, and triggering an alarm or generating a change report when the metadata is changed;
(5) When the metadata is detected to be changed, the related personnel are timely notified through an alarm and notification function.
4. The big data supervision method according to claim 1, wherein in the step S2,
the periodic audit of the main data comprises the following steps:
(1) Performing quality assessment on the main data through an automated script, including checking accuracy, integrity, consistency and timeliness of the data;
(2) Comparing the main data with the data standard and the business rules to determine whether inconsistent, erroneous, or missing data exists;
(3) Comparing the main data with applicable regulations, policies and industry standards, and checking whether data which does not meet compliance requirements exists;
(4) Generating a main data audit report;
the monitoring of the change of the main data includes:
(1) Storing the main data in a version control database, and recording each change of the main data;
(2) Setting an approval implementation flow to ensure that each main data change is approved and approved by a corresponding responsible person;
(3) Recording a main data change log;
(4) Setting a monitoring mechanism, namely setting the monitoring range, monitoring frequency, monitoring time, and alarm conditions through a custom monitoring settings panel;
(5) Checking, by an automation script, the change records of the main data to determine whether each change is compliant and whether any unauthorized change exists;
(6) When the change of the main data is detected, the related personnel are informed in time through an alarm and notification function.
5. The big data supervision method according to claim 4, wherein the quality assessment of the main data comprises:
(1) Checking data accuracy: checking whether the data meets an accuracy standard or not through the configured conditional statement and calculating an accuracy index of the data;
(2) Data integrity check: checking whether each item of data in the data set contains all necessary fields or not through the configured script, checking whether each field in the data set does not contain null value or invalid value or not, and checking whether the data type is correct or not;
(3) Data consistency check: checking whether the description of the data accords with the actual phenomenon or situation through the configured script; meanwhile, judging whether two data values are within an acceptable error range through comparison between related data series;
(4) Data timeliness check: for data with time sensitivity in the data set, checking whether the update time of the data meets the requirement or not and checking whether the date field is in a specific time period or not through the configured script.
6. The big data supervision method according to claim 1, wherein in the step S2, the performing data audit detection on the data content and the metadata includes:
(1) Configuring and managing check object attributes;
(2) Configuring auditing rules, wherein the auditing rules are used for parameter configuration of custom configuration rules or automatically generating checking rules of each dimension of data quality by a system;
(3) Configuring a strategy, creating a strategy task, and performing configuration execution on the set strategy after configuration is completed;
(4) Configuring a data quality evaluation system according to service requirements;
(5) Evaluating the data quality to identify the existing data quality problem, and performing cause analysis on the identified data quality problem;
(6) And automatically generating a data quality analysis report according to the type and the number of the quality problems aiming at the operation result of each analysis task.
7. The big data supervision method according to claim 1, wherein in the step S2, the performing processing supervision on BIM data includes:
(1) Establishing BIM data standards and specifications;
(2) Using BIM data quality inspection tools to perform quality inspection on the BIM model and the data, and inspecting the integrity, accuracy and consistency of the data;
(3) Using a BIM data security checking tool to check the security of BIM data;
(4) Providing correction advice for the discovered problems associated with the BIM data;
(5) Automatically recording the BIM data processing supervision process;
(6) And generating a BIM data processing supervision report.
8. The big data supervision method according to claim 1, wherein in the step S2, the supervising the availability of the data service includes:
(1) Defining a monitoring index of the availability of the data service;
(2) Setting a supervision dashboard and defining an availability threshold, the threshold being determined according to business requirements;
(3) Configuring an alarm rule, and triggering an alarm when the index of the data service exceeds a preset availability threshold;
(4) Setting a notification channel to notify relevant personnel when an alarm is triggered;
(5) The data analysis function of the monitoring tool is used for identifying the trend and the problem generation reason and generating a monitoring report.
9. The big data supervision method according to claim 8, wherein the data analysis function through the monitoring tool identifies a trend and a cause of the problem, comprising:
the monitoring tool analyzes target data by adopting built-in analysis scripts, performs trend analysis, anomaly detection, statistical analysis and pattern recognition by tracking data change, inquiring logs or analyzing performance data so as to determine the root cause of existing problems and positioning problems, and outputs a monitoring report after completing data analysis.
10. The big data supervision method according to claim 1, wherein in the step S3, the three-screen linkage technology is adopted to visually display the administration supervision content of the target data, and the method comprises the following steps:
(1) An integrated digital platform is built, and the integrated digital platform three-display visual terminal and the data supervision platform are connected together to realize three-screen linkage visual display;
(2) Creating a large screen data supervision service topic visual presentation, wherein presenting content comprises: data supervision overview, data administration supervision visualization, data security supervision visualization, integrated digital platform supervision visualization;
(3) Creating a mid-screen data supervision index presentation, wherein the presentation content comprises: a data supervision index overview, a data administration supervision index and a data security supervision index;
(4) Creating a small screen data supervision service sign presentation, wherein the presentation content comprises: data supervision matters, a data management supervision board, a data safety supervision board and an integrated digital platform supervision board.
CN202311398844.1A 2023-10-26 2023-10-26 Big data supervision method Pending CN117435577A (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202311398844.1A | 2023-10-26 | 2023-10-26 | Big data supervision method

Publications (1)

Publication Number | Publication Date
CN117435577A | 2024-01-23

Family ID: 89554740

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination