CN117785980A - Online management and analysis system and method based on block chain public chain data - Google Patents


Info

Publication number
CN117785980A
CN117785980A CN202410013630.6A CN202410013630A CN117785980A CN 117785980 A CN117785980 A CN 117785980A CN 202410013630 A CN202410013630 A CN 202410013630A CN 117785980 A CN117785980 A CN 117785980A
Authority
CN
China
Prior art keywords
data
analysis
client
blockchain
real
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410013630.6A
Other languages
Chinese (zh)
Inventor
赵玺
李雨航
张华东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202410013630.6A
Publication of CN117785980A


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an online management and analysis system and method based on blockchain public chain data. The method comprises the following steps: building public chain clients, keeping them running stably, and monitoring the health state of each client; writing an ETL tool to parse the data; monitoring the real-time state of the data synchronized by each client, constructing a big data cluster, parsing the data in real time, and storing it in the big data cluster; constructing data quality indexes for the stored data and computing them on a schedule to monitor data quality problems; controlling the whole flow and re-executing missed or failed tasks; building a graded abnormality alarm function, with real-time alarms for client operation; constructing a service with separated front end and back end that provides data to users for real-time analysis; constructing a data analysis report template, periodically calculating the related data labels, and generating a data analysis report; and providing a general label calculation tool so that users can call the relevant labels to obtain the corresponding data according to their own needs.

Description

Online management and analysis system and method based on block chain public chain data
Technical Field
The invention belongs to the field of big data analysis and data management, and particularly relates to an online management and analysis system and method based on block chain public chain data.
Background
Blockchain is an emerging technology characterized by decentralization, tamper resistance, and traceability, and it has attracted many researchers since its advent. As an emerging data storage technology, blockchain has its own specialized terminology, storage modes, and forms of expression. Its data is highly specialized and unfamiliar to most people; data objects have similar representations that are hard to tell apart; and transaction values may be extremely large or extremely small and are recorded with high precision. Blockchain data is therefore harder to understand than traditional data. Moreover, blockchain technology has not yet been widely adopted in the market, industry barriers are high, and public awareness is low.
Currently, most blockchain management and analysis tools provide only part of the blockchain data and fixed-format analysis capabilities; they cannot provide large-scale online data analysis or good data quality.
Big data management and online analysis of blockchain public chain data are of great significance, mainly in the following respects. A blockchain public chain contains a large amount of transaction and contract data, and analyzing this data can support decision making; in the financial field, for example, potential market trends can be identified from transaction data in order to formulate a more accurate investment strategy. Online analysis of blockchain transaction data helps to monitor market risk in real time: by monitoring transaction data, abnormal transaction behavior can be found promptly, potential risks can be flagged early, and corresponding risk management measures can be taken. Smart contracts on the blockchain are executable contracts in the form of code, and big data management and online analysis give a better view of how they execute, which helps to find potential problems promptly, improve the efficiency of contract execution, and optimize contracts. User behavior data on the blockchain is rich in information; analyzing users' transaction behavior, how often they interact with smart contracts, and similar signals gives a better understanding of user needs and behavior patterns, providing a basis for optimizing products and services. A blockchain network may face performance bottlenecks when processing a large number of transactions; analyzing network data can locate these bottlenecks so that network performance is optimized and the throughput and stability of the system are improved. Most industries, including the blockchain field, are subject to laws and regulations. Big data management of blockchain public chain data helps to ensure the compliance of the platform, and real-time monitoring and analysis allow the service model to be adjusted promptly to meet the relevant regulations.
In general, big data management and online analysis of the blockchain public chain data are beneficial to fully playing the advantages of the blockchain technology, improving the utilization value of the data and promoting the wider application of the blockchain in various fields.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an online management and analysis system and method based on the public chain data of the block chain, which realize the data management and online analysis of mass data of the public chain of the block chain.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows: an online management and analysis system based on blockchain public chain data comprises a big data cluster layer, a data link layer, a data warehouse layer (data management layer), a platform back end and a platform front end;
the big data cluster layer comprises big data components, which are used for storing the massive blockchain public chain data obtained and for providing online analysis resources externally;
the data link layer comprises blockchain clients, an ETL tool and a data crawler, wherein the dependent environment is configured on each node and the corresponding client is installed and configured to run as a full node or an archive node, the clients being used for synchronizing the public chain data of each blockchain in real time; the ETL tool is used for extracting, converting and warehousing the blockchain data; the data crawler is used for acquiring blockchain auxiliary information;
the data warehouse layer comprises an offline data warehouse and a real-time data warehouse; the data warehouse is built in layers based on data monitoring and management requirements, and the data is divided into an original data layer, a summarized data layer, a stable data layer and a data service layer according to the data quality indexes; the real-time data warehouse is used for providing real-time data analysis externally;
the platform back end integrates a data warehouse, data labels, a data analysis interface, a data analysis model and a data interaction function, and is used for data analysis and interaction; the data labels allow users to reuse existing labels or to write their own labels according to their needs; the data analysis interface allows a user to write SQL directly for online data analysis; the data analysis model is a built-in model used for analyzing data in real time; the data interaction function allows a user to perform iterative analysis on queried data;
the platform front end is used for automatic generation of data analysis reports, data analysis visualization, abnormal behavior analysis and display, user registration and login, and platform monitoring.
Further, the big data components are Hadoop, Hive, Spark, HBase, Kafka and/or ZooKeeper; Hadoop is used for storing basic data and scheduling resources; Hive is used to meet users' offline analysis needs; Spark is combined with Kafka for real-time data warehousing on the one hand, and is provided to users for real-time analysis on the other; HBase is used to serve certain special data to users with a fast response.
Further, the original data layer is where acquired data is stored directly, with no operation performed on it; the summarized data layer performs basic processing on the data, including abnormal value detection and null value detection; the stable data layer is where data is stored once the data quality indexes have been constructed and computed and the data has been found error-free; the data service layer aggregates and computes statistics on the data and, after enriching its dimensions, provides it externally. The real-time data warehouse retains only the data within the most recent set period, and error-free data from before that period is provided by the offline data warehouse.
Further, the automatic generation of the data analysis report specifically means automatically generating a blockchain industry development trend report from the data analysis template by periodically calculating data; data analysis visualization allows a user to visualize data analyzed online; abnormal behavior analysis specifically means analyzing and displaying abnormal blockchain transaction data in real time through a built-in abnormal transaction model; user registration and login provides system registration, login and security management for users; the content monitored by the platform comprises the server resource state, the client running state, the client data synchronization state and the data parsing and warehousing status, and alarm recipients are set as required.
On the basis of providing an on-line management and analysis system based on the block chain public chain data, the invention also provides an on-line management and analysis method based on the block chain public chain data, which comprises the following steps:
S1: building public chain clients, keeping them running stably, and monitoring the health state of each client according to the client's operation log and the PID of its running process;
S2: writing an ETL tool tailored to the characteristics of the data synchronized by each client, and parsing the data;
S3: monitoring the real-time state of the data synchronized by each client, calling the ETL tool to parse the data in real time, and storing the data in the data warehouse layer;
S4: constructing data quality indexes for the data stored in the data warehouse layer and computing them on a schedule to monitor data quality problems; data quality is evaluated and monitored by these regular runs, and abnormal data is acquired again;
S5: controlling the whole flow using DolphinScheduler, and configuring missed or failed tasks to be re-executed;
S6: building a graded abnormality alarm function for abnormal states, with real-time alarms for client operation and a daily summarized alarm for data quality and task execution flow problems;
S7: constructing a service in which the platform back end and the platform front end are separated, and providing data to users for real-time analysis;
S8: constructing a data analysis report template, periodically calculating the related data labels, and generating a data analysis report;
S9: providing a general label calculation tool so that users can call the relevant labels to obtain the corresponding data according to their own needs.
Further, in S1, for each mainstream blockchain public chain, a suitable client is selected, the corresponding environment is installed on a machine that meets the configuration requirements, and the client is run; the client operation log and the PID of the running process are monitored and analyzed, and the use of server resources is monitored at the same time; in S2, when the ETL tool is written, the data is modeled according to the characteristics of each client's data API and the business requirements, and is then extracted and converted.
Further, in S5, for the content of S1-S4, the data management flow is controlled using DolphinScheduler, including:
monitoring the health state of each client, restarting the client when it stops running, and recording a failure log;
monitoring the data synchronization state of each client in real time, calling the ETL tool in real time to acquire data, and storing the data in the data warehouse layer;
running the ETL tool on a schedule to acquire offline data day by day, ensuring the stability and quality of the data;
running the data quality indexes periodically, analyzing the data quality, and processing abnormal data;
for each of the above flows, if a run fails, writing a failure log and restarting the run.
Further, in step S6, for the content of steps S1 to S4, abnormalities in the running of each step are monitored by analyzing the scheduling process log of step S5 and are alarmed by grade; for example, abnormalities in client operation trigger a real-time alarm, while data quality and task execution flow problems are summarized daily and then alarmed.
Further, in step S7, a system with separated front end and back end is constructed; the back-end system integrates Hive, Spark, MySQL and other components, provides a real-time online data analysis service, and allows users to obtain the latest information of the blockchain public chain through an integrated blockchain client API; the front end provides functions such as registration, login and use, and specifically allows a registered user, after logging in to the system, to write SQL to analyze the blockchain public chain data online.
Further, in S8, drawing on blockchain business knowledge, the blockchain is divided into Contract, Token, NFT, DApp, EOA, security and transaction network categories, macro analysis indexes are designed for each, a data analysis report template is constructed, and a data analysis report is generated automatically by periodically calculating the related data labels;
through the integrated Spark, users are allowed to define their own labels for calculation, and general labels are also provided, so that users can call the relevant labels to obtain and analyze the corresponding data as needed.
Compared with the prior art, the invention has at least the following beneficial effects. The system adopts advanced big data processing technology and has good scalability: it can handle large-scale data and can easily be expanded when needed to accommodate ever-increasing amounts of data. Compared with a traditional batch processing system, the system supports real-time data processing and analysis, so users can obtain the latest analysis results, monitor business conditions in real time, and make more timely decisions. The system can integrate multiple data sources, including but not limited to databases, file systems and APIs, so that multi-source data can be managed and analyzed uniformly and a more comprehensive view is provided. The system provides an intuitive, user-friendly visual interface that non-technical staff can use easily, and analysis results are displayed as charts, dashboards and similar forms, improving the user experience. Data quality indexes and a monitoring mechanism are introduced so that data quality problems can be found and handled promptly, ensuring the accuracy and reliability of analysis results. Users can customize data analysis tasks according to their own requirements and select the indexes and dimensions of interest, and the system provides rich analysis tools and function libraries to meet the needs of different business scenarios. The system adopts security technologies including data encryption, identity authentication and access control to protect data in storage and in transit, and complies strictly with privacy protection regulations to protect user privacy. Through efficient data processing and analysis, the system reduces the cost of data management and maintenance: users do not need to invest large amounts of resources and manpower in maintaining the system, which improves cost efficiency. The system has an open architecture that supports integration with other systems, so it can be connected seamlessly with various data storage, processing and application systems, expanding its range of application.
In general, the big data management system and the online analysis method provide a more efficient, intelligent and safe data management and analysis solution for users by introducing advanced technologies and functions, and are beneficial to reducing the difficulty of analyzing the blockchain data for the users.
Drawings
FIG. 1 is a block diagram of an on-line management and analysis system and method based on blockchain public chain data in accordance with the present invention;
FIG. 2 is a data analysis report.
Detailed Description
The invention provides a data management method and an online analysis system based on blockchain public chain data, wherein the method comprises the following steps: building clients for public chains such as Ethereum, Binance, Polygon, Gnosis and Bitcoin, keeping them running stably, and monitoring the health state of each client according to its operation log and the PID of its running process; writing an ETL tool tailored to the characteristics of the data synchronized by each client, and parsing that data; monitoring the real-time state of the data synchronized by each client, calling the ETL tool to parse the data in real time, and warehousing it; constructing data quality indexes for the warehoused data and computing them on a schedule to monitor data quality problems; controlling the whole flow using DolphinScheduler, so that missed or failed tasks can be re-executed; building a graded abnormality alarm function for abnormalities in the process, with real-time alarms for client operation and a daily summarized alarm for data quality and task execution flow problems; constructing a service with separated front end and back end that provides data to users for functions such as real-time analysis; constructing a data analysis report template, periodically calculating the related data labels, and generating a data analysis report from the template and the labels; and providing a general label calculation tool so that users can call the relevant labels to obtain the corresponding data according to their own needs.
Referring to fig. 1, the on-line management and analysis method based on blockchain public chain data provided by the invention comprises the following steps:
S1: Building public chain clients and monitoring their health state
When building clients for public chains such as Ethereum, Binance, Polygon, Gnosis and Bitcoin, client software suited to each chain, such as Geth, Parity or Bitcoin Core, is selected with factors such as server performance and environment in mind. The dependent environment is configured on each node, the corresponding client software is installed and configured, and the client is run as a full node or an archive node so that it synchronizes the public chain data. The health state of each client is monitored by watching the client's operation log and obtaining the PID of its running process. Health monitoring may cover whether a node is online, its synchronization state, its block height, and similar information.
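The PID and log checks in S1 can be illustrated with a minimal sketch. It assumes a POSIX host, and the log pattern `number=<height>` is a made-up placeholder, since each client has its own log format; the names `ClientHealth`, `pid_is_running` and `check_client` are illustrative and do not come from the invention.

```python
import os
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ClientHealth:
    online: bool                 # is the client process alive?
    block_height: Optional[int]  # latest height reported in the log, if any


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is sent
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


# Hypothetical pattern: real client logs differ and need their own parser.
HEIGHT_PATTERN = re.compile(r"number=(\d+)")


def latest_height(log_lines: Iterable[str]) -> Optional[int]:
    """Scan the log from newest to oldest for the last reported block height."""
    for line in reversed(list(log_lines)):
        match = HEIGHT_PATTERN.search(line)
        if match:
            return int(match.group(1))
    return None


def check_client(pid: int, log_lines: Iterable[str]) -> ClientHealth:
    """Combine the process check and the log check into one health record."""
    return ClientHealth(online=pid_is_running(pid),
                        block_height=latest_height(log_lines))
```

A scheduler could call `check_client` periodically for each client and restart or alarm when `online` is false or the block height stops advancing.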
S2: writing ETL tools for data parsing
An ETL (Extract, Transform, Load) tool is written for the characteristics of the data synchronized by each client and is used for parsing the blockchain data. The tool must be able to handle the data formats of the different public chains, extract the useful information, perform the necessary conversions, and finally load the data into the target database, so that the consistency and usability of the data are ensured. The ETL tool framework consists of an entity layer, a parsing layer, a connection layer and a storage layer, which makes it easy to adapt the data processing quickly to technical upgrades and business changes;
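As a rough illustration of the layered ETL framework, the sketch below parses one block in the shape returned by the Ethereum JSON-RPC method `eth_getBlockByNumber`, where quantities are hex-encoded strings. The class and function names (`BlockRecord`, `parse_eth_block`, `InMemoryStore`, `run_etl`) are assumptions made for this example; the connection layer (fetching blocks from a client) is omitted, and an in-memory list stands in for the target database.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List


@dataclass
class BlockRecord:
    """Entity layer: one normalized block row, shared by all chains."""
    chain: str
    number: int
    block_hash: str
    parent_hash: str
    timestamp: int
    tx_count: int


def parse_eth_block(chain: str, raw: Dict[str, Any]) -> BlockRecord:
    """Parsing layer: convert hex-encoded JSON-RPC quantities to integers."""
    return BlockRecord(
        chain=chain,
        number=int(raw["number"], 16),
        block_hash=raw["hash"],
        parent_hash=raw["parentHash"],
        timestamp=int(raw["timestamp"], 16),
        tx_count=len(raw.get("transactions", [])),
    )


class InMemoryStore:
    """Storage layer stand-in; a real tool would write to the warehouse."""

    def __init__(self) -> None:
        self.rows: List[BlockRecord] = []

    def load(self, record: BlockRecord) -> None:
        self.rows.append(record)


def run_etl(chain: str, raw_blocks: Iterable[Dict[str, Any]],
            store: InMemoryStore) -> int:
    """Extract, transform and load each raw block; return the row count."""
    for raw in raw_blocks:
        store.load(parse_eth_block(chain, raw))
    return len(store.rows)
```

Keeping the entity separate from the chain-specific parser means that supporting another chain would only require a new parsing function that produces the same `BlockRecord`.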
S3: Real-time monitoring of the synchronization state and real-time parsing:
A monitoring system watches the real-time state of the data synchronized by each client; when new synchronized data is detected, the ETL tool is triggered immediately to parse it, which keeps the data timely and prevents data lag and abnormal conditions.
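One way to picture the trigger in S3 is a checkpoint comparison: the monitor remembers the last block height it processed and runs the ETL for every height the client has synchronized since then. This is a sketch under that assumption; `get_client_height` and `run_etl_for_block` are hypothetical callables supplied by the caller, not functions of any particular client.

```python
from typing import Callable


def sync_new_blocks(last_processed: int,
                    get_client_height: Callable[[], int],
                    run_etl_for_block: Callable[[int], None]) -> int:
    """Run the ETL for every block newer than the checkpoint.

    Returns the new checkpoint. If the ETL raises for some block, the
    exception propagates and the caller keeps its previous checkpoint.
    """
    head = get_client_height()
    for height in range(last_processed + 1, head + 1):
        run_etl_for_block(height)
        last_processed = height
    return last_processed
```

Calling this function in a short polling loop, or from a scheduler, gives near-real-time warehousing; the difference between the client height and the checkpoint is also a direct measure of data lag.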
S4: Constructing data quality indexes and computing them on a schedule:
Data quality indexes covering data integrity, accuracy and consistency are established for the warehoused data; by computing these indexes at regular intervals, quality problems in the data can be monitored. For example, checks can determine whether data is missing and whether the data format is correct.
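The following sketch shows what two such indexes might look like for block data, assuming each stored block carries `number`, `hash` and `parent_hash` fields (these field names are assumptions for the example). Integrity is measured as gaps in the block height sequence, and consistency as whether each block's parent hash matches the hash of the preceding block.

```python
from typing import Dict, Iterable, List


def missing_heights(heights: Iterable[int]) -> List[int]:
    """Integrity: heights absent between the lowest and highest stored block."""
    present = set(heights)
    if not present:
        return []
    return [h for h in range(min(present), max(present) + 1)
            if h not in present]


def completeness_ratio(heights: Iterable[int]) -> float:
    """Share of the expected height range that is actually stored."""
    present = set(heights)
    if not present:
        return 0.0
    expected = max(present) - min(present) + 1
    return len(present) / expected


def broken_links(blocks: Iterable[Dict]) -> List[int]:
    """Consistency: heights whose parent_hash differs from the previous hash.

    Blocks whose predecessor is missing are skipped here, because that
    case is already reported by missing_heights.
    """
    by_height = {b["number"]: b for b in blocks}
    bad = []
    for number, block in sorted(by_height.items()):
        parent = by_height.get(number - 1)
        if parent is not None and block["parent_hash"] != parent["hash"]:
            bad.append(number)
    return bad
```

Heights returned by `missing_heights` or `broken_links` are natural candidates for the re-acquisition of abnormal data described in the method.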
S5: Flow control using DolphinScheduler:
DolphinScheduler is an open-source distributed workflow scheduling system that can be used to control the overall flow. Functions such as scheduled task execution and retry on failure can be implemented through DolphinScheduler, ensuring the stability of the whole data synchronization and parsing flow.
For the content of S1-S4, the data management flow is controlled using DolphinScheduler, including:
monitoring the health state of each client, restarting the client when it stops running, and recording a failure log;
monitoring the data synchronization state of each client in real time, calling the ETL tool in real time to acquire data, and storing the data in the data warehouse layer;
running the ETL tool on a schedule to acquire offline data day by day, ensuring the stability and quality of the data;
running the data quality indexes periodically, analyzing the data quality, and processing abnormal data;
for each of the above flows, if a run fails, writing a failure log and restarting the run.
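The "write a failure log and restart" behaviour can be sketched in plain Python as follows. This is not the DolphinScheduler API; it only illustrates the retry semantics that a workflow scheduler would apply to each task. The function name `run_with_retry` and its parameters are assumptions for the example.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("pipeline")


def run_with_retry(task: Callable[[], T], name: str,
                   max_attempts: int = 3,
                   delay_seconds: float = 0.0) -> T:
    """Run a task, logging every failure and retrying up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            # Failure log: record which task failed, on which attempt, and why.
            logger.error("task %s failed on attempt %d/%d: %s",
                         name, attempt, max_attempts, exc)
            if attempt == max_attempts:
                raise  # retries exhausted; surface the error for alarming
            time.sleep(delay_seconds)
    raise ValueError("max_attempts must be at least 1")
```

Re-raising after the final attempt leaves a clear signal for the alarm stage, which can read the same failure log.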
S6: construction of an abnormality alarming function:
An abnormality alarm function is built for abnormal client operation and data quality problems, and alarms are graded: immediate alarms can be set for the real-time operation of clients, while data quality problems can be summarized daily and then alarmed. Alarm channels may include email notification, SMS reminders and the like.
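The grading described above can be sketched as a small router: client alarms are sent immediately, while data quality and task flow alarms are held and sent as one daily summary. The class names and the `notify` callback are assumptions for this example; `notify` stands in for an email or SMS sender.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Alarm:
    source: str   # e.g. "client", "data_quality" or "task_flow"
    message: str


class AlarmRouter:
    """Send client alarms at once; batch everything else into a summary."""

    REALTIME_SOURCES = {"client"}

    def __init__(self, notify: Callable[[str], None]) -> None:
        self.notify = notify
        self.pending: List[Alarm] = []

    def raise_alarm(self, alarm: Alarm) -> None:
        if alarm.source in self.REALTIME_SOURCES:
            self.notify(f"[URGENT] {alarm.source}: {alarm.message}")
        else:
            self.pending.append(alarm)

    def flush_daily_summary(self) -> None:
        """Send one message that counts the pending alarms per source."""
        if not self.pending:
            return
        grouped = defaultdict(list)
        for alarm in self.pending:
            grouped[alarm.source].append(alarm.message)
        parts = [f"{source}: {len(messages)} issue(s)"
                 for source, messages in sorted(grouped.items())]
        self.notify("[DAILY SUMMARY] " + "; ".join(parts))
        self.pending.clear()
```

A scheduler would call `flush_daily_summary` once a day; the set of real-time sources is the single place where the grading policy is defined.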
S7: Constructing a service with separated front end and back end:
Data is provided to users for real-time analysis through a service whose front end and back end are separated; the front end provides a visual dashboard that displays the real-time state and data quality of each public chain, and also provides an SQL interface so that users can analyze the data online; the back end is responsible for providing the data interfaces and handling user requests.
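To make the SQL interface concrete, the sketch below uses Python's built-in `sqlite3` module as a stand-in for the Hive or Spark SQL engines that the system integrates. The table layout, the `run_user_query` helper and the simple SELECT-only guard are all assumptions for this example; a production back end would need real permission and resource controls.

```python
import sqlite3
from typing import List, Tuple


def run_user_query(conn: sqlite3.Connection, sql: str,
                   limit: int = 1000) -> List[Tuple]:
    """Run a user-supplied SELECT and return at most `limit` rows."""
    # Naive guard for the sketch: reject anything that is not a SELECT.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed")
    cursor = conn.execute(sql)
    return cursor.fetchmany(limit)


# Demo data standing in for the data service layer of the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (chain TEXT, block_number INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [("ethereum", 100, 1.5), ("ethereum", 101, 0.2), ("bitcoin", 800000, 0.01)],
)
conn.commit()
```

For example, `run_user_query(conn, "SELECT chain, COUNT(*) FROM transactions GROUP BY chain ORDER BY chain")` returns one row per chain with its transaction count.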
S8: constructing a data analysis report template:
designing and constructing a data analysis report template, periodically calculating related data labels, and generating a data analysis report, wherein the data analysis report can comprise performance indexes, user activity conditions and market trends of each public chain, so as to provide references for decision making.
S9: Providing a general label calculation tool:
the general label calculation tool is provided, so that a user can call related labels to acquire corresponding data according to own requirements, the flexibility of the system is improved, and the user can customize the required data labels according to specific conditions.
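A label calculation tool of this kind can be pictured as a registry that maps label names to calculation functions, so that built-in labels and user-defined labels are called in the same way. The sketch uses plain Python rather than Spark, and the names `LabelRegistry`, `active_addresses` and `total_value`, as well as the `from` and `value` row fields, are assumptions for the example.

```python
from typing import Callable, Dict, Iterable, List


class LabelRegistry:
    """Map label names to functions that compute them from rows of data."""

    def __init__(self) -> None:
        self._labels: Dict[str, Callable[[List[dict]], object]] = {}

    def register(self, name: str):
        """Decorator that registers a calculation function under a name."""
        def decorator(func):
            self._labels[name] = func
            return func
        return decorator

    def compute(self, name: str, rows: Iterable[dict]):
        if name not in self._labels:
            raise KeyError(f"unknown label: {name}")
        return self._labels[name](list(rows))

    def names(self) -> List[str]:
        return sorted(self._labels)


registry = LabelRegistry()


@registry.register("active_addresses")
def active_addresses(rows):
    """General label: number of distinct sending addresses."""
    return len({row["from"] for row in rows})


@registry.register("total_value")
def total_value(rows):
    """General label: sum of the transferred values."""
    return sum(row["value"] for row in rows)
```

A user-defined label is added by registering one more function, after which it can be computed by name exactly like the general labels.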
In general, the method covers every stage from building public chain clients through data synchronization, parsing, quality monitoring, alarming and data analysis, ensuring the healthy operation of the whole system and the quality of its data.
The system of the invention has five layers: a big data cluster layer, a data link layer, a data warehouse layer (data management layer), a platform back end and a platform front end.
The big data cluster layer includes Hadoop, Hive, Spark, HBase, Kafka, ZooKeeper and the like, which are used for storing the massive blockchain public chain data obtained and for providing online analysis resources externally. Specifically, Hadoop provides basic data storage and resource scheduling capability; Hive is used to meet users' offline analysis needs; Spark is combined with Kafka for real-time data warehousing on the one hand, and is provided to users for real-time analysis on the other; HBase is used to serve certain special data to users with a fast response.
The data link layer is one of the cores of the present invention. It mainly comprises the blockchain clients, the ETL tool and the data crawler. For the blockchain clients, when building clients for public chains such as Ethereum, Binance, Polygon, Gnosis and Bitcoin, client software suited to each chain, such as Geth, Parity or Bitcoin Core, is selected with factors such as server performance and environment in mind. The dependent environment is configured on each node, the client is installed and configured, and the client is run as a full node or an archive node, so that the public chain data of each blockchain is synchronized in real time; meanwhile, to ensure the stability of data synchronization, the health state of each client is monitored by obtaining the PID (process ID) of each running client. The ETL tool is designed according to the characteristics of each client and the business knowledge of each blockchain, and is used for extracting, converting and warehousing the blockchain data. The data crawler is a web crawler written to collect certain blockchain auxiliary information.
Unlike a traditional data warehouse, the data warehouse of the invention is built in layers based on data monitoring and management requirements. According to the configured data quality indexes, the data is divided into an original data layer, a summarized data layer, a stable data layer and a data service layer. The original data layer is where data obtained by the various means is stored directly; no operation is performed on the data in this layer, so that errors found in upper layers can be traced back to the source. The summarized data layer mainly performs basic processing on the data, including abnormal value detection and null value detection. The stable data layer is where data is stored after the data quality indexes have been constructed and computed on a schedule and the data has been found error-free. The data service layer then aggregates and computes statistics on the data, enriches its dimensions, and provides it externally. The real-time data warehouse is used for providing real-time data analysis externally. Because of the way blockchain clients operate, data acquired in real time may contain abnormalities, so the real-time data warehouse keeps only the data of the last 2 days, and error-free data from more than 2 days ago is provided by the offline data warehouse.
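The 2-day split between the real-time and offline warehouses implies a routing rule for queries. The sketch below shows one possible form of that rule; the function names and the half-open interval convention `[start, end)` are assumptions for the example.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Retention window of the real-time warehouse, as stated in the description.
REALTIME_RETENTION = timedelta(days=2)


def choose_warehouse(record_time: datetime, now: datetime) -> str:
    """Pick the warehouse that should serve data with this timestamp."""
    if now - record_time <= REALTIME_RETENTION:
        return "realtime"
    return "offline"


def split_query_range(start: datetime, end: datetime,
                      now: datetime) -> List[Tuple[str, datetime, datetime]]:
    """Split [start, end) into an offline part and a real-time part."""
    cutoff = now - REALTIME_RETENTION
    parts = []
    if start < cutoff:
        parts.append(("offline", start, min(end, cutoff)))
    if end > cutoff:
        parts.append(("realtime", max(start, cutoff), end))
    return parts
```

A query that spans the cutoff is served partly from verified offline data and partly from recent real-time data, so older results are never taken from data that has not yet passed the quality checks.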
The platform back end mainly integrates the data warehouse, data labels, the data analysis interface, the data analysis model and the data interaction capability. The data warehouse is where the actual data provided to users for analysis resides; the data labels allow users to reuse existing labels according to their needs or to write their own labels for data calculation; the data analysis interface allows a user to write SQL directly for online data analysis; the data analysis model is a built-in model used for analyzing data in real time; the data interaction function allows a user to analyze queried data iteratively.
The platform front end comprises the functions of automatic data analysis report generation, data analysis visualization, abnormal behavior analysis and display, user registration and login, and platform monitoring. Automatic generation of the data analysis report specifically means automatically generating a blockchain industry development trend report from the data analysis template by periodically calculating data; data analysis visualization allows a user to visualize the data analyzed online; abnormal behavior analysis specifically means analyzing and displaying abnormal blockchain transaction data in real time through a built-in abnormal transaction model; user registration and login provides system registration, login and security management for users; platform monitoring provides an external monitoring interface for data management, whose content includes the server resource state, the client running state, the client data synchronization state, and the data parsing and warehousing status. Alarm recipients can be configured as needed, so that the responsible persons are notified promptly when an abnormality occurs.
Referring to Fig. 2, the data analysis report works as follows:
after the data indexes have been calculated on schedule, the user selects, in the Section selection box, the blockchain level whose data are to be viewed and may also select the corresponding period number; the corresponding automatically generated data analysis report is then displayed.
While the invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. An online management and analysis system based on blockchain public chain data, characterized by comprising a big data cluster layer, a data link layer, a data warehouse layer, a platform back end and a platform front end;
the big data cluster layer comprises big data components, which are used for storing the acquired blockchain public chain data and providing online analysis resources externally;
the data link layer comprises blockchain clients, an ETL tool and a data crawler, wherein a dependency environment is configured on each node and the corresponding client is installed and configured so that the node becomes a full node or an archive node, the clients being used for synchronizing the public chain data of each blockchain in real time; the ETL tool is used for extracting, transforming and warehousing the blockchain data; the data crawler is used for acquiring auxiliary blockchain information;
the data warehouse layer comprises an offline data warehouse and a real-time data warehouse, the data warehouse is built in layers based on data monitoring and management requirements, and the data are divided into an original data layer, a summarized data layer, a stable data layer and a data service layer according to the data quality indexes; the real-time data warehouse is used for providing real-time data analysis externally;
the platform back end integrates the data warehouse, data tags, a data analysis interface and data analysis models, and is used for data analysis and interaction; the data tags allow users to reuse existing tags or write their own tags as needed; the data analysis interface is used by users to write SQL directly for online data analysis; the data analysis models are built in and are used for analyzing data in real time; the data interaction function allows users to iteratively analyze the queried data;
the platform front end is used for automatic generation of data analysis reports, data analysis visualization, abnormal behavior analysis and display, user registration and login, and platform monitoring.
2. The online management and analysis system based on blockchain public chain data according to claim 1, wherein the big data components are Hadoop, Hive, Spark, HBase, Kafka and/or Zookeeper; Hadoop is used for storing basic data and scheduling resources; Hive is used to serve users' offline analysis; Spark is combined with Kafka for real-time data warehousing on the one hand and is provided to users for real-time analysis on the other; HBase is used to give users fast responses for certain specific data.
3. The system of claim 1, wherein the original data layer is where the acquired data are stored directly, without any operation being performed on the data; the summarized data layer performs basic processing on the data, including outlier detection and null-value detection; the stable data layer is where data found to be error-free are stored after the data quality indexes have been constructed, calculated and analyzed; the data service layer aggregates the data, computes statistics and enriches the data dimensions before providing the data externally; the real-time data warehouse retains the data within the most recent set time, and data older than the set time are served as error-free data by the offline data warehouse.
4. The online management and analysis system based on blockchain public chain data according to claim 1, wherein the automatic generation of data analysis reports specifically comprises periodically calculating data and automatically generating a blockchain industry development trend report from a data analysis template; the data analysis visualization allows users to visualize the data analyzed online; the abnormal behavior analysis specifically comprises analyzing and displaying abnormal blockchain transaction data in real time through a built-in abnormal transaction model; the user registration and login cover logging in to the system and security management for users; the content monitored by the platform comprises server resource status, client running status, client data synchronization status, and data parsing and warehousing status, and alert recipients are configured as required.
5. An online management and analysis method based on blockchain public chain data, characterized by using the online management and analysis system based on blockchain public chain data as claimed in any one of claims 1 to 4 and comprising the following steps:
S1: building public chain clients and running them stably, and monitoring the health status of each client according to the client's running log and running pid;
S2: writing an ETL tool according to the characteristics of the data synchronized by each client, and parsing the data;
S3: monitoring the real-time data synchronization status of each client, calling the ETL tool to parse the data in real time, and storing the data in the data warehouse layer;
S4: constructing data quality indexes for the data stored in the data warehouse layer and calculating them on a schedule to monitor data quality problems; data quality is evaluated and monitored through scheduled runs, and abnormal data are re-acquired;
S5: controlling the whole flow with DolphinScheduler, and configuring missed or failed tasks to be re-executed;
S6: building an abnormality alarm function for abnormal states, with alarms graded by category: real-time alarms for client operation, and alarms issued after a daily summary for data quality and task execution flow problems;
S7: constructing a service with a separated platform back end and platform front end, and providing data to users for real-time analysis;
S8: constructing a data analysis report template, periodically calculating the related data tags, and generating a data analysis report;
S9: providing a general tag calculation tool so that users can call the related tags to acquire the corresponding data according to their own needs.
6. The online management and analysis method based on blockchain public chain data according to claim 5, wherein in S1, for mainstream blockchain public chains, a suitable client is selected, the corresponding environment is installed on a machine that meets the configuration requirements, the client is run, the client's running log and running pid are monitored and analyzed, and the use of server resources is monitored at the same time; in S2, when the corresponding ETL tool is written, the data are modeled according to the characteristics of each client's data API and the business requirements, and the data are extracted and transformed.
7. The method of claim 5, wherein in S5, for the content of S1-S4, using DolphinScheduler to control the data management flow comprises:
monitoring the health state of the client, restarting the client when the operation of the client is stopped, and recording a failure log;
monitoring the data synchronization state of the client in real time, calling ETL in real time to acquire data and storing the data into a data warehouse layer;
the ETL tool is operated regularly to acquire offline data according to the day, so that the stability and the data quality of the data are ensured;
periodically running a data quality index, analyzing the data quality, and processing abnormal data;
for the above flow, if the operation fails, the failure log is written and the operation is restarted.
8. The online management and analysis method based on blockchain public chain data according to claim 1, wherein in step S6, for the content of steps S1-S4, the scheduling process log of step S5 is analyzed so that abnormalities in the running of each step are monitored and alarms are graded: a real-time alarm is issued for client operation, and an alarm is issued after a daily summary of information for data quality and task execution flow problems.
9. The online management and analysis method based on blockchain public chain data according to claim 1, wherein in step S7, a system with separated front end and back end is constructed; the back-end system integrates Hive, Spark, MySQL and other components to provide a real-time online data analysis service and, through an integrated blockchain client API, allows users to obtain the latest blockchain information; the front end provides users with registration, login and usage functions, specifically allowing a registered user, after logging in to the system, to write SQL for online analysis of the blockchain public chain data.
10. The online management and analysis method based on blockchain public chain data according to claim 1, wherein in S8, drawing on blockchain business knowledge, the blockchain is divided into Contract, Token, NFT, DAPP, EOA, security and transaction network sections, macro analysis indexes are designed for each section, a data analysis report template is constructed, and a data analysis report is generated automatically by periodically calculating the related data tags;
through the integrated Spark, users are allowed to define custom tags for calculation, and general tags are provided so that users can call the related tags to acquire and analyze the data as needed.
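The re-execution of failed tasks in S5 and the graded alarms in S6 can be pictured with the following minimal sketch. It does not use DolphinScheduler's API; `run_with_retry`, `AlertRouter`, the category names and the message formats are hypothetical names chosen only to illustrate the behavior described above.

```python
def run_with_retry(task, max_attempts, failure_log):
    """Run a task; on failure write a failure-log entry and run it again (sketch)."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except Exception as exc:
            failure_log.append(f"attempt {attempt} failed: {exc}")
    return None  # every attempt failed


class AlertRouter:
    """Graded alarms: client problems go out at once, the rest in a daily summary."""

    REALTIME_CATEGORIES = {"client"}  # assumed category name

    def __init__(self, send):
        self.send = send  # callback that delivers one message
        self.pending = []  # messages held back for the daily summary

    def report(self, category, message):
        if category in self.REALTIME_CATEGORIES:
            self.send(f"[realtime] {message}")
        else:
            self.pending.append(f"{category}: {message}")

    def flush_daily(self):
        """Send one summary of everything collected since the last flush."""
        if self.pending:
            self.send("[daily] " + "; ".join(self.pending))
            self.pending = []
```

Separating the two alarm grades keeps urgent client outages from being buried in the lower-priority data quality and task flow messages.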

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410013630.6A CN117785980A (en) 2024-01-04 2024-01-04 Online management and analysis system and method based on block chain public chain data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410013630.6A CN117785980A (en) 2024-01-04 2024-01-04 Online management and analysis system and method based on block chain public chain data

Publications (1)

Publication Number Publication Date
CN117785980A true CN117785980A (en) 2024-03-29

Family

ID=90389015

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410013630.6A Pending CN117785980A (en) 2024-01-04 2024-01-04 Online management and analysis system and method based on block chain public chain data

Country Status (1)

Country Link
CN (1) CN117785980A (en)

Similar Documents

Publication Publication Date Title
Begoli et al. Design principles for effective knowledge discovery from big data
CN112395325A (en) Data management method, system, terminal equipment and storage medium
CN112396404A (en) Data center system
CN114925045B (en) PaaS platform for big data integration and management
Fu et al. Real-time data infrastructure at uber
US20060184410A1 (en) System and method for capture of user actions and use of capture data in business processes
CN111866121B (en) Safety monitoring and management cloud platform for large crane equipment
CN110807067A (en) Data synchronization method, device and equipment for relational database and data warehouse
CN104036365A (en) Method for constructing enterprise-level data service platform
CN113094385B (en) Data sharing fusion platform and method based on software defined open tool set
CN112148578A (en) IT fault defect prediction method based on machine learning
CN110163458A (en) Data assets management and monitoring method based on artificial intelligence technology
CN114880405A (en) Data lake-based data processing method and system
CN115374102A (en) Data processing method and system
CN111913933B (en) Power grid historical data management method and system based on unified support platform
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN112579703A (en) Big data comprehensive operation and maintenance platform processing system based on cloud computing
CN113535846B (en) Big data platform and construction method thereof
CN115640300A (en) Big data management method, system, electronic equipment and storage medium
CN115496337A (en) Data system for supporting brain of enterprise
CN111125450A (en) Management method of multilayer topology network resource object
CN112825165A (en) Project quality management method and device
CN116821106A (en) Enterprise data digital management system and method
CN115689788A (en) Financial data analysis method
CN117785980A (en) Online management and analysis system and method based on block chain public chain data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination