CN107704594B - Real-time processing method for log data of power system based on spark streaming - Google Patents

Info

Publication number
CN107704594B
Authority
CN
China
Prior art keywords
time
batch
interval
block interval
processing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710951969.0A
Other languages
Chinese (zh)
Other versions
CN107704594A (en)
Inventor
宋爱波
涂金林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN201710951969.0A
Publication of CN107704594A
Application granted
Publication of CN107704594B

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2433Query languages
    • G06F16/244Grouping and aggregation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24532Query optimisation of parallel queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24568Data stream processing; Continuous queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a real-time processing method for power system log data based on Spark Streaming. First, to address the sharp growth of the whole-network log data stream and the variety of log types and related attributes collected by the processing system, statistical models are predefined, which reduces the preprocessing time of the processing system. Then, by analyzing the relationship between the block interval and the processing time, it is found that dynamically adjusting the block interval can optimize the processing time of query tasks. Finally, an efficient dynamic adjustment strategy is designed on this basis to find the optimal block interval promptly and reduce the processing time of query tasks, so that the running state and trajectory of the power dispatching automation system can be analyzed and the analysis of the power system's health can be converted from qualitative to quantitative. The invention provides an efficient and easy-to-use real-time processing method for the effective management of power system log data.

Description

Real-time processing method for log data of power system based on spark streaming
Technical Field
The invention relates to a real-time processing method of log data of a power system, in particular to a real-time processing method of log data of a power system based on Spark Streaming.
Background
Electric power is a fundamental industry for the operation and development of modern society, and the safety and stability of the power system affect every aspect of social life. The power dispatching automation system is a data processing system that comprises power system operation information, analysis and decision tools, and control means. During operation, it generates status, debugging, and error data, collectively called log data. Log data is a representation of the operating information of the power system, and analyzing it quickly and accurately is an important safeguard for the safe and stable operation of the power system.
As the scale of the dispatching automation system keeps expanding, the volume of log data that the power system must process in real time has increased sharply. The whole-network real-time log data is large in volume and grows rapidly, and the demands it places on computation, analysis, simulation, and optimization far exceed the capacity of an ordinary computing system, so traditional log management can no longer meet the management and analysis requirements of massive log data. Earlier stream processing systems coped with this either by dropping part of the incoming data stream (e.g., hierarchical offload) or by flexibly adding resources. However, discarding data is generally not a good choice: the discarded data may well be important, which affects the correctness of the result. Moreover, for a high-throughput real-time data stream, acquiring the necessary resources in advance is very costly.
To determine the operating trends and patterns of the system and to detect faults, the running state and trajectory of the power dispatching automation system must be analyzed online in real time. Because of limited disk performance, log data cannot be processed in time and data is lost, so the fast processing capability of memory is required. Meanwhile, as system resources and states change continuously, the processing system must be able to adjust promptly so that its processing time stays optimal.
In view of the above problems, researchers have focused on using memory resources to break through the I/O bottleneck, improve data throughput, and increase data processing speed. Apache Spark is an open source computing framework that stands out in this area. Its memory-based iterative computing framework can operate on a given data set in memory multiple times, enabling fast analysis and processing of big data. Spark Streaming, a higher-level tool built on Spark, provides interval-based real-time processing. The time interval at which the data stream is divided into data blocks is called the block interval, and the time interval at which several data blocks are combined into one batch is called the batch interval. This model meets the power dispatching automation system's need to process the data of a given time period in real time.
Generally, the lower the parallelism with which Spark Streaming processes data (the number of data blocks in a batch, equal to batch interval / block interval), the smaller both the resource overhead (such as task creation and interaction) and the resource utilization. Large-scale parallel computing incurs a large resource overhead together with very high resource utilization. To learn the running state and trajectory of the power dispatching automation system in time and to convert the analysis of the power system's health from qualitative to quantitative, query tasks must achieve low resource overhead and high resource utilization. To balance resource overhead against utilization, the processing parallelism must be adjusted promptly as system states and resources change.
In recent years, the need to process real-time data streams has driven the development of distributed real-time computing frameworks. For example, the document "High-through debug Architecture for Log Analysis and Data Stream Mining" uses Apache Storm as the real-time computing framework to receive real-time data and then analyze it. Spark Streaming, as a higher-level tool of Spark, differs from Storm in that it does not process the data stream record by record; instead, it divides the data stream into segments by time interval in advance and processes each segment as a batch job. Storm is an event-level real-time computing framework, whereas the power dispatching automation system mostly performs stateful batch computation and analysis on the data stream within a given time period. Moreover, Storm processes each record at least once, so records are recomputed when a node recovers from a failure, which does not meet the safety and reliability requirements of the power dispatching automation system.
By dynamically adjusting the batch interval or the data block size, existing approaches keep the system running stably without knowing the state of the data stream or the running environment in advance. However, these approaches focus on data read-write throughput and resource utilization. For complex computations, such dynamic adjustment cannot select a good batch interval or data block size, so the processing time grows longer and longer, and the dispatching automation system's requirement for fast processing is ignored entirely.
Disclosure of Invention
Purpose of the invention: in view of the above problems, the invention provides a real-time processing method for power system log data based on Spark Streaming.
Technical solution: to achieve the purpose of the invention, the following technical solution is adopted: a real-time processing method for power system log data based on Spark Streaming, comprising the following steps:
(1) defining statistical models of different log categories;
(2) constructing a relation model of Spark Streaming block intervals and data stream processing time;
(3) dynamically adjusting the block interval and searching for the optimal block interval.
Further, in step (1), the statistical model includes the elements: data sets, result sets, grouping conditions, grouping filters, and rule actions.
Further, in step (2), the time interval at which the data stream is divided into data blocks is the block interval, and the time interval at which several data blocks are combined into a batch is the batch interval.
The relation model is constructed as follows:
(1) the batch module divides the received data stream into independent data blocks according to the block interval;
(2) the data blocks within one batch interval are wrapped into a batch, which enters the batch queue to wait for processing;
(3) the data of all block intervals within one batch interval are processed in parallel.
Further, in step (3), given a batch interval, a greedy algorithm is used to dynamically adjust the block interval and search for the optimal block interval.
The greedy algorithm comprises the following steps:
(1) the initial block interval is denoted β, and the adjustment step size is i;
(2) if the batch processing time with block interval β is less than that with block interval β + i, the optimal block interval lies to the left of the initial block interval; if the batch processing time with block interval β is less than that with block interval β - i, the optimal block interval lies to the right of the initial block interval;
(3) once the direction of the optimal block interval has been found, the search continues in that direction until the processing time can no longer be reduced.
Advantageous effects: the method takes full account of the characteristics of power system log data. As system resources and states change continuously, the processing system does not need to redefine statistical functions or statistical models when the data stream changes, and it can adjust dynamically and promptly, achieving higher resource utilization and shorter processing time.
Drawings
FIG. 1 is a schematic diagram of the block interval;
FIG. 2 is a graph of the effect of the block interval on processing time.
Detailed Description
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
To address the shortcomings of existing real-time computing frameworks in processing log data streams, and taking into account the relationship between the block interval and the data stream processing time, the invention provides a real-time processing method for power system log data based on Spark Streaming. The aim is to let the Spark Streaming block interval be adjusted dynamically as system resources and states change, speed up the processing of the real-time data stream, analyze the running state and trajectory of the power dispatching automation system, and convert the analysis of the power system's health from qualitative to quantitative.
First, to address the rapid growth of the whole-network log data stream and the variety of log types and related attributes collected by the processing system, the invention defines statistical models for the different log types in advance, reducing the preprocessing time of the processing system. Then, by analyzing the relationship between the block interval and the processing time of the processing system, it is found that dynamically adjusting the block interval can effectively reduce the processing time. Finally, based on this analysis, a dynamic adjustment strategy based on a greedy algorithm is designed to find the optimal block interval promptly, speed up the processing of the log data stream, and shorten the processing time of query tasks.
The method for processing the log data of the power system in real time based on Spark Streaming comprises the following steps:
Step 1: defining statistical models for different log categories and performing fast real-time analysis according to them.
Because the types and related attributes of the log data collected by the processing system change continuously, a statistical model is defined in advance for each field involved in processing and analyzing the different log types, which shortens the preprocessing time of the processing system.
The statistical model describes the set of elements required in a real-time analysis process. Following the format of the SELECT statement in Structured Query Language (SQL), a statistical model needs to contain the following elements:
(1) Data set: corresponds to the FROM and WHERE clauses. The data set must specify the subscribed log categories, the statistical time window, and so on. If the log data of a category needs further screening, logical expressions based on layout elements are supported.
(2) Result set: corresponds to the SELECT clause. The result set must specify the result fields that the current analysis will ultimately produce, mainly layout elements and statistics fields. Statistics fields support several statistical functions: COUNT, SUM, MAX, MIN, TOP(N), ASSERT.
(3) Grouping condition: corresponds to the GROUP BY clause. The grouping condition can only contain fields defined in the result set.
(4) Grouping filter: the grouping filter can only contain statistics fields in the result set. Numeric elements support comparison operators such as > and <; character-type elements support operators such as EQUAL, CONTAIN, BEGINWITH, and ENDWITH.
(5) Rule actions: the actions taken when the content of the result set matches a rule, namely warehousing and alarming. Warehousing means storing the calculation result in an external system; alarming means setting a threshold for the result of the statistical operation and sending alarm information when the result exceeds the threshold.
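The five elements above can be pictured as a small data structure that renders a SELECT-style statement. The following is a minimal sketch, not the patent's implementation: the class name `StatisticalModel`, its field names, the helper `should_alarm`, and the example log category `error_log` are all hypothetical, and mapping the grouping filter to a HAVING clause is an assumption (the text only names the FROM, WHERE, SELECT, and GROUP BY clauses). The statistical time window is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StatisticalModel:
    """Hypothetical container for the five elements of a statistical model."""
    log_category: str                      # data set: subscribed log category (FROM)
    result_fields: List[str]               # result set: layout elements and statistics (SELECT)
    where: Optional[str] = None            # data set: optional screening expression (WHERE)
    group_by: List[str] = field(default_factory=list)      # grouping condition (GROUP BY)
    group_filter: Optional[str] = None     # grouping filter (assumed to map to HAVING)
    rule_actions: List[str] = field(default_factory=list)  # e.g. ["warehouse", "alarm"]

    def to_sql(self) -> str:
        # The grouping condition may only contain fields defined in the result set.
        for name in self.group_by:
            if name not in self.result_fields:
                raise ValueError(f"group-by field {name!r} is not in the result set")
        parts = [f"SELECT {', '.join(self.result_fields)}", f"FROM {self.log_category}"]
        if self.where:
            parts.append(f"WHERE {self.where}")
        if self.group_by:
            parts.append(f"GROUP BY {', '.join(self.group_by)}")
        if self.group_filter:
            parts.append(f"HAVING {self.group_filter}")
        return " ".join(parts)


def should_alarm(value: float, threshold: float) -> bool:
    """Alarm rule action: fire when the statistic exceeds the threshold."""
    return value > threshold


model = StatisticalModel(
    log_category="error_log",
    result_fields=["host", "COUNT(*) AS errors"],
    where="level = 'ERROR'",
    group_by=["host"],
    group_filter="COUNT(*) > 100",
    rule_actions=["alarm"],
)
print(model.to_sql())
```

Defining such a model once per log category means the query does not have to be rebuilt when the data stream changes, which is the preprocessing saving the step describes.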
The analysis target and statistical model are shown in table 1:
TABLE 1
[Table 1 is provided as an image in the original publication and is not reproduced here.]
Step 2: constructing a relation model of Spark Streaming block intervals and data stream processing time;
The relationship between the Spark Streaming block interval and the data stream processing time is analyzed to find the block interval that minimizes the data stream processing time.
As shown in FIG. 1, the batch module is the batching module of Spark Streaming; it divides the received data stream into batches and then processes each batch separately. To form a batch, the batch module requires two important parameters: the block interval and the batch interval. The time interval at which the data stream is divided into data blocks is called the block interval, and the time interval at which several data blocks are combined into one batch is called the batch interval.
Therefore, the batch module divides the received data stream into independent data blocks according to the block interval (block interval < batch interval), and then after a batch interval, all the data blocks in the batch interval are wrapped into a batch, and finally the batch enters the batch queue to be queued for processing.
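The batching behavior described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not Spark Streaming's internal code: the function name `make_batches` and the record layout (arrival time in seconds, payload) are hypothetical, and blocks that receive no data are simply omitted.

```python
from typing import Dict, List, Tuple

Record = Tuple[float, str]  # (arrival time in seconds, payload); hypothetical layout


def make_batches(records: List[Record], block_interval: float,
                 batch_interval: float) -> List[List[List[str]]]:
    """Group records into blocks, then wrap the blocks of each batch interval into a batch."""
    if not 0 < block_interval < batch_interval:
        raise ValueError("block interval must be positive and smaller than the batch interval")
    batches: Dict[int, Dict[int, List[str]]] = {}
    for arrival, payload in records:
        batch_idx = int(arrival // batch_interval)
        # The block index is counted from the start of the batch the record falls in.
        block_idx = int((arrival - batch_idx * batch_interval) // block_interval)
        batches.setdefault(batch_idx, {}).setdefault(block_idx, []).append(payload)
    # Batches are returned in time order; each batch is a list of blocks in time order.
    return [[blocks[b] for b in sorted(blocks)] for _, blocks in sorted(batches.items())]


stream = [(0.1, "a"), (0.4, "b"), (0.6, "c"), (1.2, "d")]
print(make_batches(stream, block_interval=0.5, batch_interval=1.0))
```

With a 0.5 s block interval and a 1 s batch interval, the first batch holds two blocks and the second holds one; each block is a unit that can be processed in parallel.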
It can be seen that the execution parallelism of a batch is determined by batch interval / block interval, which is the number of data blocks in a batch. Under the same resource allocation, the lower the processing parallelism, the smaller both the resource overhead (such as task creation and interaction) and the resource utilization; large-scale parallel computation incurs a large resource overhead together with very high resource utilization. To balance resource overhead against utilization, the processing parallelism must be adjusted promptly as system states and resources change. To follow the running state and trajectory of the power dispatching automation system and to convert the analysis of the power system's health from qualitative to quantitative, the batch interval needs to be kept relatively constant. Thus, the execution parallelism of the processing system is mainly affected by the block interval.
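As a quick worked example of the ratio batch interval / block interval, the sketch below computes the number of blocks per batch for a fixed batch interval. The function name `blocks_per_batch` and the chosen values are illustrative assumptions. In Spark Streaming the block interval is set through the `spark.streaming.blockInterval` configuration property.

```python
def blocks_per_batch(batch_interval_ms: int, block_interval_ms: int) -> int:
    """Return the number of data blocks, and hence parallel tasks, in one batch."""
    if not 0 < block_interval_ms < batch_interval_ms:
        raise ValueError("block interval must be positive and smaller than the batch interval")
    return batch_interval_ms // block_interval_ms


# A 3-second batch at two candidate block intervals (values are illustrative).
for block_ms in (200, 500):
    print(block_ms, blocks_per_batch(3000, block_ms))
```

Shrinking the block interval from 500 ms to 200 ms raises the parallelism of a 3-second batch from 6 to 15 blocks, which is the lever the method adjusts while the batch interval stays fixed.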
According to the above analysis, the block interval determines the execution parallelism of the processing system and also affects its processing performance. FIG. 2 shows the effect of the block interval on processing time at data stream receiving rates of 2 MB/s and 4 MB/s, with the batch interval held constant at 3 seconds for the Reduce workflow and at 1 second for the Join workflow. For the different receiving rates, the resulting curves approximate a parabola, and the optimal block interval, which minimizes the processing time, is at the vertex of the parabola. In practice, the relationship between the block interval and the processing time is not a true parabola, owing to changes in the running environment, noise, and so on. There is no doubt, however, that the optimal block interval changes with the data receiving rate: the faster the receiving rate, the more data each block interval contains, and the slower the rate, the less it contains, and the amount of data directly affects the processing time of the processing system.
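To make the parabola observation concrete, the sketch below fits a parabola through three measured (block interval, processing time) points and returns its vertex as a rough estimate of the optimal block interval. This is only an illustration of the geometric idea, not part of the patented method, which uses a greedy search instead; the function name `parabola_vertex` and the sample measurements are hypothetical, and since the real curve is only approximately parabolic the result is an estimate at best.

```python
from typing import Tuple

Point = Tuple[float, float]  # (block interval in ms, processing time in s)


def parabola_vertex(p1: Point, p2: Point, p3: Point) -> float:
    """Return the x-coordinate of the vertex of the parabola through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    if denom == 0:
        raise ValueError("the three block intervals must be distinct")
    # Coefficients of y = a*x^2 + b*x + c (c is not needed for the vertex).
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3 ** 2 * (y1 - y2) + x2 ** 2 * (y3 - y1) + x1 ** 2 * (y2 - y3)) / denom
    if a <= 0:
        raise ValueError("points do not describe a parabola with a minimum")
    return -b / (2 * a)


# Hypothetical measurements: equal times at 200 ms and 400 ms put the vertex midway.
estimate = parabola_vertex((100, 5.0), (200, 2.0), (400, 2.0))
print(round(estimate))
```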
Based on the above observations, for a given batch interval, the processing time of the query task can be optimized by adjusting the size of the block interval.
Step 3: when the log data stream is analyzed in real time, the processing time of query tasks is reduced by dynamically adjusting the Spark Streaming block interval according to the relation model of step 2.
Based on the condition under which the data stream processing time is minimized, the optimal block interval is found promptly with a greedy method, and the block interval is adjusted dynamically as the resources and state of the processing system change, thereby reducing the processing time of query tasks.
The optimization objective of the invention is to ensure that, for each batch processed by the processing system, the block interval used to receive the next batch of data has already been determined. As can be seen from FIG. 2, if the selected initial block interval is too small or too large, finding the optimal block interval takes a long time. As a compromise that avoids frequent exploration, batch interval / 2 is selected as the initial block interval, and the block interval is then increased or decreased step by step until the processing time can no longer be reduced.
Table 2 gives the algorithm for calculating the next block interval. The initial block interval is denoted β and the adjustment step size is i; during the calculation, β denotes the next block interval. P1 and P2 denote the processing times of the previous two batches.
The dynamic adjustment strategy based on the greedy algorithm is shown in table 2:
TABLE 2
[Table 2 is provided as an image in the original publication and is not reproduced here.]
The calculation consists of two main parts. If the batch processing time with block interval β is less than that with block interval β + i, the optimal block interval lies to the left of the initial block interval; if the batch processing time with block interval β is less than that with block interval β - i, the optimal block interval lies to the right of the initial block interval. Once the direction of the optimal block interval has been found, the search continues in that direction until the processing time can no longer be reduced.
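The greedy search described in this paragraph can be sketched as follows. Because Table 2 is only available as an image, this is one reading of the prose rather than the patent's exact algorithm: the function name `find_block_interval`, the callback `measure` (which stands in for running one batch at a given block interval and timing it), and the quadratic cost used in the example are all assumptions.

```python
from typing import Callable


def find_block_interval(measure: Callable[[int], float], batch_interval: int,
                        step: int) -> int:
    """Greedy search for the block interval with the lowest batch processing time."""
    beta = batch_interval // 2            # initial block interval: half the batch interval
    best = measure(beta)
    # Probe to the right first; if that does not help, probe to the left.
    for direction in (+1, -1):
        moved = False
        candidate = beta + direction * step
        while 0 < candidate < batch_interval:
            elapsed = measure(candidate)
            if elapsed >= best:
                break                     # processing time can no longer be reduced
            beta, best, moved = candidate, elapsed, True
            candidate = beta + direction * step
        if moved:
            break                         # direction found and fully explored
    return beta


def cost(block_ms: int) -> float:
    """Hypothetical processing-time curve with its minimum at 300 ms."""
    return (block_ms - 300) ** 2 + 1000.0


print(find_block_interval(cost, batch_interval=1000, step=50))
```

In a live system each call to `measure` would correspond to one processed batch, so the number of probes is the number of batches spent converging; a smaller step gives a finer result at the cost of more probes.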
If the data receiving rate and the system running environment remain the same, the optimal block interval remains stable. When the running environment changes, however, the optimal block interval changes as well, and the algorithm must readjust promptly to adapt to the new environment. Because restarting the search from the beginning would prolong convergence, the invention takes the block interval in use before the change as the initial block interval and restarts the greedy adjustment from there.
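The benefit of restarting from the previous block interval can be shown by counting probes. The sketch below is an illustration under assumptions: `greedy_search` is a hypothetical helper that takes an explicit starting point, the two quadratic cost curves stand in for the environment before and after a change, and how a change is detected is outside its scope.

```python
from typing import Callable, Tuple


def greedy_search(measure: Callable[[int], float], start: int, step: int,
                  upper: int) -> Tuple[int, int]:
    """Return (block interval, number of measurements) for a greedy search from `start`."""
    beta, best, probes = start, measure(start), 1
    for direction in (+1, -1):
        moved = False
        candidate = beta + direction * step
        while 0 < candidate < upper:
            elapsed = measure(candidate)
            probes += 1
            if elapsed >= best:
                break
            beta, best, moved = candidate, elapsed, True
            candidate = beta + direction * step
        if moved:
            break
    return beta, probes


def old_env(block_ms: int) -> float:
    return float((block_ms - 200) ** 2)   # hypothetical curve, optimum at 200 ms


def new_env(block_ms: int) -> float:
    return float((block_ms - 150) ** 2)   # hypothetical curve after the change, optimum at 150 ms


previous, _ = greedy_search(old_env, start=500, step=50, upper=1000)
warm = greedy_search(new_env, start=previous, step=50, upper=1000)  # restart from old optimum
cold = greedy_search(new_env, start=500, step=50, upper=1000)       # restart from batch/2
print(warm, cold)
```

Both restarts reach the same block interval, but the warm restart needs fewer measurements whenever the new optimum lies near the old one, which is the convergence saving the paragraph describes.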

Claims (2)

1. A method for processing log data of a power system in real time based on Spark Streaming, characterized by comprising the following steps:
(1) defining a statistical model of different log categories, the statistical model comprising the elements: data sets, result sets, grouping conditions, grouping filters, and rule actions;
(2) constructing a relation model of the Spark Streaming block interval and the data stream processing time, wherein the time interval at which the data stream is divided into data blocks is the block interval, and the time interval at which several data blocks are combined into a batch is the batch interval;
(3) given a batch interval, dynamically adjusting the block interval by using a greedy algorithm and searching for the optimal block interval;
the greedy algorithm comprises the following steps:
(3.1) the initial block interval is expressed as beta, and the adjustment step size is i;
(3.2) if the batch processing time of the block interval β is less than the batch processing time of the block interval β + i, the optimal block interval is to the left of the initial block interval; if the batch processing time with the block interval beta is less than the batch processing time with the block interval beta-i, the optimal block interval is on the right side of the initial block interval;
(3.3) once the direction of the optimal block interval has been found, continuing the search in that direction until the processing time can no longer be reduced.
2. The Spark Streaming based power system log data real-time processing method according to claim 1, characterized in that constructing the relation model in step (2) comprises the following steps:
(2.1) the batching module dividing the received data stream into independent data blocks according to the block interval;
(2.2) wrapping the data blocks in one batch interval time into a batch, and entering a batch queue to queue for being processed;
(2.3) processing the data of all block intervals in one batch interval time in parallel.
CN201710951969.0A 2017-10-13 2017-10-13 Real-time processing method for log data of power system based on spark streaming Active CN107704594B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710951969.0A CN107704594B (en) 2017-10-13 2017-10-13 Real-time processing method for log data of power system based on spark streaming

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710951969.0A CN107704594B (en) 2017-10-13 2017-10-13 Real-time processing method for log data of power system based on spark streaming

Publications (2)

Publication Number Publication Date
CN107704594A CN107704594A (en) 2018-02-16
CN107704594B true CN107704594B (en) 2021-02-09

Family

ID=61183445

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710951969.0A Active CN107704594B (en) 2017-10-13 2017-10-13 Real-time processing method for log data of power system based on spark streaming

Country Status (1)

Country Link
CN (1) CN107704594B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109831316A (en) * 2018-12-17 2019-05-31 国网浙江省电力有限公司 Massive logs real-time analyzer, real-time analysis method and readable storage medium storing program for executing
CN112632020B (en) * 2020-12-25 2022-03-18 中国电子科技集团公司第三十研究所 Log information type extraction method and mining method based on spark big data platform

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104616205A (en) * 2014-11-24 2015-05-13 北京科东电力控制系统有限责任公司 Distributed log analysis based operation state monitoring method of power system
CN105005585A (en) * 2015-06-24 2015-10-28 上海卓悠网络科技有限公司 Log data processing method and device
CN105677489A (en) * 2016-03-04 2016-06-15 山东大学 System and method for dynamically setting batch intervals under disperse flow processing model
CN106168909A (en) * 2016-06-30 2016-11-30 北京奇虎科技有限公司 A kind for the treatment of method and apparatus of daily record
CN106227832A (en) * 2016-07-26 2016-12-14 浪潮软件股份有限公司 The Internet big data technique framework application process in operational analysis in enterprise
CN106778033A (en) * 2017-01-10 2017-05-31 南京邮电大学 A kind of Spark Streaming abnormal temperature data alarm methods based on Spark platforms

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2010065067A1 (en) * 2008-11-20 2010-06-10 Bodymedia, Inc. Method and apparatus for determining critical care parameters
US8826218B2 (en) * 2012-07-30 2014-09-02 Synopsys, Inc. Accurate approximation of the objective function for solving the gate-sizing problem using a numerical solver
CN106462578B (en) * 2014-04-01 2019-11-19 华为技术有限公司 The method they data base entries inquiry and updated
US9699205B2 (en) * 2015-08-31 2017-07-04 Splunk Inc. Network security system
CN105868019B (en) * 2016-02-01 2019-05-21 中国科学院大学 A kind of Spark platform property automatic optimization method
CN106547854B (en) * 2016-10-20 2020-12-15 天津大学 Distributed file system storage optimization energy-saving method based on glowworm firefly algorithm
CN106599182B (en) * 2016-12-13 2019-10-11 飞狐信息技术(天津)有限公司 Feature Engineering recommended method and device, video website based on spark streaming real-time streams
CN106936812B (en) * 2017-01-10 2019-12-20 南京邮电大学 File privacy disclosure detection method based on Petri network in cloud environment

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Faster Stateful Stream Processing in Apache Spark Streaming;weiqing687;《https://blog.csdn.net/qq_26222859/article/details/54836445》;20170202;1-4页 *
Spark Streaming场景应用- Spark Streaming计算模型及监控;javastart;《https://blog.csdn.net/javastart/article/details/77510886》;20170823;1-4页 *
Spark Streaming性能调优详解Spark;w397090770;《https://www.iteblog.com/archives/1333.html》;20150428;1-3页 *
基于ELK Stack和Spark Streaming的日志处理平台设计与实现;村里的intern;《https://blog.csdn.net/bigstar863/article/details/49099531》;20151013;1-7页 *
基于Spark大数据平台日志审计系统的设计与实现;张彬;《中国优秀硕士学位论文全文数据库 信息科技辑》;20160415(第4期);I138-360页 *
基于Spark的电力系统日志数据的分析处理;涂金林;《中国优秀硕士学位论文全文数据库 工程科技Ⅱ辑》;20180415(第4期);C042-741页 *

Also Published As

Publication number Publication date
CN107704594A (en) 2018-02-16

Similar Documents

Publication Publication Date Title
CN106648904B (en) Adaptive rate control method for streaming data processing
US20210152489A1 (en) Terminating data server nodes
US9251205B2 (en) Streaming delay patterns in a streaming environment
US9063973B2 (en) Method and apparatus for optimizing access path in database
US9436513B2 (en) Method of SOA performance tuning
US20070143246A1 (en) Method and apparatus for analyzing the effect of different execution parameters on the performance of a database query
CN103345514A (en) Streamed data processing method in big data environment
US20230146912A1 (en) Method, Apparatus, and Computing Device for Constructing Prediction Model, and Storage Medium
WO2023011236A1 (en) Compilation optimization method for program source code, and related product
CN107704594B (en) Real-time processing method for log data of power system based on spark streaming
WO2021169271A1 (en) Training method for thunderstorm weather prediction model, and thunderstorm weather prediction method
CN106780149A (en) A kind of equipment real-time monitoring system based on timed task scheduling
CN110413927B (en) Optimization method and system based on matching instantaneity in publish-subscribe system
CN114185885A (en) Streaming data processing method and system based on column storage database
CN112631754A (en) Data processing method, data processing device, storage medium and electronic device
CN116501805A (en) Stream data system, computer equipment and medium
CN111352820A (en) Method, equipment and device for predicting and monitoring running state of high-performance application
KR20170130178A (en) In-Memory DB Connection Support Type Scheduling Method and System for Real-Time Big Data Analysis in Distributed Computing Environment
Heintz et al. Towards optimizing wide-area streaming analytics
CN112783740B (en) Server performance prediction method and system based on time series characteristics
CN114257618A (en) Vehicle operation data real-time analysis system based on Internet of vehicles platform
CN114401496A (en) Video information rapid processing method based on 5G edge calculation
CN114595842A (en) Advertisement playing equipment group event studying and judging method and system based on real-time calculation
CN107566187B (en) SLA violation monitoring method, device and system
CN117350607B (en) International logistics transportation path planning system of improved KNN algorithm model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant