CN114490297A - Stream data processing method and device for daily cut scene - Google Patents

Publication number
CN114490297A
Authority
CN
China
Prior art keywords
streaming data
data
time
processing
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210132708.7A
Other languages
Chinese (zh)
Inventor
李天浩
雷赛龄
杨小可
孟少川
赵正阳
黄子豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202210132708.7A
Publication of CN114490297A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3419 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a streaming data processing method and device for a day-cut (daily cut) scenario, which can be used in the financial field or other fields. The method comprises the following steps: acquiring streaming data, performing data splitting processing on the streaming data according to the driving system operation type of the streaming data, and determining the processing mode of the streaming data; determining the latest arrival time of the window corresponding to the streaming data according to the processing mode of the streaming data; and performing a window aggregation operation based on the latest arrival time of the window corresponding to the streaming data to generate an operation result, and sending the operation result downstream. By splitting the streaming data in the day-cut scenario, the invention addresses the problem of long-duration out-of-order data in that scenario, improves the accuracy of real-time data processing, improves the timeliness and availability of the data, and has practical operational value.

Description

Stream data processing method and device for daily cut scene
Technical Field
The invention relates to the technical field of stream computing in day-cut scenarios, and in particular to a streaming data processing method and device for a day-cut scenario.
Background
Jobs in a stream computing scenario rely on event-time-driven computation: data with different event times enters different time windows for computation. In a bank day-cut scenario, however, the event time in upstream data may become a future time during a certain period, and such data erroneously advances the job's time to that future time as soon as it enters the job. When the upstream sends data with normal times after the day cut ends, the job mistakes that data for late data and routes it to an error-handling branch, where it is discarded or left out of its window.
Such erroneous future-time data can cause a large amount of business data to be classified as late and discarded, and can also produce incorrect results. At present, there is no good technical solution to the problem of long-duration out-of-order data in bank day-cut scenarios.
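The failure described above can be reproduced with a small simulation. This is only a minimal sketch under stated assumptions, not the patent's implementation: the 30-second tolerance, the year 2021, and the name `find_late_events` are all chosen for illustration.

```python
from datetime import datetime, timedelta

# Assumed out-of-order tolerance; the patent does not give a concrete value.
MAX_OUT_OF_ORDERNESS = timedelta(seconds=30)


def find_late_events(event_times):
    """Return the events whose timestamp is already behind the watermark on arrival."""
    watermark = datetime.min
    late = []
    for ts in event_times:
        if ts < watermark:
            late.append(ts)
        # The watermark only moves forward, following the largest timestamp seen so far.
        watermark = max(watermark, ts - MAX_OUT_OF_ORDERNESS)
    return late


stream = [
    datetime(2021, 11, 10, 23, 49, 0),  # normal record before the day cut
    datetime(2021, 11, 11, 23, 51, 0),  # day-cut record: date already advanced by one
    datetime(2021, 11, 11, 0, 0, 5),    # normal record after the day cut ends
    datetime(2021, 11, 11, 0, 1, 0),    # normal record
]
late_events = find_late_events(stream)
print(late_events)
```

A single future-dated record pushes the watermark almost a full day ahead, so every normal record that follows is classified as late.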
Disclosure of Invention
Aiming at the problems in the prior art, embodiments of the present invention mainly aim to provide a method and an apparatus for processing streaming data in a daily cut scene, so as to solve the problem of long-time data disorder in the daily cut scene and improve the accuracy of real-time data processing.
In order to achieve the above object, an embodiment of the present invention provides a streaming data processing method for a daily cut scene, where the method includes:
acquiring streaming data, and performing data splitting processing on the streaming data according to the driving system operation type of the streaming data to determine a processing mode of the streaming data;
determining the latest arrival time of a window corresponding to the streaming data according to the processing mode of the streaming data;
and carrying out window aggregation operation on the latest arrival time of the window corresponding to the streaming data to generate an operation result, and carrying out data transmission processing on the operation result.
Optionally, in an embodiment of the present invention, the method further includes:
determining late data in the streaming data according to a preset window time;
and according to the latest arrival time of the window corresponding to the late data, performing late data analysis processing and late data sending processing on the late data.
Optionally, in an embodiment of the present invention, the performing, according to the operation type of the driving system of the streaming data, data splitting processing on the streaming data, and determining a processing mode of the streaming data includes:
if the operation type of the driving system of the streaming data is a system time type, determining that the processing mode of the streaming data is a system time processing mode;
and if the operation type of the driving system of the streaming data is the timestamp type, determining that the processing mode of the streaming data is an event time processing mode.
Optionally, in an embodiment of the present invention, the determining, according to the processing mode of the streaming data, the latest arrival time of the window corresponding to the streaming data includes:
if the processing mode of the streaming data is a system time processing mode, determining the latest arrival time of a window corresponding to the streaming data according to the system time corresponding to the streaming data and the preset maximum allowable event timeout time;
and if the processing mode of the streaming data is an event time processing mode, determining the latest arrival time of a window corresponding to the streaming data according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time.
Optionally, in an embodiment of the present invention, the determining, according to the system time corresponding to the streaming data and a preset maximum allowable event timeout time, the latest arrival time of the window corresponding to the streaming data includes:
determining a timestamp corresponding to the streaming data according to the system time corresponding to the streaming data and the system entering delay time;
and determining the latest arrival time of the window corresponding to the streaming data according to the timestamp corresponding to the streaming data and the preset maximum allowable event timeout time.
The embodiment of the invention also provides a streaming data processing device for the daily cut scene, which comprises:
the splitting processing module is used for acquiring streaming data, performing data splitting processing on the streaming data according to the driving system operation type of the streaming data, and determining the processing mode of the streaming data;
the arrival time module is used for determining the latest arrival time of a window corresponding to the streaming data according to the processing mode of the streaming data;
and the data processing module is used for carrying out window aggregation operation on the latest arrival time of the window corresponding to the streaming data to generate an operation result and carrying out data transmission processing on the operation result.
Optionally, in an embodiment of the present invention, the apparatus further includes: the late data module is used for determining late data in the streaming data according to preset window time; and according to the latest arrival time of the window corresponding to the late data, performing late data analysis processing and late data sending processing on the late data.
Optionally, in an embodiment of the present invention, the split processing module includes:
a system time unit, configured to determine that a processing mode of the streaming data is a system time processing mode if a driving system operation type of the streaming data is a system time type;
and the event time unit is used for determining that the processing mode of the streaming data is an event time processing mode if the operation type of the driving system of the streaming data is a timestamp type.
Optionally, in an embodiment of the present invention, the arrival time module includes:
a first arrival time unit, configured to determine, if the processing mode of the streaming data is a system time processing mode, a latest arrival time of a window corresponding to the streaming data according to a system time corresponding to the streaming data and a preset maximum allowable event timeout time;
and the second arrival time unit is used for determining the latest arrival time of the window corresponding to the streaming data according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time if the processing mode of the streaming data is the event time processing mode.
Optionally, in an embodiment of the present invention, the first arrival time unit includes:
the time stamp subunit is used for determining a time stamp corresponding to the streaming data according to the system time corresponding to the streaming data and the system entering delay time;
and the arrival time subunit is used for determining the latest arrival time of the window corresponding to the streaming data according to the timestamp corresponding to the streaming data and the preset maximum allowable event timeout time.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method when executing the program.
The present invention also provides a computer-readable storage medium storing a computer program for executing the above method.
By splitting the streaming data in the day-cut scenario, the invention addresses the problem of long-duration out-of-order data in that scenario, improves the accuracy of real-time data processing, improves the timeliness and availability of the data, and has practical operational value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
Fig. 1 is a flowchart of a streaming data processing method for a daily cut scene according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of processing locally out-of-order streaming data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the streaming data processing of a system to which the streaming data processing method of a day-cut scenario is applied in an embodiment of the present invention;
Fig. 4 is a flowchart of processing late data in an embodiment of the present invention;
Fig. 5 is a flowchart of determining a processing mode in an embodiment of the present invention;
Fig. 6 is a flowchart of determining the latest arrival time of a window in an embodiment of the present invention;
Fig. 7 is a flowchart of determining the latest arrival time of a window in another embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a streaming data processing apparatus for a day-cut scenario according to an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a streaming data processing apparatus for a day-cut scenario according to another embodiment of the present invention;
Fig. 10 is a schematic structural diagram of the splitting processing module in an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of the arrival time module in an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of the first arrival time unit in an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The embodiment of the invention provides a streaming data processing method and device for a day-cut scene, which can be used in the financial field and other fields.
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 2 is a schematic diagram of processing locally out-of-order streaming data based on event time. The invention improves on Fig. 2 to obtain the event-time-based processing of streaming data in a bank day-cut scenario shown in Fig. 3; specifically, Fig. 3 is a processing flow diagram of a streaming data processing system that applies the method shown in Fig. 1. The system in Fig. 3 uses Flink, a distributed stream data processing engine that processes stream data in a pipelined manner and supports high-throughput, low-latency, high-performance stateful computation. Flink supports representing the generation time of data with EventTime (a timestamp) and performing operations based on this event time. The data processing flow of the system shown in Fig. 3 is described below.
Fig. 1 is a flowchart of a streaming data processing method for a day-cut scenario according to an embodiment of the present invention. The execution subject of the method provided by the embodiment of the present invention includes, but is not limited to, a computer. The method shown in Fig. 1 comprises:
Step S1, acquiring the streaming data, performing data splitting processing on the streaming data according to the driving system operation type of the streaming data, and determining the processing mode of the streaming data.
The day-cut scenario may specifically be a bank day-cut scenario. At the end of each day, a bank's host generally processes a large number of clearing and reconciliation batch tasks, and these batch tasks occupy considerable computing resources. On certain dates, such as before 24:00 on the eve of shopping festivals like Double Eleven (November 11), the date field in the transaction information is switched in advance to the same time on the next day according to the batch transaction day-switching table, so that the batch tasks are triggered and settled early and do not compete for computing resources with the large volume of imminent online transactions. Streaming data refers to data that can be collected continuously from many places; it has numerous sources, complex formats, and a huge volume.
Further, the data source of the streaming job is host transaction data in Kafka (a high-throughput distributed publish-subscribe message system). After receiving the transaction data in real time, the job uses the TimeStamp field in the transaction data as the event generation time, performs an aggregation operation once every 5 minutes, and sends the resulting statistics to a downstream distributed publish-subscribe message system.
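The 5-minute aggregation keyed on the TimeStamp field can be sketched as follows. This is a hedged illustration in plain Python rather than Flink code: timestamps are assumed to be epoch seconds, and the aggregated `amount` field and the function names are hypothetical.

```python
from collections import defaultdict

WINDOW_SIZE_S = 5 * 60  # the 5-minute aggregation interval from the text


def window_start(timestamp_s: int) -> int:
    """Return the start of the 5-minute window that contains the timestamp."""
    return timestamp_s - timestamp_s % WINDOW_SIZE_S


def aggregate_by_window(records):
    """Sum a hypothetical amount per window; records are (timestamp_s, amount) pairs."""
    totals = defaultdict(float)
    for timestamp_s, amount in records:
        totals[window_start(timestamp_s)] += amount
    return dict(totals)


totals = aggregate_by_window([(0, 10.0), (299, 5.0), (300, 1.0)])
print(totals)
```

Records at 0 s and 299 s share the first window, while the record at 300 s opens the next one.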
Further, at the start of an e-commerce shopping festival, for example at 23:50 on the night of November 10 each year, the host transaction data undergoes the day cut according to the host day-cut table: the date of all data generated between 23:50 and 24:00 is increased by one, so the timestamp becomes 23:5X on November 11. When the day cut ends at 00:00 on November 11, all host data returns to normal time, that is, 00:00 on November 11.
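The date shift in this example can be written out as a small function. It is a sketch of the upstream behaviour described in the text, not code from the host system; the name `apply_day_cut` and the year 2021 are assumptions.

```python
from datetime import datetime, time, timedelta

DAY_CUT_START = time(23, 50)  # start of the day-cut period in the example


def apply_day_cut(ts: datetime) -> datetime:
    """Advance the date by one day for records generated from 23:50 until midnight."""
    if ts.time() >= DAY_CUT_START:
        return ts + timedelta(days=1)
    return ts


shifted = apply_day_cut(datetime(2021, 11, 10, 23, 55))
print(shifted)
```

A record generated at 23:55 on November 10 therefore carries a timestamp of 23:55 on November 11, almost a full day ahead of the records generated just after midnight.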
Data sent by the upstream host cannot be guaranteed to reach the system strictly in the order of data generation time, because of network delay, inconsistent machine system times, inconsistent transmission times, and similar problems; transaction information generated earlier may arrive later than transaction information generated later. Streaming data processing systems are generally designed for this condition, for example as shown in Fig. 2. In Fig. 2, the data source of the streaming job is distributed publish-subscribe message system 1. After the streaming data is acquired, it enters a data receiving module for basic processing, and a TimeStamp and a WaterMark are generated for each piece of data through the timestamp and watermark generator and the window module. The timestamp is extracted directly from the timestamp field of each piece of data and serves as the generation time of the event. Finally, the result is sent through a data sending module to the downstream destination, distributed publish-subscribe message system 2.
Further, WaterMark indicates that all data before this time has arrived, i.e., it is the latest arrival time of the window. To obtain the latest arrival time of the window, two values need to be estimated according to the business content: the maximum allowed event timeout time (maxOutofOrderness) and the window timeout release time (allowLateNess). For each event, the latest arrival time of the window equals the timestamp field in the transaction data minus the maximum allowed event timeout time; when the latest arrival time of the window computed from a piece of data exceeds the window end time, the window immediately triggers the aggregation operation. This ensures that the window is not computed while out-of-order data may still arrive within the maximum allowed event timeout time; the window calculation is triggered only after that time has passed. Specifically, the maximum allowed event timeout time is the maximum allowed event disorder time, i.e., the longest event disorder that is tolerated. The window timeout release time is the longest time the window calculation waits for late data: late data that arrives before this time is exceeded triggers the window calculation again, and the value obtained by the latest window aggregation operation is sent downstream as an update.
Further, the latest arrival time of the window describes the latest time at which an event is allowed to arrive within a streaming window. When the timestamp of a newly arriving event is greater than the latest arrival time of the window, the system considers that all data before the latest arrival time of the window has arrived, immediately triggers the window calculation, and returns a result value. In addition, the timestamp field in the transaction data is a field of the data sent by the upstream system and records when that piece of data was generated upstream, that is, when the event occurred.
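The watermark mechanism can be sketched in a few lines. This is a simplified illustration, not Flink's API: the class `WatermarkTracker` and the function `window_should_fire` are hypothetical names, times are assumed to be integer seconds, and the window is taken to fire once the watermark exceeds the window end, as the text states.

```python
class WatermarkTracker:
    """Tracks the latest arrival time of the window (the watermark) for one stream."""

    def __init__(self, max_out_of_orderness: int):
        self.max_out_of_orderness = max_out_of_orderness
        self.watermark = float("-inf")

    def observe(self, event_time: int) -> float:
        # WaterMark = EventTime - maxOutofOrderness, and it never moves backwards.
        candidate = event_time - self.max_out_of_orderness
        self.watermark = max(self.watermark, candidate)
        return self.watermark


def window_should_fire(watermark: float, window_end: int) -> bool:
    """The window aggregation is triggered once the watermark exceeds the window end."""
    return watermark > window_end


tracker = WatermarkTracker(max_out_of_orderness=10)
first = tracker.observe(100)   # watermark becomes 90
second = tracker.observe(95)   # an older event does not move the watermark back
third = tracker.observe(320)   # watermark becomes 310
print(first, second, third, window_should_fire(third, window_end=300))
```

With a tolerance of 10 seconds, the window ending at 300 stays open until an event with a timestamp above 310 is seen, which leaves room for events that arrive slightly out of order.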
As shown in Fig. 3, after the streaming data is acquired, it enters a data receiving module for basic processing; once that processing is complete, an event splitting module splits the events.
As an embodiment of the present invention, as shown in fig. 5, performing data splitting processing on streaming data according to a driving system operation type of the streaming data, and determining a processing mode of the streaming data includes:
step S31, if the driving system operation type of the streaming data is the system time type, determining that the processing mode of the streaming data is the system time processing mode;
step S32, if the driving system operation type of the streaming data is the timestamp type, determining that the processing mode of the streaming data is the event time processing mode.
The specific splitting criteria are as follows:
In a bank day-cut scenario, data in a certain period carries timestamps shifted by one day. In addition, system clocks may be misaligned, delays differ between data links, data with normal times may be sent to the system during the abnormal period, and data with abnormal times may be sent during the normal period. A mode that triggers streaming computation based on system time therefore needs to be enabled, and the mode switching rule is derived as follows:
event time processing mode: A denotes the mode in which the timestamp (EventTime) in the data represents the time the event occurred and serves as the time that drives the system's computation;
system time processing mode: B denotes the mode in which the system time (SystemTime) at which the data is processed represents the time the event occurred and serves as the time that drives the system's computation;
each piece of data has a delay between being sent upstream and entering the streaming system for computation, and this delay is denoted ForwardDelayTime;
maxOutofOrderness is the maximum allowable event timeout time used above to calculate the latest arrival time of the window;
for the system time processing mode, the following equation holds:
B-WaterMark = SystemTime - A-ForwardDelayTime - A-maxOutofOrderness (1)
for the event time processing mode, the following equations hold:
EventTime = SystemTime - A-ForwardDelayTime (2)
A-WaterMark = EventTime - A-maxOutofOrderness (3)
wherein EventTime is the timestamp, WaterMark is the latest arrival time of the window, SystemTime is the system time, ForwardDelayTime is the delay time, and maxOutofOrderness is the maximum allowable event timeout time; the prefixes A- and B- mark quantities of the event time processing mode and the system time processing mode respectively, while a minus sign with a space on each side denotes subtraction.
That is, for the event time processing mode, the upstream delay A-ForwardDelayTime can be regarded approximately as the exact delay of each individual piece of data. The event time processing mode can therefore be understood as the ideal case of the system time processing mode.
Therefore, when the system time is used as the event time to drive the computation, an average upstream delay needs to be estimated as an approximation. This value can be obtained statistically by averaging the difference between the system time at which events reach the streaming system and their event time.
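The averaging described here can be sketched as follows. The function name `estimate_forward_delay` and the sample layout are assumptions; in practice the samples would presumably be collected outside the day-cut period, when event times are trustworthy.

```python
from statistics import mean


def estimate_forward_delay(samples):
    """Average of (system time - event time) over sampled records.

    samples: iterable of (system_time, event_time) pairs in seconds, where
    system_time is when the record reached the streaming system.
    """
    return mean(system_time - event_time for system_time, event_time in samples)


# Three sampled records that arrived 2, 4 and 3 seconds after they were generated.
delay = estimate_forward_delay([(102, 100), (204, 200), (303, 300)])
print(delay)
```

The resulting value stands in for the per-record delay when the system time processing mode derives an approximate event time.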
In summary, the mode switching condition between the system time processing mode and the event time processing mode should be:
SystemTime - EventTime > B-ForwardDelayTime (4)
If condition (4) is satisfied, the job switches to the system time processing mode; otherwise, it switches to the event time processing mode. In this way the problem of long-duration out-of-order data in a day-cut scenario is addressed, the accuracy of real-time data processing is improved, and the timeliness and availability of the data are improved.
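Condition (4) translates directly into a small decision function. This sketch applies the inequality exactly as written above; the names `Mode` and `choose_mode` are hypothetical, and all times are assumed to be in the same unit.

```python
from enum import Enum


class Mode(Enum):
    SYSTEM_TIME = "system_time"
    EVENT_TIME = "event_time"


def choose_mode(system_time: float, event_time: float, b_forward_delay: float) -> Mode:
    """Apply condition (4): SystemTime - EventTime > B-ForwardDelayTime."""
    if system_time - event_time > b_forward_delay:
        return Mode.SYSTEM_TIME
    return Mode.EVENT_TIME


# The gap of 10 exceeds the estimated delay of 5, so system time drives the job.
mode_large_gap = choose_mode(system_time=1000, event_time=990, b_forward_delay=5)
# The gap of 2 is within the estimated delay, so event time is trusted.
mode_small_gap = choose_mode(system_time=1000, event_time=998, b_forward_delay=5)
print(mode_large_gap, mode_small_gap)
```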
Step S2, determining the latest arrival time of the window corresponding to the streaming data according to the processing mode of the streaming data.
In this embodiment, as shown in fig. 6, determining the latest arrival time of the window corresponding to the streaming data according to the processing mode of the streaming data includes:
step S41, if the processing mode of the streaming data is the system time processing mode, determining the latest arrival time of the window corresponding to the streaming data according to the system time corresponding to the streaming data and the preset maximum allowable event timeout time;
step S42, if the processing mode of the streaming data is the event time processing mode, determining the latest arrival time of the window corresponding to the streaming data according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time.
In this embodiment, as shown in fig. 7, determining the latest arrival time of the window corresponding to the streaming data according to the system time corresponding to the streaming data and the preset maximum allowable event timeout time includes:
step S51, determining a timestamp corresponding to the streaming data according to the system time corresponding to the streaming data and the system entering delay time;
and step S52, determining the latest arrival time of the window corresponding to the streaming data according to the timestamp corresponding to the streaming data and the preset maximum allowable event timeout time.
As shown in Fig. 3, after splitting, events in the system time processing mode flow into timestamp and watermark generator 1, where the event time and the watermark of the data are calculated with formulas (5) and (6) to obtain the latest arrival time of the window corresponding to the streaming data. The accuracy of real-time data processing is thereby improved, and the timeliness and availability of the data are improved.
In this embodiment, determining the latest arrival time of the window corresponding to the streaming data according to the system time corresponding to the streaming data and the preset maximum allowable event timeout time further includes: determining a difference value between the system time corresponding to the streaming data and the system entering delay time according to the system time corresponding to the streaming data and the system entering delay time, and taking the difference value as a timestamp corresponding to the streaming data; and determining a difference between the timestamp and the maximum allowable event timeout time according to the timestamp and the preset maximum allowable event timeout time, and taking the difference as the latest arrival time of the window corresponding to the streaming data, which is specifically shown in formulas (5) to (6).
EventTime=SystemTime-ForwardDelayTime (5)
WaterMark=EventTime-maxOutofOrderness (6)
Wherein EventTime is the timestamp, WaterMark is the latest arrival time of the window, SystemTime is the system time, ForwardDelayTime is the delay time, and maxOutofOrderness is the maximum allowable event timeout time.
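Formulas (5) and (6) amount to two subtractions, shown here as a minimal sketch. The function name `system_time_watermark` is hypothetical, and all arguments are assumed to be in the same time unit.

```python
def system_time_watermark(system_time, forward_delay, max_out_of_orderness):
    """Return (EventTime, WaterMark) for a record in the system time processing mode."""
    event_time = system_time - forward_delay        # formula (5)
    watermark = event_time - max_out_of_orderness   # formula (6)
    return event_time, watermark


event_time, watermark = system_time_watermark(
    system_time=1000, forward_delay=3, max_out_of_orderness=10
)
print(event_time, watermark)
```

Because the event time is derived from the arrival time rather than from the timestamp field, a date that was advanced by the day cut cannot push the watermark into the future.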
In this embodiment, determining the latest arrival time of the window corresponding to the streaming data according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time includes: according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time, determining a difference between the event time and the maximum allowable event timeout time, and using the difference as the latest arrival time of the window corresponding to the streaming data, as shown in formula (7).
As shown in Fig. 3, events in the event time processing mode flow, with their timestamps, into watermark generator 2, where the watermark is calculated as follows to obtain the latest arrival time of the corresponding window.
WaterMark=EventTime-maxOutofOrderness (7)
Therefore, the accuracy of real-time data processing is further improved, and the timeliness and the availability of data are improved.
Step S3, performing a window aggregation operation based on the latest arrival time of the window corresponding to the streaming data to generate an operation result, and sending the operation result downstream.
As shown in the flow of Fig. 2, after the window calculation is triggered, the system does not release the window immediately. When data belonging to the window's time period arrives, the window calculation is triggered again under the latest-arrival-time mechanism and the updated window value is sent downstream. Once the latest arrival time of the window derived from a new event exceeds the window timeout release time, the window is released, and data arriving after that is discarded as late data. Finally, the computed data goes to the downstream data sending module and, after corresponding processing, is output to downstream distributed publish-subscribe message system 2. Clearly, under the flow of Fig. 2 in the bank day-cut scenario, data after 23:50 on November 10 is changed in advance into November 11 data, which pulls the job's time forward to November 11, so that all subsequent November 11 data becomes late data. Because bank transaction volume is huge, one day of data is also huge, and keeping an out-of-order time window open for a full day is not practical.
By contrast, in the flow of Fig. 3, the two kinds of data flow into the window module together for the window aggregation operation, and the processing result is sent to the processing module for subsequent processing. Some data may still arrive so late that it cannot be counted before the window is released; such data flows into the side output processing module for processing.
The window aggregation operation includes a sorting operation, a summation operation (sum), an average operation (avg), an extremum operation (max, min), and a count operation (count). Further, according to the actual service requirement, the window aggregation operation may further include data splicing.
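The listed aggregation operations can be sketched over the values collected in one window. The function name and the returned dictionary keys are illustrative assumptions; data splicing is omitted here.

```python
from typing import Dict, List, Union

Number = Union[int, float]


def aggregate_window(values: List[Number]) -> Dict[str, object]:
    """Apply the sorting, sum, avg, max, min and count operations to one window."""
    if not values:
        raise ValueError("window contains no data")
    total = sum(values)
    return {
        "sorted": sorted(values),      # sorting operation
        "sum": total,                  # summation operation
        "avg": total / len(values),    # average operation
        "max": max(values),            # extremum operations
        "min": min(values),
        "count": len(values),          # count operation
    }
```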
Specifically, for example, the two types of upstream data are in the special data format used before the modification and the Map-type data format used after the modification, respectively. The aggregation operation takes the associated event number as the primary key and performs calculations on some functional fields of the data sharing the same primary key, including logical operations, timeliness judgment, and checking whether the data meets the specification. After the calculation, the data is spliced into the modified Map-type data and sent to the subsequent downstream node for processing.
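The splicing step can be sketched as a merge keyed by the associated event number. The sketch assumes that both upstream formats have already been parsed into dictionaries; the field name `event_no` and the sample fields are hypothetical, and the field-level calculations described above are left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def splice_by_event_number(
    records: Iterable[Dict[str, object]], key: str = "event_no"
) -> List[Dict[str, object]]:
    """Merge all records sharing the same event number into one Map-type record."""
    grouped: Dict[object, Dict[str, object]] = defaultdict(dict)
    for record in records:
        # Later records with the same primary key add to or overwrite earlier fields.
        grouped[record[key]].update(record)
    # Results are returned in order of first appearance of each primary key.
    return list(grouped.values())
```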
Further, normally processed data is sent through the data sending module 2 to the downstream distributed publish-subscribe message system 2. Side output data is sent to the data sending module 1 for late data and then to the distributed publish-subscribe message system 1, which is used to analyze the late data for problem analysis.
As an embodiment of the present invention, as shown in fig. 4, the method further includes:
step S21, determining late data in the streaming data according to the preset window time;
and step S22, according to the latest arrival time of the window corresponding to the late data, performing late data analysis processing and late data sending processing on the late data.
As shown in fig. 3, some late data cannot be counted before the window is released, that is, data in the streaming data whose arrival time is later than the preset window time. Such data flows into the side output processing module for processing.
Furthermore, the late data is analyzed and sent according to the latest arrival time of the window corresponding to the late data. Therefore, the accuracy of real-time data processing is improved, and the timeliness and the availability of data are improved.
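Step S21 can be illustrated by a small routing sketch that separates late data from normally processed data. The function name, the tuple layout of a record, and the millisecond units are assumptions for illustration only.

```python
from typing import List, Tuple

Record = Tuple[int, str]  # (arrival time in ms, payload); hypothetical layout


def split_late_data(
    records: List[Record], preset_window_time_ms: int
) -> Tuple[List[str], List[str]]:
    """Return (normal output, side output) according to the preset window time."""
    main_output: List[str] = []
    side_output: List[str] = []
    for arrival_ms, payload in records:
        if arrival_ms > preset_window_time_ms:
            side_output.append(payload)   # late data, sent on for late data analysis
        else:
            main_output.append(payload)   # counted in the window as usual
    return main_output, side_output
```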
The invention provides a solution to the data disorder problem of the day-cut scene in an event-time-based stream processing model. It solves not only the local disorder that may occur in upstream data but also the long-duration data disorder of the bank day-cut type, and it also takes into account the case where the two scenes are mixed. The invention places no extra storage pressure on hardware equipment and provides a fallback measure for data that ultimately remains in error. The accuracy of real-time data processing is improved, the timeliness and the usability of data are improved, and the method has great practical value.
Fig. 8 is a schematic structural diagram of a streaming data processing apparatus for a day-cut scene according to an embodiment of the present invention, where the apparatus includes:
the shunting processing module 10 is configured to acquire streaming data, perform data distribution processing on the streaming data according to an operation type of a driving system of the streaming data, and determine a processing mode of the streaming data;
an arrival time module 20, configured to determine, according to a processing mode of the streaming data, a latest arrival time of a window corresponding to the streaming data;
and the data processing module 30 is configured to perform window aggregation operation on the latest arrival time of the window corresponding to the streaming data to generate an operation result, and perform data transmission processing on the operation result.
As an embodiment of the present invention, as shown in fig. 9, the apparatus further includes: a late data module 40, configured to determine late data in the streaming data according to a preset window time; and according to the latest arrival time of the window corresponding to the late data, performing late data analysis processing and late data sending processing on the late data.
As an embodiment of the present invention, as shown in fig. 10, the shunting processing module 10 includes:
a system time unit 11, configured to determine that a processing mode of the streaming data is a system time processing mode if the operation type of the driving system of the streaming data is a system time type;
an event time unit 12, configured to determine that a processing mode of the streaming data is an event time processing mode if the driving system operation type of the streaming data is a timestamp type.
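The decision made by units 11 and 12 can be sketched as a simple dispatch. The enum names and values are hypothetical and only mirror the two operation types and two processing modes named above.

```python
from enum import Enum


class OperationType(Enum):
    SYSTEM_TIME = "system_time"   # driving system operates on system time
    TIMESTAMP = "timestamp"       # driving system operates on timestamps


class ProcessingMode(Enum):
    SYSTEM_TIME = "system_time_processing"
    EVENT_TIME = "event_time_processing"


def determine_processing_mode(operation_type: OperationType) -> ProcessingMode:
    """Map the driving system operation type to the processing mode."""
    if operation_type is OperationType.SYSTEM_TIME:
        return ProcessingMode.SYSTEM_TIME   # role of system time unit 11
    if operation_type is OperationType.TIMESTAMP:
        return ProcessingMode.EVENT_TIME    # role of event time unit 12
    raise ValueError(f"unsupported operation type: {operation_type!r}")
```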
In this embodiment, as shown in fig. 11, the arrival time module 20 includes:
a first arrival time unit 21, configured to determine, if the processing mode of the streaming data is a system time processing mode, a latest arrival time of a window corresponding to the streaming data according to a system time corresponding to the streaming data and a preset maximum allowable event timeout time;
a second arrival time unit 22, configured to determine, if the processing mode of the streaming data is the event time processing mode, a latest arrival time of a window corresponding to the streaming data according to the event time corresponding to the streaming data and a preset maximum allowable event timeout time.
In this embodiment, as shown in fig. 12, the first arrival time unit 21 includes:
a timestamp subunit 211, configured to determine a timestamp corresponding to the streaming data according to the system time corresponding to the streaming data and the system entry delay time;
and an arrival time subunit 212, configured to determine, according to the timestamp corresponding to the streaming data and a preset maximum allowable event timeout time, a latest arrival time of a window corresponding to the streaming data.
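The calculations performed by the two arrival time units can be sketched as follows, using the differences recited in claims 6 and 7. The function and parameter names and the millisecond units are assumptions for illustration.

```python
def latest_arrival_system_time(
    system_time_ms: int, entry_delay_ms: int, max_timeout_ms: int
) -> int:
    """System time processing mode (first arrival time unit 21)."""
    # Timestamp subunit 211: timestamp = system time - system entry delay time
    timestamp_ms = system_time_ms - entry_delay_ms
    # Arrival time subunit 212: latest arrival = timestamp - maximum allowable event timeout
    return timestamp_ms - max_timeout_ms


def latest_arrival_event_time(event_time_ms: int, max_timeout_ms: int) -> int:
    """Event time processing mode (second arrival time unit 22)."""
    # Latest arrival = event time - maximum allowable event timeout
    return event_time_ms - max_timeout_ms
```

For example, a system time of 100000 ms with an entry delay of 2000 ms and a timeout of 5000 ms gives a latest arrival time of 93000 ms.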
Based on the same inventive concept as the streaming data processing method for the day-cut scene, the invention also provides a streaming data processing device for the day-cut scene. Because the device solves the problem on a principle similar to that of the method, the implementation of the device can refer to the implementation of the method, and repeated parts are not described again.
According to the invention, by shunting the streaming data in the day-cut scene, the problem of long-duration data disorder in the day-cut scene is solved, the accuracy of real-time data processing is improved, the timeliness and the usability of the data are improved, and the method has great practical value.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method when executing the program.
The present invention also provides a computer-readable storage medium storing a computer program for executing the above method.
As shown in fig. 13, the electronic device 600 may further include a communication module 110, an input unit 120, an audio processor 130, a display 160, and a power supply 170. It is noted that the electronic device 600 does not necessarily include all of the components shown in fig. 13; furthermore, the electronic device 600 may also include components not shown in fig. 13, for which reference may be made to the prior art.
As shown in fig. 13, the central processor 100, sometimes also referred to as a controller or operation control, may include a microprocessor or other processor device and/or logic device. The central processor 100 receives input and controls the operation of the various components of the electronic device 600.
The memory 140 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or other suitable device. It may store the information relating to the failure, as well as a program for processing that information. The central processor 100 may execute the program stored in the memory 140 to realize information storage or processing, etc.
The input unit 120 provides input to the central processor 100. The input unit 120 is, for example, a key or a touch input device. The power supply 170 is used to provide power to the electronic device 600. The display 160 is used to display an object to be displayed, such as an image or a character. The display may be, for example, an LCD display, but is not limited thereto.
The memory 140 may be a solid state memory such as a Read Only Memory (ROM), a Random Access Memory (RAM), a SIM card, or the like. It may also be a memory that retains information even when powered off, can be selectively erased, and can be supplied with more data, an example of which is sometimes called an EPROM or the like. The memory 140 may also be some other type of device. The memory 140 includes a buffer memory 141 (sometimes referred to as a buffer). The memory 140 may include an application/function storage section 142, which is used to store application programs and function programs, or a flow for the central processor 100 to execute the operation of the electronic device 600.
The memory 140 may also include a data store 143, the data store 143 for storing data, such as contacts, digital data, pictures, sounds, and/or any other data used by the electronic device. The driver storage portion 144 of the memory 140 may include various drivers of the electronic device for communication functions and/or for performing other functions of the electronic device (e.g., messaging application, address book application, etc.).
The communication module 110 is a transmitter/receiver 110 that transmits and receives signals via an antenna 111. The communication module (transmitter/receiver) 110 is coupled to the central processor 100 to provide an input signal and receive an output signal, which may be the same as in the case of a conventional mobile communication terminal.
Based on different communication technologies, a plurality of communication modules 110, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be provided in the same electronic device. The communication module (transmitter/receiver) 110 is also coupled to a speaker 131 and a microphone 132 via the audio processor 130 to provide audio output via the speaker 131 and receive audio input from the microphone 132, thereby implementing general telecommunications functions. The audio processor 130 may include any suitable buffers, decoders, amplifiers and so forth. In addition, the audio processor 130 is also coupled to the central processor 100, so that sound can be recorded locally through the microphone 132 and locally stored sound can be played through the speaker 131.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The principle and implementation of the invention are explained above using specific embodiments, and the description of the embodiments is only intended to help understand the method and the core idea of the invention. Meanwhile, a person skilled in the art may, following the idea of the invention, make changes to the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation to the present invention.

Claims (10)

1. A streaming data processing method for a daily cut scene is characterized by comprising the following steps:
acquiring streaming data, and performing data distribution processing on the streaming data according to the operation type of a driving system of the streaming data to determine a processing mode of the streaming data;
determining the latest arrival time of a window corresponding to the streaming data according to the processing mode of the streaming data;
and carrying out window aggregation operation on the latest arrival time of the window corresponding to the streaming data to generate an operation result, and carrying out data transmission processing on the operation result.
2. The method of claim 1, further comprising:
determining late data in the streaming data according to preset window time;
and according to the latest arrival time of the window corresponding to the late data, performing late data analysis processing and late data sending processing on the late data.
3. The method according to claim 1, wherein the streaming data is subjected to data splitting processing according to a driving system operation type of the streaming data, and determining a processing mode of the streaming data comprises:
if the operation type of the driving system of the streaming data is a system time type, determining that the processing mode of the streaming data is a system time processing mode;
and if the operation type of the driving system of the streaming data is the timestamp type, determining that the processing mode of the streaming data is an event time processing mode.
4. The method of claim 3, wherein the determining, according to the processing mode of the streaming data, the latest arrival time of the window corresponding to the streaming data comprises:
if the processing mode of the streaming data is a system time processing mode, determining the latest arrival time of a window corresponding to the streaming data according to the system time corresponding to the streaming data and the preset maximum allowable event timeout time;
and if the processing mode of the streaming data is an event time processing mode, determining the latest arrival time of a window corresponding to the streaming data according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time.
5. The method according to claim 4, wherein the determining the latest arrival time of the window corresponding to the streaming data according to the system time corresponding to the streaming data and a preset maximum allowable event timeout time comprises:
determining a timestamp corresponding to the streaming data according to the system time corresponding to the streaming data and the system entry delay time;
and determining the latest arrival time of the window corresponding to the streaming data according to the timestamp corresponding to the streaming data and the preset maximum allowable event timeout time.
6. The method according to claim 4, wherein the determining the latest arrival time of the window corresponding to the streaming data according to the system time corresponding to the streaming data and a preset maximum allowable event timeout time further comprises:
determining a difference value between the system time corresponding to the streaming data and the system entry delay time according to the system time corresponding to the streaming data and the system entry delay time, and taking the difference value as a timestamp corresponding to the streaming data;
and determining the difference between the timestamp and the maximum allowable event timeout time according to the timestamp and the preset maximum allowable event timeout time, and taking the difference as the latest arrival time of the window corresponding to the streaming data.
7. The method according to claim 4, wherein the determining the latest arrival time of the window corresponding to the streaming data according to the event time corresponding to the streaming data and a preset maximum allowable event timeout time comprises:
and determining the difference between the event time and the maximum allowable event timeout time according to the event time corresponding to the streaming data and the preset maximum allowable event timeout time, and taking the difference as the latest arrival time of the window corresponding to the streaming data.
8. A streaming data processing apparatus for a daily cut scene, the apparatus comprising:
the shunting processing module is used for acquiring streaming data, carrying out data distribution processing on the streaming data according to the operation type of a driving system of the streaming data and determining the processing mode of the streaming data;
the arrival time module is used for determining the latest arrival time of a window corresponding to the streaming data according to the processing mode of the streaming data;
and the data processing module is used for carrying out window aggregation operation on the latest arrival time of the window corresponding to the streaming data to generate an operation result and carrying out data transmission processing on the operation result.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that it stores a computer program for executing the method of any one of claims 1 to 7.
CN202210132708.7A 2022-02-14 2022-02-14 Stream data processing method and device for daily cut scene Pending CN114490297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210132708.7A CN114490297A (en) 2022-02-14 2022-02-14 Stream data processing method and device for daily cut scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210132708.7A CN114490297A (en) 2022-02-14 2022-02-14 Stream data processing method and device for daily cut scene

Publications (1)

Publication Number Publication Date
CN114490297A true CN114490297A (en) 2022-05-13

Family

ID=81480461

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210132708.7A Pending CN114490297A (en) 2022-02-14 2022-02-14 Stream data processing method and device for daily cut scene

Country Status (1)

Country Link
CN (1) CN114490297A (en)

Similar Documents

Publication Publication Date Title
CN111031058A (en) Websocket-based distributed server cluster interaction method and device
CN111897738B (en) Automatic testing method and device based on atomic service
US8892762B2 (en) Multi-granular stream processing
CN113766146B (en) Audio and video processing method and device, electronic equipment and storage medium
CN111222869A (en) Transaction data processing method, device, computer equipment and medium
CN113485952A (en) Data batch transmission method and device
CN113760611A (en) System site switching method and device, electronic equipment and storage medium
CN114490297A (en) Stream data processing method and device for daily cut scene
WO2020224242A1 (en) Blockchain data processing method and apparatus, server and storage medium
CN109446200B (en) Data processing method and device
JP7375089B2 (en) Method, device, computer readable storage medium and computer program for determining voice response speed
CN114416407B (en) Real-time data out-of-order repair system and method and computer equipment
CN114038465B (en) Voice processing method and device and electronic equipment
CN112785201B (en) Heterogeneous system quasi-real-time high-reliability interaction system and method
CN115391158A (en) Time delay determination method, system and device and electronic equipment
CN113515447B (en) Automatic testing method and device for system
CN113742004B (en) Data processing method and device based on flink framework
CN111767435A (en) User behavior analysis method and device
CN110445578B (en) SPI data transmission method and device
CN113645151A (en) DUP equipment message management method and device
CN114938353B (en) Asynchronous notification current limiting method and system based on stream computing
CN112396511A (en) Distributed wind control variable data processing method, device and system
CN112667631A (en) Method, device and equipment for automatically editing service field and storage medium
CN112799863A (en) Method and apparatus for outputting information
CN111752950B (en) Bank peripheral system and information synchronization method thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination