CN111367951A - Method and device for processing stream data - Google Patents

Info

Publication number
CN111367951A
Authority
CN
China
Prior art keywords
data
service data
latitude
preset
packet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010131762.0A
Other languages
Chinese (zh)
Inventor
康雪丹
姜黎明
王大飞
江旻
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
WeBank Co Ltd
Original Assignee
WeBank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WeBank Co Ltd
Priority to CN202010131762.0A
Publication of CN111367951A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24568 Data stream processing; Continuous queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Abstract

The invention discloses a method and a device for processing stream data. The method comprises: obtaining, from monitored stream data, the various types of service data that conform to a screening rule; for each type of service data, extracting the service data according to a preset structure of the service data to obtain service data of set latitudes; grouping the various types of service data of the set latitudes according to a preset grouping rule; and processing the service data of the set latitudes in each group according to the preset operator of that group. Because the service data is extracted according to its preset structure and, after grouping, is processed group by group with the preset operator of each group, the real-time computation is split into stages whose calculation logic is not excessively coupled, and the preset operators of each group can be reused by other calculation models, so that stream data is processed more efficiently.

Description

Method and device for processing stream data
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for processing stream data.
Background
In recent years, with the rapid development of information technology, data volume has grown rapidly. The processing capacity of a single computer is far from sufficient for massive data, which has driven the research and development of distributed systems. How to rapidly analyze mass data and extract useful information from it is currently a research hotspot in the field of distributed computing, and stream computing has emerged in response.
Unlike traditional data stored on disk or in memory, stream data has the following characteristics. Real-time: the data stream is generated in real time, and analysis results must be obtained in real time. Continuity: the data stream is generated continuously and flows indefinitely.
Stream computing is widely used because it suits these characteristics. Typical existing distributed stream computing frameworks include Storm, Spark Streaming, and Flink. These frameworks offer good real-time performance and fault tolerance in a distributed environment, but they are too tightly coupled to specific service scenarios, which increases development and maintenance costs. The stream computing logic is opaque to service personnel, and as the online operating conditions of a product change rapidly, every change to the computing logic must be redeveloped by developers. This hinders rapid service expansion and fails to meet service requirements, and the low code reuse of the framework wastes system resources. In a stream computing scenario, a general-purpose stream computing framework is therefore relatively heavyweight, tightly coupled, and low in heterogeneity.
Disclosure of Invention
The application provides a method and a device for processing stream data, which are used for solving the problem of how to conveniently and efficiently process stream data.
In a first aspect, an embodiment of the present application provides a method for stream data processing, including:
acquiring various types of service data that conform to a screening rule from monitored stream data;
for each type of service data, extracting the service data according to a preset structure of the service data to obtain service data of a set latitude, where the preset structure comprises at least one set latitude;
grouping the various types of service data of the set latitude according to a preset grouping rule;
and processing the service data of the set latitude in each group according to the preset operator of that group.
According to this scheme, the service data is extracted according to its preset structure to obtain service data of the set latitude and, after grouping, the service data of the set latitude in each group is processed according to the preset operator of that group. The real-time computation is thereby split into stages whose calculation logic is not excessively coupled, and the preset operators of each group can be reused and flexibly combined by other calculation models, so that stream data is processed more efficiently.
Optionally, the screening rule includes at least one of: a set data source, a set category of service data, and a set time window.
According to this scheme, data is screened by the set data source, the set category of service data, or the set time window, which unifies the data format and filters out useless data, so that the calculation is more efficient.
Optionally, the extracting the service data according to the preset structure of the service data to obtain the service data with the set latitude includes:
according to the preset structure of the service data, constructing a data matrix for the service data in the same time window; each service data corresponds to one row in the data matrix, and the same set latitude of each service data corresponds to one column in the data matrix.
According to the scheme, the screened data is constructed into the matrix, so that the data in the same column corresponds to the same set latitude, and the processing of the streaming data is more convenient and efficient.
Optionally, a latitude primary key of each group is set in the grouping rule, and the set latitude includes the latitude primary key;
grouping the various types of service data of the set latitude according to the preset grouping rule comprises:
obtaining, for the latitude primary key of each group, the service data of the set latitude of that group, where the service data in each group conforms to the schema of the data matrix.
According to this scheme, the data is grouped by the latitude primary key, which improves calculation efficiency and accuracy.
Optionally, the preset operator includes a latitude index and an operator for calculating the latitude index, and the set latitude includes the latitude index;
processing the service data of the set latitude in each group according to the preset operator of that group comprises:
calling the operator, according to the latitude index of the group, to process the service data of the set latitude in the group, so as to obtain the calculation result of the group for the latitude index.
According to this scheme, the calculation operators are abstracted and can be flexibly combined, configured, and reused by other calculation models, which provides the capacity to stream-process massive data.
Optionally, calling the operator to process the service data of the set latitude in the group comprises:
calling the operator to process the column data in the data matrix of the group.
Optionally, after processing the service data of the set latitude in the group, the method further comprises:
outputting the processed calculation result according to a preset output template.
According to this scheme, the preset output template interfaces the different stream data processing results with the database, so that the processing is more efficient.
In a second aspect, an embodiment of the present application provides an apparatus for stream data processing, where the apparatus includes:
an acquisition module, configured to acquire various types of service data that conform to a screening rule from monitored stream data;
a processing module, configured to extract, for each type of service data, the service data according to a preset structure of the service data to obtain service data of a set latitude, where the preset structure comprises at least one set latitude;
the processing module is further configured to group the various types of service data of the set latitude according to a preset grouping rule;
and the processing module is further configured to process the service data of the set latitude in each group according to the preset operator of that group.
Optionally, the screening rule includes at least one of: a set data source, a set category of service data, and a set time window.
Optionally, the processing module is specifically configured to:
construct, according to the preset structure of the service data, a data matrix from the service data in the same time window; each piece of service data corresponds to one row in the data matrix, and the same set latitude of each piece of service data corresponds to one column in the data matrix.
Optionally, a latitude primary key of each group is set in the grouping rule, and the set latitude includes the latitude primary key;
the processing module is specifically configured to:
obtain, for the latitude primary key of each group, the service data of the set latitude of that group, where the service data in each group conforms to the schema of the data matrix.
Optionally, the preset operator includes a latitude index and an operator for calculating the latitude index, and the set latitude includes the latitude index;
the processing module is specifically configured to:
call the operator, according to the latitude index of the group, to process the service data of the set latitude in the group, so as to obtain the calculation result of the group for the latitude index.
Optionally, the processing module is specifically configured to:
call the operator to process the column data in the data matrix of the group.
Optionally, the processing module is further configured to:
output, after the service data of the set latitude in the group is processed, the processed calculation result according to a preset output template.
Correspondingly, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the streaming data processing method according to the obtained program.
Accordingly, embodiments of the present invention also provide a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer-readable instructions are read and executed by a computer, the computer is caused to execute the above-mentioned method for processing streaming data.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a system architecture for a method for processing stream data according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of a method for processing stream data according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of a method for processing stream data according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a device for processing stream data according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to solve the problems in the prior art, an embodiment of the present invention provides a method for processing streaming data, and the method for processing streaming data provided in the embodiment of the present invention may be applied to a system architecture as shown in fig. 1, where the system architecture includes a streaming data collection device 100 and a service processing device 200.
The stream data acquisition device 100 sends the acquired stream data to the service processing device 200, and the service processing device 200 processes the stream data.
It should be noted that fig. 1 is only an example of a system architecture according to an embodiment of the present application, and the present application is not limited to this specifically.
Based on the system architecture illustrated in Fig. 1, Fig. 2 is a schematic flowchart of a method for processing stream data according to an embodiment of the present invention. The flow may be executed by a device for processing stream data, which may be the service processing device described above. As shown in Fig. 2, the method includes:
Step 201, acquiring various types of service data that conform to the screening rule from the monitored stream data.
Step 202, for each type of service data, extracting the service data according to a preset structure of the service data to obtain service data of the set latitude.
It should be noted that the preset structure includes at least one set latitude, where a latitude denotes a dimension (a field) of the service data.
Step 203, grouping the various types of service data of the set latitude according to a preset grouping rule.
Step 204, processing the service data of the set latitude in each group according to the preset operator of that group.
In a possible implementation, the method for stream data processing is performed based on a stream computation framework Spark Streaming.
Before the scheme of the present application is described in detail, Spark Streaming is briefly introduced:
Spark Streaming decomposes stream computation into a series of short batch jobs. The input data of Spark Streaming is divided into segments (a discretized stream) at a preset time interval (e.g., 1 second); that is, Spark Streaming receives data from a real-time data stream and divides it into small batches for processing by the downstream Spark engine.
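The micro-batching idea described above can be illustrated without Spark. The following is a minimal pure-Python sketch, not the Spark Streaming API: the function name `split_into_batches` and the `(timestamp, payload)` event shape are assumptions made for illustration.

```python
from collections import defaultdict


def split_into_batches(events, interval_seconds=1.0):
    """Group (timestamp, payload) events into consecutive fixed-length batches."""
    batches = defaultdict(list)
    for timestamp, payload in events:
        # Integer index of the interval that contains this timestamp.
        batch_index = int(timestamp // interval_seconds)
        batches[batch_index].append(payload)
    # Batches in time order; each one would be handed to the batch engine.
    return [batches[i] for i in sorted(batches)]


events = [(0.1, "a"), (0.7, "b"), (1.2, "c"), (2.5, "d"), (2.9, "e")]
print(split_into_batches(events))
```

With a one-second interval, the five events fall into three batches, each of which can then be processed as an ordinary small batch job.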
Based on this, in step 201, the following steps are first performed to acquire various types of service data that meet the screening rule.
S2011, a time window is set.
For example, the set time window may be every minute, every five minutes, every half hour, every day, every week, etc., which is not specifically limited in this application.
As another example, when the set time window is one minute, every 60 s of data forms a batch.
S2012, a data source is set.
It should be noted that the scheme of the present application supports extracting data from multiple data sources, such as RMB, Kafka, Flume, ZeroMQ, Kinesis, and the like.
S2013, the category of service data is set.
For example, WeChat loans, installment payments, and the like.
It should be noted that the above order is only one possible order; for example, S2013 may precede S2012, which is not specifically limited in this application.
As can be seen from the above, the stream data is screened, and the screening rule includes: a set data source, a set category of service data, and a set time window. Because the stream data is screened by time range and by content category, each small batch of stream data can subsequently be processed in a uniform and targeted way. The processing flow is described in detail below.
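The three parts of the screening rule can be combined into one predicate. The sketch below is illustrative only: the class `ScreeningRule` and the record keys `source`, `category`, `ts`, and `body` are hypothetical names, while the values RMB and WCD are taken from the example later in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningRule:
    data_source: str      # set data source, e.g. "RMB"
    category: str         # set category of service data, e.g. "WCD"
    window_start: float   # start of the time window in seconds (inclusive)
    window_length: float  # length of the time window in seconds

    def accepts(self, record: dict) -> bool:
        window_end = self.window_start + self.window_length
        in_window = self.window_start <= record["ts"] < window_end
        return (record["source"] == self.data_source
                and record["category"] == self.category
                and in_window)


def screen(records, rule):
    """Keep only the records that conform to the screening rule."""
    return [record for record in records if rule.accepts(record)]


records = [
    {"source": "RMB", "category": "WCD", "ts": 1.0, "body": "x"},
    {"source": "Kafka", "category": "WCD", "ts": 2.0, "body": "y"},
    {"source": "RMB", "category": "WCD", "ts": 7.0, "body": "z"},
]
rule = ScreeningRule("RMB", "WCD", window_start=0.0, window_length=5.0)
kept = screen(records, rule)
```

Here the second record is rejected because of its data source and the third because it falls outside the 5-second window.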
In step 202, according to the preset structure of the service data, a data matrix is constructed from the service data in the same time window.
It should be noted that each piece of service data corresponds to one row in the data matrix, and the same set latitude of each piece of service data corresponds to one column in the data matrix.
For example, the following two structures are defined:
The account opening structure is as follows:
Business scenario | Customer ID | Account opening status | Account opening time | Channel
Account opening   | ID_NO       | Success                | 2020-01-01           | Mobile phone
The borrowing structure is as follows:
Business scenario | Customer ID | Borrowing time | Borrowing amount | Borrowing status
Borrowing         | ID_NO       | 2020-01-01     | 100.0            | Success
For example, with a set time window of 5 seconds, all service data accumulated within the 5 seconds forms a data matrix as follows:
time1 | account opening data (account opening structure)
time2 | borrowing data (borrowing structure)
time3 | account opening data (account opening structure)
time4 | account opening data (account opening structure)
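One way to picture the row and column correspondence is the sketch below, which reads a set latitude as a named field of the preset structure. Apart from `BIZ_TYPE` and `ID_NO`, which appear in the description, the field names and helper functions are assumptions.

```python
# Preset structure for account opening: one column per set latitude.
# Only BIZ_TYPE and ID_NO are named in the text; the other names are hypothetical.
ACCOUNT_OPENING = ["BIZ_TYPE", "ID_NO", "OPEN_STATUS", "OPEN_TIME", "CHANNEL"]


def build_matrix(records, structure):
    """One row per piece of service data, one column per set latitude."""
    return [[record.get(name) for name in structure] for record in records]


def column(matrix, structure, name):
    """Return all values of one set latitude, i.e. one column of the matrix."""
    position = structure.index(name)
    return [row[position] for row in matrix]


records = [
    {"BIZ_TYPE": "OPEN", "ID_NO": "ID_1", "OPEN_STATUS": "SUCCESS",
     "OPEN_TIME": "2020-01-01", "CHANNEL": "mobile", "extra": 1},
    {"BIZ_TYPE": "OPEN", "ID_NO": "ID_2", "OPEN_STATUS": "SUCCESS",
     "OPEN_TIME": "2020-02-01", "CHANNEL": "mobile"},
]
matrix = build_matrix(records, ACCOUNT_OPENING)
```

Fields that are not part of the preset structure, such as `extra`, are dropped during extraction, so every row has the same layout.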
In the embodiment of the present application, before the data matrix is constructed, the filtered stream data is parsed in a preset manner, for example by splitting it on a separator.
In a possible implementation manner, after the various types of service data meeting the screening rule are obtained, preliminary data filtering may be performed.
After the extraction of the service data is completed, the value range and type of the input fields are checked by a query engine, for example an SQL component; stream data that does not satisfy the SQL condition or does not match the predefined type is filtered out directly.
For example, "BIZ_TYPE = 'loan' and ID_NO is not null" selects the data whose service scenario is loan and whose ID is not null, and the remaining data that does not meet the condition is filtered out.
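The SQL-based filtering can be tried out with any SQL engine. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the SQL component of the description; the table name `events`, the `AMOUNT` column, and the scenario value `loan` are assumptions.

```python
import sqlite3


def filter_with_sql(rows, condition):
    """Keep only the rows that satisfy the SQL condition."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (BIZ_TYPE TEXT, ID_NO TEXT, AMOUNT REAL)")
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    # In this sketch the condition comes from trusted configuration, not user input.
    query = f"SELECT BIZ_TYPE, ID_NO, AMOUNT FROM events WHERE {condition}"
    kept = conn.execute(query).fetchall()
    conn.close()
    return kept


rows = [("loan", "ID_2", 100.0), ("loan", None, 50.0), ("open", "ID_1", None)]
kept = filter_with_sql(rows, "BIZ_TYPE = 'loan' AND ID_NO IS NOT NULL")
print(kept)
```

Only the first row survives: the second has a null ID and the third belongs to a different service scenario.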
Based on this, the procedure for grouping the various types of service data of the set latitude in step 203 is described below.
In the embodiment of the application, a latitude primary key of each group is set in the grouping rule, and the set latitude includes the latitude primary key.
Based on this, in the embodiment of the present application, the service data of the set latitude of each group is obtained for the latitude primary key of that group.
Note that the service data within each group conforms to the schema of the data matrix.
Combining the above steps, the service groups are created after the data source and the category of service data have been selected.
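Grouping by a latitude primary key can be sketched as follows. The function name, the column names other than `BIZ_TYPE` and `ID_NO`, and the choice of a composite key are assumptions for illustration; the rows of every group keep the layout of the data matrix.

```python
from collections import defaultdict

# Hypothetical column names for the borrowing structure.
BORROWING = ["BIZ_TYPE", "ID_NO", "BORROW_TIME", "AMOUNT", "STATUS"]


def group_by_primary_key(matrix, structure, key_fields):
    """Split a data matrix into groups keyed by the chosen primary-key columns."""
    positions = [structure.index(name) for name in key_fields]
    groups = defaultdict(list)
    for row in matrix:
        key = tuple(row[p] for p in positions)
        groups[key].append(row)  # rows are stored unchanged, same schema as the matrix
    return dict(groups)


matrix = [
    ["loan", "ID_2", "2020-01-01", 100.0, "SUCCESS"],
    ["loan", "ID_3", "2020-02-01", 200.0, "SUCCESS"],
    ["loan", "ID_2", "2020-01-01", 200.0, "SUCCESS"],
]
groups = group_by_primary_key(matrix, BORROWING, ["BIZ_TYPE", "ID_NO"])
```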
Further, in step 204, the preset operator includes a latitude index and an operator for calculating the latitude index, and the set latitude includes the latitude index.
In the embodiment of the present application, the latitude index specifies the latitude by which a group of indexes is calculated, such as the customer latitude, the merchant latitude, or the mobile phone number latitude. The fields corresponding to the statistical latitude are selected to form the latitude index.
In the embodiment of the application, the latitude indexes are predefined and correspond one-to-one to index IDs.
For example, a latitude index CUST_PAY_SUCCESS is defined, which corresponds to the number of successful orders placed by a client; the latitude index is defined over "client ID" and "successful orders placed", and the operator that calculates the latitude index is SUM.
Specifically, according to the latitude index of the group, the operator is called to process the service data of the set latitude in the group, and the calculation result of the group for the latitude index is obtained.
Further, the operator is called to process the column data in the data matrix of the group.
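A latitude index can be modelled as an index ID, a column to read, and an operator to apply to that column. The sketch below assumes a hypothetical `ORDER_SUCCESS` column that holds 1 for a successful order and 0 otherwise, so that summing it yields the count behind CUST_PAY_SUCCESS; the class and function names are also illustrative.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class LatitudeIndex:
    index_id: str                           # e.g. "CUST_PAY_SUCCESS"
    column: str                             # column of the group matrix fed to the operator
    operator: Callable[[Sequence], object]  # e.g. the built-in sum


def evaluate(index, group_rows, structure):
    """Apply the operator of the index to one column of the group's data matrix."""
    position = structure.index(index.column)
    column_data = [row[position] for row in group_rows]
    return index.operator(column_data)


ORDERS = ["ID_NO", "ORDER_SUCCESS"]  # ORDER_SUCCESS: 1 = successful order, 0 = otherwise
group = [["ID_1", 1], ["ID_1", 0], ["ID_1", 1]]
cust_pay_success = LatitudeIndex("CUST_PAY_SUCCESS", "ORDER_SUCCESS", sum)
result = evaluate(cust_pay_success, group, ORDERS)
```

Because the operator is just a callable, the same definition can be reused with a different column or a different operator without changing `evaluate`.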
The above describes the specific grouping and calculation processes; the following describes how an index calculation is defined.
In the embodiment of the application, an index calculation definition comprises an index name and a calculation model.
Specifically, the index name part includes the index name and an index description.
In the embodiment of the application, when the calculation model is customized for a single index, a predefined calculation mode can be selected, mainly from the following: HBase store, HBase query, HBase deduplication, Spark SQL query, or
a general operator: statistical calculation (20 operators such as SUM/COUNT/DIS_COUNT/DETAIL_LIST/late), judgment calculation (>, <, etc.), and logical calculation (and, or, not, etc.).
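The three families of general operators can be kept in small registries and composed into one index. In the sketch below, the semantics given to `DIS_COUNT` (distinct count) and `DETAIL_LIST` (the raw values) are assumptions, since the description only names them, and the composed index is a hypothetical example.

```python
import operator

# Statistical operators; the meanings of DIS_COUNT and DETAIL_LIST are assumed.
STATISTICAL = {
    "SUM": sum,
    "COUNT": len,
    "DIS_COUNT": lambda values: len(set(values)),
    "DETAIL_LIST": list,
}
# Judgment (comparison) operators.
COMPARISON = {">": operator.gt, "<": operator.lt}
# Logical operators.
LOGICAL = {
    "and": lambda a, b: a and b,
    "or": lambda a, b: a or b,
    "not": lambda a: not a,
}


def high_value_repeat_borrower(amounts):
    """Hypothetical composed index: total borrowed > 250 and more than one loan."""
    total_is_high = COMPARISON[">"](STATISTICAL["SUM"](amounts), 250)
    is_repeat = COMPARISON[">"](STATISTICAL["COUNT"](amounts), 1)
    return LOGICAL["and"](total_is_high, is_repeat)
```

For instance, `high_value_repeat_borrower([100.0, 200.0])` evaluates a statistical step, two judgment steps, and one logical step, all selected by name from configuration-like tables.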
In order to better explain the invention, a specific example is described below in connection with fig. 3.
As shown in fig. 3:
First, a data source RMB is monitored. It contains a plurality of events, such as input1, input2, ..., inputN in Fig. 3, which form a dynamic event stream.
In the embodiment of the application, the configuration is cached and loaded on a timer: the configuration information is reloaded every 5 minutes and is loaded according to the event ID. For example, the event ID is RMB_WCD_LOAN, where RMB is the set data source, WCD is the set service data category, and LOAN is the specific service scenario.
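A cache that refreshes configuration per event ID might look like the sketch below. It reloads lazily on access once an entry is older than five minutes, which is a simplification of a timer-driven reload; the class `ConfigCache`, the loader callable, and the three-part reading of the event ID are assumptions.

```python
import time


class ConfigCache:
    """Caches configuration per event ID and reloads it once it is older than the TTL."""

    def __init__(self, loader, ttl_seconds=300.0, clock=time.monotonic):
        self._loader = loader   # callable: event_id -> configuration
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}      # event_id -> (loaded_at, configuration)

    def get(self, event_id):
        now = self._clock()
        entry = self._entries.get(event_id)
        if entry is None or now - entry[0] >= self._ttl:
            entry = (now, self._loader(event_id))
            self._entries[event_id] = entry
        return entry[1]


def parse_event_id(event_id):
    """Split an event ID into data source, service data category, and scenario."""
    source, category, scenario = event_id.split("_", 2)
    return {"source": source, "category": category, "scenario": scenario}
```

Passing `parse_event_id` as the loader gives a cache whose entries are derived from the event ID itself; a real loader would read the configuration from wherever it is stored.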
Specifically, all events of product A are reported to the data source, including operations such as login, account opening, borrowing, and loan disbursement.
In the embodiment of the present application, service scenarios, such as the aforementioned LOAN, are distinguished according to the BIZ_TYPE keyword, and a data structure is defined for each single service scenario. Specifically, the following two structures correspond to struct1, struct2, ..., structN in Fig. 3:
The account opening structure is as follows:
Business scenario | Customer ID | Account opening status | Account opening time | Channel
Account opening   | ID_NO       | Success                | 2020-01-01           | Mobile phone
The borrowing structure is as follows:
Business scenario | Customer ID | Borrowing time | Borrowing amount | Borrowing status
Borrowing         | ID_NO       | 2020-01-01     | 100.0            | Success
Then, the time window of the data source is set to 5 seconds, and all service data accumulated within the 5 seconds forms a data matrix.
Specifically:
Account opening | ID_1 | Success    | 2020-01-01 | Mobile phone
Borrowing       | ID_2 | 2020-01-01 | 100.0      | Success
Borrowing       | ID_3 | 2020-02-01 | 200.0      | Success
Borrowing       | ID_2 | 2020-01-01 | 200.0      | Success
Account opening | ID_2 | Success    | 2020-02-01 | Mobile phone
Further, once a time window has accumulated a batch of data, the following processing is performed.
S301, the corresponding configuration is loaded according to the latitude primary key (BIZ_TYPE), and the batch of data (5 s) is grouped; different groups are obtained according to the latitude primary key, such as the data groups shown in Fig. 3, specifically as follows:
Group one: the latitude primary key is a borrowing event with client ID ID_2;
the latitude indexes are the latest borrowing time of the client, the borrowing amount of the client, and the number of loans of the client.
Group two: the latitude primary key is a borrowing event with client ID ID_3;
the latitude indexes are the latest borrowing time of the client, the borrowing amount of the client, and the number of loans of the client.
As can be seen from the above, the grouping condition is that the event is a loan and the borrowing status is success, and the borrowing success indexes are counted per client ID.
S302, according to the configuration and the data flow, service grouping information is obtained in real time, including the service date, the latitude primary key, and the latitude indexes.
Based on the above, the obtained service grouping information is as follows:
Figure BDA0002395966510000111
After the grouping is completed, Group1, Group2, ..., GroupN in Fig. 3 are formed. Group one is as follows:
Borrowing | ID_2 | 2020-01-01 | 100.0 | Success
Borrowing | ID_2 | 2020-01-01 | 200.0 | Success
Meanwhile, group two is as follows:
Borrowing | ID_3 | 2020-02-01 | 200.0 | Success
It should be noted that, in the embodiment of the present application, different groups are computed in parallel, while the indexes within the same group are computed serially; one latitude index corresponds to one or more operators, such as operator 1, operator 2, ..., operator N in Fig. 3.
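The split between parallel groups and serial indexes can be sketched with a thread pool, using the two groups of the example above. The index names, the use of `ThreadPoolExecutor`, and the mapping of each index to a single built-in operator are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Each index is (name, column position, operator). Columns follow the borrowing
# structure: scenario, customer ID, borrowing time, borrowing amount, status.
INDEXES = [
    ("latest_borrow_time", 2, max),
    ("borrow_amount", 3, sum),
    ("borrow_count", 3, len),
]


def compute_group(rows):
    """Compute every index of one group, one after another (serial)."""
    results = {}
    for name, position, op in INDEXES:
        results[name] = op([row[position] for row in rows])
    return results


def compute_all(groups):
    """Groups are independent of each other, so they are computed in parallel."""
    keys = list(groups)
    with ThreadPoolExecutor() as pool:
        per_group = pool.map(compute_group, (groups[key] for key in keys))
        return dict(zip(keys, per_group))


groups = {
    "ID_2": [["loan", "ID_2", "2020-01-01", 100.0, "Success"],
             ["loan", "ID_2", "2020-01-01", 200.0, "Success"]],
    "ID_3": [["loan", "ID_3", "2020-02-01", 200.0, "Success"]],
}
results = compute_all(groups)
```

For client ID_2 this yields a borrowing amount of 300.0 over two loans, and for client ID_3 a borrowing amount of 200.0 over one loan. Taking `max` over ISO-formatted date strings returns the latest date because such strings sort chronologically.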
In the above, the latitude index includes a plurality of single index calculations, and the configuration of the single index calculation is briefly described as follows:
Figure BDA0002395966510000112
It should be noted that the calculation range is a period range that supports minutes/hours/days/weeks/months/years; the period range is a numerical range on which a range check is performed.
The operators in the calculation model may be real-time framework intermediate operators: HBase store, HBase query, HBase deduplication, Spark SQL query, or
general operators: statistical calculation (20 operators such as SUM/COUNT/DIS_COUNT/DETAIL_LIST/late), judgment calculation (>, <, etc.), and logical calculation (and, or, not, etc.).
In the embodiment of the application, after the service data of the set latitude in the group is processed, the processed calculation result is output according to a preset output template.
It should be noted that the present scheme can associate multiple static data sources and tables, and supports the following two output modes:
Timed output: a timer triggers the output and updates the index table, for example by querying and outputting the indexes every five minutes.
Immediate output: event-driven; indexes are synchronized immediately after they are updated, and are written directly to the index library as soon as a single group finishes.
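The two output modes differ only in when results reach the index library. The sketch below uses an in-memory dictionary as a stand-in for that library; all class and method names are hypothetical, and the timer is driven by explicit `on_tick` calls so that the example stays deterministic.

```python
class IndexStore:
    """Stand-in for the index library; maps (group key, index ID) to a value."""

    def __init__(self):
        self.table = {}

    def upsert(self, updates):
        self.table.update(updates)


class ImmediateOutput:
    """Event-driven: results are written as soon as a single group finishes."""

    def __init__(self, store):
        self.store = store

    def on_group_done(self, updates):
        self.store.upsert(updates)


class TimedOutput:
    """Timer-driven: results are buffered and written once per interval."""

    def __init__(self, store, interval_seconds=300.0):
        self.store = store
        self.interval = interval_seconds
        self.pending = {}
        self.last_flush = 0.0

    def on_group_done(self, updates):
        self.pending.update(updates)  # buffer until the timer fires

    def on_tick(self, now):
        if now - self.last_flush >= self.interval:
            self.store.upsert(self.pending)
            self.pending = {}
            self.last_flush = now
```

Immediate output keeps the index library fresh at the cost of one write per group, while timed output batches writes and accepts results that are up to one interval old.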
Based on the same inventive concept, Fig. 4 exemplarily illustrates a device for processing stream data provided by an embodiment of the present invention, and the device can execute the flow of the method for processing stream data.
The apparatus for stream data processing includes:
an obtaining module 401, configured to obtain various types of service data that conform to a screening rule from monitored stream data;
a processing module 402, configured to extract, for each type of service data, the service data according to a preset structure of the service data to obtain service data of a set latitude, where the preset structure comprises at least one set latitude;
group the various types of service data of the set latitude according to a preset grouping rule;
and process the service data of the set latitude in each group according to the preset operator of that group.
Optionally, the screening rule includes at least one of: a set data source, a set category of service data, and a set time window.
Optionally, the processing module 402 is specifically configured to:
construct, according to the preset structure of the service data, a data matrix from the service data in the same time window; each piece of service data corresponds to one row in the data matrix, and the same set latitude of each piece of service data corresponds to one column in the data matrix.
Optionally, a latitude primary key of each group is set in the grouping rule, and the set latitude includes the latitude primary key;
the processing module 402 is specifically configured to:
obtain, for the latitude primary key of each group, the service data of the set latitude of that group, where the service data in each group conforms to the schema of the data matrix.
Optionally, the preset operator includes a latitude index and an operator for calculating the latitude index, and the set latitude includes the latitude index;
the processing module 402 is specifically configured to:
call the operator, according to the latitude index of the group, to process the service data of the set latitude in the group, so as to obtain the calculation result of the group for the latitude index.
Optionally, the processing module 402 is specifically configured to:
call the operator to process the column data in the data matrix of the group.
Optionally, the processing module 402 is further configured to:
output, after the service data of the set latitude in the group is processed, the processed calculation result according to a preset output template.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the streaming data processing method according to the obtained program.
Based on the same inventive concept, the embodiment of the present invention also provides a computer-readable non-volatile storage medium, which includes computer-readable instructions, and when the computer reads and executes the computer-readable instructions, the computer is enabled to execute the method for processing the stream data.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method of stream data processing, comprising:
acquiring various service data which accord with the screening rule from the monitored stream data;
for each type of service data, extracting the service data according to a preset structure of the service data to obtain service data of a set latitude; the preset structure comprises at least one set latitude;
grouping the service data of the set latitude according to a preset grouping rule;
and processing the service data of the set latitude in each group according to a preset operator of that group.
2. The method of claim 1, wherein the screening rule comprises at least one of: a set data source, a set category of service data, and a set time window.
3. The method of claim 1, wherein the extracting the service data according to the preset structure of the service data to obtain the service data of the set latitude comprises:
according to the preset structure of the service data, constructing a data matrix for the service data in the same time window; each service data corresponds to one row in the data matrix, and the same set latitude of each service data corresponds to one column in the data matrix.
4. The method of claim 1, wherein the grouping rule sets a latitude main key for each group, and the set latitude comprises the latitude main key;
grouping the service data of the set latitude according to the preset grouping rule comprises:
obtaining, for the latitude main key of each group, the service data of the set latitude of that group, wherein the service data in each group conforms to the pattern of the data matrix.
5. The method of any one of claims 1 to 4, wherein the preset operator comprises a latitude index and an operator for calculating the latitude index, and the set latitude comprises the latitude index;
processing the service data of the set latitude in each group according to the preset operator of that group comprises:
calling the operator to process the service data of the set latitude in the group according to the latitude index of the group, to obtain the calculation result of the group for the latitude index.
6. The method of claim 5, wherein calling the operator to process the service data of the set latitude in the group comprises:
calling the operator to process the column data in the data matrix of the group.
7. The method of claim 5, further comprising, after processing the service data of the set latitude in the group:
outputting the calculation result according to a preset output template.
8. An apparatus for stream data processing, comprising:
an acquisition module, configured to acquire various service data which accord with the screening rule from the monitored stream data;
a processing module, configured to, for each type of service data, extract the service data according to a preset structure of the service data to obtain service data of a set latitude; the preset structure comprises at least one set latitude;
the processing module is further configured to group the service data of the set latitude according to a preset grouping rule;
and the processing module is further configured to process the service data of the set latitude in each group according to a preset operator of that group.
9. A computing device, comprising:
a memory for storing program instructions;
a processor for calling program instructions stored in said memory to perform the method of any of claims 1 to 7 in accordance with the obtained program.
10. A computer-readable non-transitory storage medium including computer-readable instructions which, when read and executed by a computer, cause the computer to perform the method of any one of claims 1 to 7.
CN202010131762.0A 2020-02-29 2020-02-29 Method and device for processing stream data Pending CN111367951A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010131762.0A CN111367951A (en) 2020-02-29 2020-02-29 Method and device for processing stream data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010131762.0A CN111367951A (en) 2020-02-29 2020-02-29 Method and device for processing stream data

Publications (1)

Publication Number Publication Date
CN111367951A true CN111367951A (en) 2020-07-03

Family

ID=71208447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010131762.0A Pending CN111367951A (en) 2020-02-29 2020-02-29 Method and device for processing stream data

Country Status (1)

Country Link
CN (1) CN111367951A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111935226A (en) * 2020-07-08 2020-11-13 上海微亿智造科技有限公司 Method and system for realizing streaming computing by supporting industrial data
CN112150273A (en) * 2020-09-24 2020-12-29 中国农业银行股份有限公司 System, method, apparatus and storage medium for processing online credit service
CN112328597A (en) * 2020-11-06 2021-02-05 北京航云物联信息技术有限公司 Flow calculation method and device based on table
CN113360564A (en) * 2021-07-12 2021-09-07 杭州安恒信息技术股份有限公司 ETL-based data stream processing method, system, device and readable storage medium
CN115080156A (en) * 2022-08-23 2022-09-20 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111935226A (en) * 2020-07-08 2020-11-13 上海微亿智造科技有限公司 Method and system for realizing streaming computing by supporting industrial data
CN111935226B (en) * 2020-07-08 2021-06-08 上海微亿智造科技有限公司 Method and system for realizing streaming computing by supporting industrial data
CN112150273A (en) * 2020-09-24 2020-12-29 中国农业银行股份有限公司 System, method, apparatus and storage medium for processing online credit service
CN112328597A (en) * 2020-11-06 2021-02-05 北京航云物联信息技术有限公司 Flow calculation method and device based on table
CN113360564A (en) * 2021-07-12 2021-09-07 杭州安恒信息技术股份有限公司 ETL-based data stream processing method, system, device and readable storage medium
CN115080156A (en) * 2022-08-23 2022-09-20 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation
CN115080156B (en) * 2022-08-23 2022-11-11 卓望数码技术(深圳)有限公司 Flow-batch-integration-based optimized calculation method and device for big data batch calculation

Similar Documents

Publication Publication Date Title
CN111367951A (en) Method and device for processing stream data
CN110750650A (en) Construction method and device of enterprise knowledge graph
CN106339274A (en) Method and system for obtaining data snapshot
CN107247811B (en) SQL statement performance optimization method and device based on Oracle database
CN108664635B (en) Method, device, equipment and storage medium for acquiring database statistical information
CN113360554B (en) Method and equipment for extracting, converting and loading ETL (extract transform load) data
CN106055630A (en) Log storage method and device
CN108875077B (en) Column storage method and device of database, server and storage medium
CN104424018A (en) Distributed calculating transaction processing method and device
CN108389394B (en) Method and system for analyzing initial city entry of vehicle
CN110347724A (en) Abnormal behaviour recognition methods, device, electronic equipment and medium
CN106844320B (en) Financial statement integration method and equipment
CN108521588A (en) A kind of main broadcaster's arrangement method and system based on time slicing, server and storage medium
CN113242159A (en) Application access relation determining method and device
CN111324781A (en) Data analysis method, device and equipment
CN106557483B (en) Data processing method, data query method, data processing equipment and data query equipment
WO2020259155A1 (en) Method and apparatus for generating alarm data report
CN110737727B (en) Data processing method and system
CN116644136A (en) Data acquisition method, device, equipment and medium for increment and full data
CN115470279A (en) Data source conversion method, device, equipment and medium based on enterprise data
CN110704407A (en) Data deduplication method and system
CN114722045A (en) Time series data storage method and device
CN111198884B (en) Method and system for processing information of first entering city of vehicle
CN114385188A (en) Code workload statistical method and device and electronic equipment
CN115481160A (en) Stream data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination