CN116610736A - Data processing method and device, electronic equipment and storage medium - Google Patents

Publication number
CN116610736A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310585124.XA
Other languages
Chinese (zh)
Inventor
王兵
韩硕
路绪海
余珍聪
郭望纾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Corp Shanghai Research Institute
China Telecom Corp Ltd
Original Assignee
China Telecom Corp Shanghai Research Institute
China Telecom Corp Ltd
Application filed by China Telecom Corp Shanghai Research Institute and China Telecom Corp Ltd
Priority to CN202310585124.XA
Publication of CN116610736A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/542 Event management; Broadcasting; Multicasting; Notifications
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiments of the application provide a data processing method, a device, electronic equipment and a storage medium. The method comprises the following steps: obtaining data to be processed, wherein the data to be processed comprises a log data stream and a broadcast state stream, and the log data stream comprises operation data corresponding to each object; if the broadcast state stream characterizes a newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format to a preset rule engine, wherein the preset format is the format corresponding to the preset rule engine; and, for each available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object conforms to the available rule. According to the scheme, rules are added dynamically during data processing, and the rule management process is decoupled from the log data stream processing process, which facilitates both data processing and later rule maintenance.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The embodiment of the application relates to the technical field of big data, in particular to a data processing method, a device, electronic equipment and a storage medium.
Background
With the development of big data technology, data processing engines are widely applied to data processing. For example, in a production environment, Flink (an open source stream processing framework) is adopted to perform basic extraction, transformation and loading (ETL) on a running log, and data meeting the conditions is screened out according to rules.
In the related art, rules are predefined in a data processing engine to perform data screening. Again taking Flink as an example, rules are predefined by complex event processing (Complex Event Processing, CEP), and after Flink is started, data screening is performed using the rules.
However, in the above manner the rules cannot be changed after the data processing engine is started, so it is difficult to meet the requirement of dynamically using rules in the production environment.
Disclosure of Invention
The embodiment of the application provides a data processing method, a data processing device, electronic equipment and a storage medium, which are used for dynamically adding rules in data processing.
In a first aspect, an embodiment of the present application provides a first data processing method, applied to a task node in a data processing engine, where the method includes:
Obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
if the broadcast state stream represents the newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format into a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
and, for each available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object conforms to the available rule.
According to the above scheme, rules need not be predefined in the data processing engine. If a new rule is required after the data processing engine has started, the new rule can be triggered in real time through an external system; after the data processing engine processes the new rule into a broadcast state stream, the task node dynamically converts the format of the rule and adds it to the preset rule engine, where it can be invoked. In addition, because rules are managed by the preset rule engine, the rule management process is decoupled from the log data stream processing process, which facilitates later rule maintenance, lets the data processing engine concentrate on data processing, and improves system operation efficiency.
In some alternative embodiments, the method further comprises:
for each available rule in the preset rule engine, determining the number of times that the operation data corresponding to each object conforms to the available rule within a preset time length;
and if the number of times that an object conforms to the available rule is greater than the preset number of times corresponding to the available rule, determining that the object meets the aggregation condition of the available rule.
According to the above scheme, a preset number of times is set for each rule. After determining whether each piece of operation data conforms to an available rule, the number of times that the operation data corresponding to each object conforms to the available rule within the preset time length can be counted and compared with the preset number of times; if the count is greater than the preset number of times, the object meets the aggregation condition of the corresponding available rule. The matching requirements of complex rules are thus met through the cooperation of the preset rule engine and the data processing engine.
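By way of a non-limiting illustration, the counting described above can be sketched in plain Python. The class name `AggregationCounter`, its parameters, and the use of second-based timestamps that arrive in non-decreasing order are assumptions made for this sketch only; the application does not prescribe them, and in practice the windowing would be handled by the data processing engine.

```python
from collections import defaultdict, deque


class AggregationCounter:
    """Hypothetical helper: counts rule matches per (object, rule) in a time window."""

    def __init__(self, window_seconds, preset_times):
        self.window_seconds = window_seconds  # the "preset time length"
        self.preset_times = preset_times      # rule_id -> "preset number of times"
        self._hits = defaultdict(deque)       # (object_id, rule_id) -> match timestamps

    def record_match(self, object_id, rule_id, timestamp):
        """Record one match; return True if the aggregation condition now holds."""
        hits = self._hits[(object_id, rule_id)]
        hits.append(timestamp)
        # Discard matches that fall outside the window ending at `timestamp`.
        while hits and hits[0] <= timestamp - self.window_seconds:
            hits.popleft()
        # The condition requires strictly more matches than the preset number.
        return len(hits) > self.preset_times[rule_id]
```

With a 60-second window and a preset number of 2, the third match of the same object within the window is the first one that satisfies the aggregation condition.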
In some alternative embodiments, the log data stream is obtained by:
acquiring a running log from the topic characterizing logs;
and based on the identification of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
According to the above scheme, because rules can be newly added in real time through the external system, the data sources include not only the conventional running log but also rule data; the running log and the rule data are therefore distinguished by topic. The task node can directly obtain the running log from the topic characterizing logs, identify each object (each host or virtual machine) from the running log, and group the data in the running log by object, so as to obtain the operation data corresponding to each object.
In some alternative embodiments, the broadcast state stream is obtained by:
receiving the broadcast state stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting rule data after receiving the rule data, and the rule data is obtained by a reading node in the data processing engine from the topic characterizing rules.
According to the above scheme, because rules can be newly added in real time through the external system, the data sources include not only the conventional running log but also rule data; the running log and the rule data are therefore distinguished by topic. The task nodes are distributed nodes and cannot directly obtain rule data from the topic characterizing rules; therefore the reading node obtains the rule data from the topic characterizing rules, the broadcast node converts the rule data into a broadcast state stream and broadcasts it, and the task node can then receive the broadcast state stream.
In some optional embodiments, before matching the operation data corresponding to each object with the available rule, the method further includes:
determining key fields in the operation data corresponding to each object;
matching the operation data corresponding to each object with the available rule respectively, wherein the matching comprises the following steps:
and matching information corresponding to key fields in the operation data corresponding to the objects with the available rules.
According to the above scheme, the operation data may contain much useless information (information not needed when matching rules). The key fields in the operation data corresponding to each object are therefore determined before rule matching, so that rule matching can be performed accurately and efficiently based on the information corresponding to the key fields.
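Illustratively, determining the key fields can be pictured as projecting each record onto the fields that the rules actually use. The following is a minimal sketch under assumed names: the function `extract_key_fields` and the record layout are hypothetical and are not taken from the application.

```python
def extract_key_fields(record, key_fields):
    """Keep only the fields needed for rule matching.

    Fields listed in `key_fields` but absent from `record` are skipped.
    """
    return {name: record[name] for name in key_fields if name in record}


# Hypothetical operation data: only `host` and `cpu_usage` matter to the rules.
raw = {"host": "A", "cpu_usage": 91.5, "message": "heartbeat ok", "pid": 123}
slim = extract_key_fields(raw, ["host", "cpu_usage"])
```

Matching against the projected record avoids carrying the unused fields through the rule engine.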
In some alternative embodiments, the method further comprises:
if the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
and determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
According to the above scheme, practical applications require not only adding rules but also deleting them. If the broadcast state stream characterizes a deletion rule (that is, an instruction to delete a rule, triggered by the external system, has been received), the rule identifier in the broadcast state stream is determined (indicating which rule is to be deleted); the corresponding rule is then located in the preset rule engine based on the rule identifier and marked as an unavailable rule, so that it is no longer matched. Because the rule is marked as unavailable rather than deleted directly from the preset rule engine, relevant personnel can still query it or restore it for use, which better suits the production environment.
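As a hedged illustration of marking a rule unavailable instead of removing it, the sketch below uses a small in-memory registry as a stand-in for the preset rule engine. `RuleRegistry` and its methods are hypothetical names chosen for this example; they are not the API of Drools or of any other rule engine.

```python
class RuleRegistry:
    """Hypothetical stand-in for a rule engine that supports soft deletion."""

    def __init__(self):
        self._rules = {}  # rule_id -> {"rule": ..., "available": bool}

    def add(self, rule_id, rule):
        self._rules[rule_id] = {"rule": rule, "available": True}

    def _set_available(self, rule_id, available):
        entry = self._rules.get(rule_id)
        if entry is None:
            return False  # unknown rule identifier: nothing to change
        entry["available"] = available
        return True

    def mark_unavailable(self, rule_id):
        """Soft delete: the rule stays stored but is no longer matched."""
        return self._set_available(rule_id, False)

    def restore(self, rule_id):
        """Make a previously unavailable rule available again."""
        return self._set_available(rule_id, True)

    def available_rules(self):
        return {rid: e["rule"] for rid, e in self._rules.items() if e["available"]}

    def lookup(self, rule_id):
        """Query a rule regardless of its availability."""
        return self._rules[rule_id]["rule"]
```

Only `available_rules()` feeds the matching step, while `lookup` and `restore` cover the query and recovery cases mentioned above.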
In a second aspect, an embodiment of the present application provides a first data processing apparatus, applied to a task node in a data processing engine, the apparatus including:
the data acquisition module is used for acquiring data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
the broadcast stream processing module is used for converting the rule in the broadcast state stream into a rule in a preset format if the broadcast state stream represents the newly added rule, and adding the rule in the preset format into a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
the log stream processing module is used for, for each available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object conforms to the available rule.
In some alternative embodiments, the apparatus further comprises an aggregation determination module, configured to:
for each available rule in the preset rule engine, determine the number of times that the operation data corresponding to each object conforms to the available rule within a preset time length;
and if the number of times that an object conforms to the available rule is greater than the preset number of times corresponding to the available rule, determine that the object meets the aggregation condition of the available rule.
In some optional embodiments, the data obtaining module is configured to obtain the log data stream by:
acquiring a running log from the topic characterizing logs;
and based on the identification of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
In some optional embodiments, the data acquisition module is configured to obtain the broadcast state stream by:
receiving the broadcast state stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting rule data after receiving the rule data, and the rule data is obtained by a reading node in the data processing engine from the topic characterizing rules.
In some optional embodiments, before matching the operation data corresponding to each object with the available rule, the log stream processing module is further configured to:
determine key fields in the operation data corresponding to each object;
the log stream processing module is specifically configured to:
and matching information corresponding to key fields in the operation data corresponding to the objects with the available rules.
In some alternative embodiments, the broadcast stream processing module is further configured to:
if the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
and determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
In a third aspect, an embodiment of the present application provides an electronic device, including at least one processor and at least one memory, where the memory stores a computer program, and when the program is executed by the processor, causes the processor to execute the data processing method according to any one of the first aspects.
In a fourth aspect, an embodiment of the present application provides a computer readable storage medium storing a computer program executable by a processor, the program when run on the processor causing the processor to perform the data processing method of any one of the first aspects.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a first application scenario provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of a second application scenario provided in an embodiment of the present application;
FIG. 3 is a flowchart illustrating a first data processing method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of topic classification according to an embodiment of the present application;
FIG. 5 is a block diagram of a data processing engine according to an embodiment of the present application;
FIG. 6 is a flowchart of a second data processing method according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating a third data processing method according to an embodiment of the present application;
FIG. 8 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," and the like, are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature. In the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.
In the description of the present application, it should be noted that, unless explicitly stated and limited otherwise, the term "connected" should be interpreted broadly, and for example, it may be directly connected, or it may be indirectly connected through an intermediate medium, or it may be communication between two devices. The specific meaning of the above terms in the present application can be understood by those of ordinary skill in the art according to the specific circumstances.
With the development of big data technology, data processing engines are widely applied to data processing. For example, in a production environment, Flink is adopted to perform basic ETL on a running log, and data meeting the conditions is screened out according to rules.
In the related art, rules are predefined in a data processing engine to perform data screening.
Referring to fig. 1, rules are predefined by Flink CEP; after Flink is started, no rules can be added, and data screening is performed using the predefined rules.
However, in the above manner the rules cannot be changed after the data processing engine is started, so it is difficult to meet the requirement of dynamically using rules in the production environment; moreover, the rules are strongly coupled with data processing, which makes rule management inconvenient.
In view of this, the embodiments of the present application provide a data processing method, apparatus, electronic device, and storage medium, which are used to dynamically add rules in data processing.
Referring to fig. 2, an application scenario provided in an embodiment of the present application includes: the data processing engine and the preset rule engine;
task nodes in the data processing engine are used for obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
if the broadcast state stream characterizes a newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format to the preset rule engine; wherein the preset format is the format corresponding to the preset rule engine;
and, for each available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object conforms to the available rule.
The preset rule engine is used for managing rules.
Illustratively, the data processing engine may be Flink, and the preset rule engine may be Drools (an open source business rule engine).
According to the above scheme, rules need not be predefined in the data processing engine. If a new rule is required after the data processing engine has started, the new rule can be triggered in real time through an external system; after the data processing engine processes the new rule into a broadcast state stream, the task node dynamically converts the format of the rule and adds it to the preset rule engine, where it can be invoked. In addition, because rules are managed by the preset rule engine, the rule management process is decoupled from the log data stream processing process, which facilitates later rule maintenance, lets the data processing engine concentrate on data processing, and improves system operation efficiency.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems with reference to the drawings and specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 3 is a flow chart of a first data processing method according to an embodiment of the present application, which is applied to task nodes in a data processing engine, as shown in fig. 3, and includes the following steps:
step S301: obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object.
In implementation, the data processing engine is provided with distributed task nodes (tasks). A task node connects (connect) the log data stream and the broadcast state stream to obtain the data to be processed (a connected stream); the broadcast state stream is then processed by a broadcast state stream processing method (public void processBroadcastElement), and the log data stream is processed by a log data stream processing method (public void processElement).
It can be understood that the above connection does not merge the two data streams. Rather, when a new rule needs to be added after the data processing engine is started, the two data streams are obtained and placed in one space (connect), the process method is called, and the two core methods (the broadcast state stream processing method and the log data stream processing method) are implemented.
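The division of labor between the two processing methods can be illustrated with a small plain-Python imitation. This sketch only mirrors the shape of the two callbacks; it does not use or reproduce the Flink API, and the names `RuleMatchingFunction`, `run_connected`, and the event layout are assumptions made for illustration.

```python
class RuleMatchingFunction:
    """Hypothetical two-sided handler: one method per input stream."""

    def __init__(self):
        self.rules = {}  # rule_id -> predicate over a log record

    def process_broadcast_element(self, rule_event):
        # Rule side: invoked for each element of the broadcast state stream.
        if rule_event["op"] == "add":
            self.rules[rule_event["id"]] = rule_event["predicate"]

    def process_element(self, record):
        # Log side: invoked for each element of the log data stream.
        return [rid for rid, pred in self.rules.items() if pred(record)]


def run_connected(events, function):
    """Dispatch interleaved (side, payload) events without merging the streams."""
    results = []
    for side, payload in events:
        if side == "broadcast":
            function.process_broadcast_element(payload)
        else:
            results.append((payload, function.process_element(payload)))
    return results
```

A log record that arrives before a rule is broadcast matches nothing, whereas the same record arriving afterwards is matched against the newly added rule, which is the dynamic behavior the scheme aims at.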
Step S302: if the broadcast state stream represents the newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format into a preset rule engine.
The preset format is a format corresponding to the preset rule engine.
As described above, this embodiment processes the broadcast state stream by the broadcast state stream processing method. If the broadcast state stream characterizes a newly added rule, this indicates that a new rule needs to be added. After the rule has been processed into the broadcast state stream, the task node converts the format of the rule in the broadcast state stream, and the rule, now in the format required by the preset rule engine, is dynamically added to the preset rule engine, where it can be invoked.
That is, the preset rule engine has history rules and new rules added in real time, and the data processing engine only needs to process data without managing the rules. And the rule management process and the log data stream processing process are decoupled, so that later rule management is facilitated.
Step S303: for each available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object conforms to the available rule.
As described above, this embodiment processes the log data stream by the log data stream processing method. Because the preset rule engine performs rule management, only the available rules in the preset rule engine need to be called. The operation data corresponding to each object is matched against each available rule (for example, by determining whether the value of the corresponding field in the operation data is within the range required by the available rule), so that the operation data conforming to each available rule can be screened out (for example, operation data whose corresponding field value is within the range required by an available rule is determined to conform to that rule). In implementation, a notification of the operation data conforming to each available rule and of the corresponding objects may further be issued, which is not specifically limited in this embodiment.
According to the above scheme, rules need not be predefined in the data processing engine. If a new rule is required after the data processing engine has started, the new rule can be triggered in real time through an external system; after the data processing engine processes the new rule into a broadcast state stream, the task node dynamically converts the format of the rule and adds it to the preset rule engine, where it can be invoked. In addition, because rules are managed by the preset rule engine, the rule management process is decoupled from the log data stream processing process, which facilitates later rule maintenance, lets the data processing engine concentrate on data processing, and improves system operation efficiency.
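By way of example only, a range check of this kind can be sketched as follows. The rule layout (a `field` name with optional exclusive bounds `gt` and `lt`) and the function names `matches` and `screen` are assumptions introduced for this sketch; the application does not define a rule schema.

```python
def matches(record, rule):
    """Return True if the rule's field is present and lies within the rule's bounds."""
    value = record.get(rule["field"])
    if value is None:
        return False  # the record does not carry the field the rule inspects
    if "gt" in rule and not value > rule["gt"]:
        return False
    if "lt" in rule and not value < rule["lt"]:
        return False
    return True


def screen(records, available_rules):
    """For each available rule, collect the records that conform to it."""
    return {
        rule_id: [record for record in records if matches(record, rule)]
        for rule_id, rule in available_rules.items()
    }
```

A rule such as `{"field": "cpu_usage", "gt": 80}` then selects exactly the records whose CPU usage exceeds 80.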
Because rules can be newly added in real time through the external system, the data sources include not only the conventional running log but also rule data. To avoid data confusion, this embodiment is provided with a plurality of topics, by which the running log and the rule data are distinguished.
Referring to FIG. 4, Kafka creates topic1 and topic2; Flink reads the running log from topic1 and the rule data from topic2.
Referring to fig. 5, there are multiple distributed task nodes in the data processing engine Flink (fig. 5 uses task1, task2 and task3 as an example; more or fewer tasks may be set in implementation);
the task node can directly read the running log, but cannot directly read the rule data;
therefore, the data processing engine is also provided with a reading node and a broadcast node; the reading node acquires rule data from the topic characterizing rules; the broadcast node converts the rule data into a broadcast state stream and broadcasts it; task1, task2 and task3 can all receive the broadcast state stream.
In some alternative embodiments, the task node may obtain the log data stream by:
acquiring a running log from the topic characterizing logs;
and based on the identification of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
As described above, the running log and the rule data can be distinguished by topic. The task node can directly obtain the running log from the topic characterizing logs, identify each object (each host or virtual machine) from the running log, and group the data in the running log by object to obtain the operation data corresponding to each object.
Illustratively, task1 obtains a running log (DataStream 1) from topic1, where the running log contains data related to host A, data related to host B, and data related to virtual machine C; task1 groups the related data by host A, host B and virtual machine C to obtain the operation data (KeyedStream) corresponding to host A (the key), the operation data corresponding to host B, and the operation data corresponding to virtual machine C.
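The grouping in this example can be pictured with a short plain-Python sketch. It is not the keyed-stream mechanism of the data processing engine itself, and the field name `object_id` and the function `group_by_object` are hypothetical.

```python
from collections import defaultdict


def group_by_object(running_log, id_field="object_id"):
    """Group log entries by the identifier of the object that produced them."""
    grouped = defaultdict(list)
    for entry in running_log:
        grouped[entry[id_field]].append(entry)
    return dict(grouped)


# Hypothetical running log containing entries for host A, host B and virtual machine C.
running_log = [
    {"object_id": "hostA", "cpu": 90},
    {"object_id": "hostB", "cpu": 10},
    {"object_id": "vmC", "cpu": 55},
    {"object_id": "hostA", "cpu": 95},
]
operation_data = group_by_object(running_log)
```

Each key of `operation_data` then plays the role of one object, and its list holds that object's operation data in arrival order.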
In some alternative embodiments, the task node may obtain the broadcast state stream by:
receiving the broadcast state stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting rule data after receiving the rule data, and the rule data is obtained by a reading node in the data processing engine from the topic characterizing rules.
As described above, the running log and the rule data can be distinguished by topic. The task nodes are distributed nodes and cannot directly acquire rule data from the topic characterizing rules; therefore the reading node acquires the rule data (DataStream 2) from the topic characterizing rules, the broadcast node defines a suitable state descriptor, converts the rule data into a broadcast state stream and broadcasts it, and the task node can then receive the broadcast state stream.
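As a rough illustration of the effect of broadcasting, the sketch below delivers every rule to every task node, so that each node holds its own copy of the rule state. The classes and functions here (`TaskNode`, `broadcast_rules`) are hypothetical and only imitate the behavior in plain Python; they do not correspond to the engine's actual broadcast mechanism or state descriptor.

```python
class TaskNode:
    """Hypothetical task node holding a local copy of the broadcast rule state."""

    def __init__(self, name):
        self.name = name
        self.broadcast_state = {}  # rule_id -> rule

    def on_broadcast(self, rule_id, rule):
        self.broadcast_state[rule_id] = rule


def broadcast_rules(rule_data, task_nodes):
    """Deliver each rule read from the rule topic to every task node."""
    for rule in rule_data:
        for task in task_nodes:
            # Each node receives its own copy, so local changes do not leak.
            task.on_broadcast(rule["id"], dict(rule))


tasks = [TaskNode(name) for name in ("task1", "task2", "task3")]
broadcast_rules([{"id": "cpu_high", "field": "cpu_usage", "gt": 80}], tasks)
```

After the call, task1, task2 and task3 all see the same rule even though none of them read the rule topic directly.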
Correspondingly, fig. 6 is a flow chart of a second data processing method according to an embodiment of the present application, applied to task nodes in a data processing engine, as shown in fig. 6, including the following steps:
Step S601: acquiring a running log from the topic characterizing logs; and based on the identification of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
Step S602: receiving the broadcast state stream broadcast by the broadcast node in the data processing engine.
The broadcast state stream is obtained by the broadcast node converting rule data after receiving the rule data; the rule data is obtained by a reading node in the data processing engine from the topic characterizing rules.
The order of steps S601 and S602 is not particularly limited in this embodiment.
Step S603: connecting the log data stream and the broadcast state stream to obtain the data to be processed.
Step S604: if the broadcast state stream characterizes a newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format to a preset rule engine.
The preset format is a format corresponding to the preset rule engine.
Step S605: for any available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object complies with each available rule.
The specific implementation of steps S601 to S605 may refer to the above embodiment, and will not be described herein.
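Steps S604 and S605 can be sketched in plain Python as below. The sketch assumes a rule arrives as a dictionary with `rule_id`, `field`, `op`, and `threshold`, and that the "preset format" is a dictionary holding a comparison function; both are assumptions made for illustration, as are the names `RuleEngine` and `process`.

```python
import operator

# Assumed mapping from textual operators to comparison functions.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq}


class RuleEngine:
    """Toy stand-in for the preset rule engine."""

    def __init__(self):
        self.rules = {}

    def add_rule(self, raw_rule):
        # Convert the broadcast rule into the engine's (assumed) preset format.
        self.rules[raw_rule["rule_id"]] = {
            "field": raw_rule["field"],
            "compare": OPS[raw_rule["op"]],
            "threshold": float(raw_rule["threshold"]),
            "available": True,
        }

    def match(self, record):
        """Return {rule_id: bool} for every available rule."""
        results = {}
        for rule_id, rule in self.rules.items():
            if not rule["available"]:
                continue
            value = record.get(rule["field"])
            results[rule_id] = (
                value is not None and rule["compare"](value, rule["threshold"])
            )
        return results


def process(engine, event):
    """Handle one element of the connected stream (a rule event or a log event)."""
    if event["kind"] == "rule" and event.get("action") == "add":
        engine.add_rule(event["rule"])
        return None
    if event["kind"] == "log":
        return engine.match(event["record"])
    return None
```

A rule event changes the engine and produces no match result; a log event is matched against whatever rules are available at that moment, which is how rules added at runtime take effect without restarting the task.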
Fig. 7 is a flow chart of a third data processing method according to an embodiment of the present application, applied to a task node in a data processing engine, as shown in fig. 7, including the following steps:
Step S701: obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object.
Step S702: if the broadcast state stream represents the newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format into a preset rule engine.
The preset format is a format corresponding to the preset rule engine.
The specific implementation manner of the steps S701 to S702 may refer to the above embodiment, and will not be described herein.
Step S703: determining key fields in the operation data corresponding to each object.
In practice, each rule is typically a value range for a certain metric, for example a host central processing unit (CPU) usage greater than 80%. The operation data, however, may contain a large amount of useless information (information not needed when matching rules).
Based on this, once the key fields in the operation data are identified, subsequent matching against each available rule can be performed directly on the information corresponding to the key fields.
Step S704: for any available rule in the preset rule engine, matching the information corresponding to the key fields in the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object complies with each available rule.
Illustratively, fields such as CPU usage and memory usage are taken as key fields and marked with keyword identifiers;
if the available rule concerns CPU usage, the information corresponding to CPU usage is located in the operation data based on the keyword identifier and matched against the available rule (that is, whether the information falls within the range required by the available rule) to determine whether the operation data corresponding to each object complies with the available rule, so that the operation data does not need to be traversed for every match.
According to the above scheme, because the operation data may contain a large amount of useless information (information not needed when matching rules), the key fields in the operation data corresponding to each object are determined before rule matching, so that rule matching can be performed accurately and efficiently based on the information corresponding to the key fields.
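A minimal Python sketch of key-field extraction followed by range matching is given below. The field names in `KEY_FIELDS`, the record layout, and the representation of a rule as an open-closed range (`low`, `high`) are assumptions made for illustration.

```python
# Assumed key fields; a real deployment would derive these from its rules.
KEY_FIELDS = ("cpu_usage", "memory_usage")


def extract_key_fields(record, key_fields=KEY_FIELDS):
    """Keep only the fields that rule matching needs."""
    return {name: record[name] for name in key_fields if name in record}


def complies(key_info, rule):
    """Check whether the value of the rule's field lies in (low, high]."""
    value = key_info.get(rule["field"])
    if value is None:
        return False
    return rule["low"] < value <= rule["high"]


# Hypothetical operation data with fields that rule matching never uses.
record = {
    "object_id": "host_A",
    "hostname": "node-17",
    "kernel": "5.10",
    "cpu_usage": 85.0,
    "memory_usage": 30.0,
}
# "CPU usage greater than 80%" expressed as the range (80, 100].
cpu_rule = {"field": "cpu_usage", "low": 80.0, "high": 100.0}

key_info = extract_key_fields(record)
result = complies(key_info, cpu_rule)
```

Extraction happens once per record, after which each rule performs a single dictionary lookup instead of scanning the whole record.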
In some alternative embodiments, on the basis of any of the above embodiments, the following steps may be further performed:
for any available rule in the preset rule engine, determining the number of times that the operation data corresponding to each object complies with the available rule within a preset duration;
and if the number of times that an object complies with the available rule is greater than the preset number of times corresponding to the available rule, determining that the object meets the aggregation condition of the available rule.
In implementation, some rules may have special requirements: it is necessary to determine not only whether an available rule is complied with, but also how many times the operation data of the same object complies with the available rule within a period of time, so as to determine whether each object meets the aggregation condition of the rule.
The aggregation condition here means that the number of times the rule is complied with reaches the required number, which allows more complex matching requirements to be met.
Illustratively, at time T1, the operation data of host A complies with available rule 1; at time T2, the operation data of host A complies with available rule 1; at time T3, the operation data of host A complies with available rule 1; at time T4, the operation data of host A does not comply with available rule 1; at time T5, the operation data of host A does not comply with available rule 1; at time T6, the operation data of host A complies with available rule 1;
the preset duration is from T1 to T6, and the operation data corresponding to host A complies with available rule 1 four times; if the preset number of times corresponding to available rule 1 is 3, host A meets the aggregation condition of available rule 1.
The above process is an exemplary illustration of how the aggregation condition is determined, and the present embodiment is not particularly limited thereto.
According to the above scheme, a preset number of times is set for each rule; after it is determined whether each piece of operation data complies with an available rule, the number of times that the operation data corresponding to each object complies with that rule within the preset duration can be counted. The counted number is compared with the preset number, and if it is greater than the preset number, the object meets the aggregation condition of the corresponding available rule. The matching requirements of complex rules are thus met through the cooperation of the preset rule engine and the data processing engine.
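The host A example can be reproduced with a small sliding-window counter. This is a sketch under stated assumptions: the class name `AggregationCounter` is hypothetical, times T1 to T6 are modelled as the integers 1 to 6, the window is taken to be 6 time units, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque


class AggregationCounter:
    """Count rule hits per object within a sliding window (one rule)."""

    def __init__(self, window, preset_count):
        self.window = window              # preset duration, in time units
        self.preset_count = preset_count  # preset number of times for the rule
        self.hits = {}                    # object_id -> deque of hit timestamps

    def record(self, object_id, timestamp, complied):
        """Register one match result; return True if the aggregation condition holds."""
        hits = self.hits.setdefault(object_id, deque())
        if complied:
            hits.append(timestamp)
        # Discard hits that have fallen out of the window.
        while hits and hits[0] <= timestamp - self.window:
            hits.popleft()
        # The condition is "greater than", not "greater than or equal to".
        return len(hits) > self.preset_count


# Host A versus available rule 1 at T1..T6, as in the example above.
observations = [(1, True), (2, True), (3, True), (4, False), (5, False), (6, True)]
counter = AggregationCounter(window=6, preset_count=3)
results = [counter.record("host_A", t, ok) for t, ok in observations]
```

With a preset number of 3, the condition first holds at T6, when the fourth hit arrives; three hits at T3 are not enough because the comparison is strict.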
In some alternative embodiments, on the basis of any of the above embodiments, the following steps may be further performed:
if the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
and determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
In implementation, there may be a need not only to add rules but also to delete rules;
based on this, an instruction to delete a rule can be triggered by an external system, in which case the corresponding broadcast state stream characterizes a deletion rule and carries a rule identifier (in implementation, the rules in the preset rule engine can be encoded, and the code is used as the rule identifier);
the corresponding rule is then determined from the preset rule engine based on the rule identifier and marked as an unavailable rule, so that it is no longer matched subsequently.
That is, the preset rule engine may contain unavailable rules, and the operation data does not need to be matched against them.
In the above scheme, because practical applications need to delete rules as well as add them, if the broadcast state stream characterizes a deletion rule (that is, an instruction to delete a rule triggered by an external system has been received), the rule identifier in the broadcast state stream is determined (that is, which rule needs to be deleted); the corresponding rule is determined from the preset rule engine based on the rule identifier and marked as an unavailable rule, so that it is no longer matched subsequently. In addition, because the rule is marked as unavailable rather than deleted directly from the preset rule engine, relevant personnel can still query it or restore it for use, which better suits a production environment.
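Marking a rule as unavailable instead of removing it can be sketched as follows. The class `RuleRegistry`, its methods, and the shape of the deletion event are hypothetical names chosen for illustration; the `restore` method reflects the recovery possibility mentioned above rather than a step of the described method.

```python
class RuleRegistry:
    """Toy rule store that soft-deletes rules by flagging them unavailable."""

    def __init__(self):
        self.rules = {}  # rule_id -> {"rule": ..., "available": bool}

    def add(self, rule_id, rule):
        self.rules[rule_id] = {"rule": rule, "available": True}

    def handle_delete(self, event):
        """Mark the rule named in a deletion event as unavailable."""
        entry = self.rules.get(event["rule_id"])
        if entry is None:
            return False  # unknown rule identifier: nothing to mark
        entry["available"] = False
        return True

    def restore(self, rule_id):
        """Make a previously unavailable rule available again."""
        entry = self.rules.get(rule_id)
        if entry is None:
            return False
        entry["available"] = True
        return True

    def available_rules(self):
        """Only these rules take part in matching."""
        return {rid: e["rule"] for rid, e in self.rules.items() if e["available"]}


registry = RuleRegistry()
registry.add("r1", {"field": "cpu_usage", "op": ">", "threshold": 80})
registry.add("r2", {"field": "memory_usage", "op": ">", "threshold": 90})
deleted = registry.handle_delete({"action": "delete", "rule_id": "r1"})
```

After the deletion event, `r1` is skipped by matching but remains stored, so it can still be inspected or restored later.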
As shown in fig. 8, an embodiment of the present application provides a data processing apparatus 800, including:
a data acquisition module 801, configured to acquire data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
a broadcast stream processing module 802, configured to, if the broadcast status stream represents the newly added rule, convert the rule in the broadcast status stream into a rule in a preset format, and add the rule in the preset format to a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
a log stream processing module 803, configured to, for any available rule in the preset rule engine, match the operation data corresponding to each object against the available rule, and determine whether the operation data corresponding to each object complies with each available rule.
In some alternative embodiments, the apparatus further includes an aggregation determination module 804, configured to:
for any available rule in the preset rule engine, determine the number of times that the operation data corresponding to each object complies with the available rule within a preset duration;
and if the number of times that an object complies with the available rule is greater than the preset number of times corresponding to the available rule, determine that the object meets the aggregation condition of the available rule.
In some alternative embodiments, the data acquisition module 801 is configured to obtain the log data stream by:
acquiring a running log from the topic characterizing the logs;
and, based on the identifier of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
In some alternative embodiments, the data acquisition module 801 is configured to obtain the broadcast state stream by:
receiving the broadcast state stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting the rule data after receiving it; the rule data is obtained by a reading node in the data processing engine from the topic characterizing the rules.
In some optional embodiments, before the log stream processing module 803 matches the running data corresponding to each object with the available rule, the log stream processing module is further configured to:
determining key fields in the operation data corresponding to each object;
the log stream processing module 803 is specifically configured to:
and matching information corresponding to key fields in the operation data corresponding to the objects with the available rules.
In some alternative embodiments, the broadcast stream processing module 802 is further configured to:
if the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
and determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
Since the apparatus is the apparatus corresponding to the method in the embodiments of the present application, and the principle by which it solves the problem is similar to that of the method, the implementation of the apparatus may refer to the implementation of the method, and repeated description is omitted.
Based on the same technical concept, the embodiment of the present application further provides an electronic device 900, as shown in fig. 9, including at least one processor 901, and a memory 902 connected to the at least one processor, where a specific connection medium between the processor 901 and the memory 902 is not limited in the embodiment of the present application, and in fig. 9, the connection between the processor 901 and the memory 902 is exemplified by a bus 903. The buses may be divided into address buses, data buses, control buses, etc. For ease of illustration, only one thick line is shown in fig. 9, but not only one bus or one type of bus.
The processor 901 is the control center of the electronic device; it may connect the various parts of the electronic device through various interfaces and lines, and implements data processing by running or executing instructions stored in the memory 902 and calling data stored in the memory 902. Optionally, the processor 901 may include one or more processing units, and the processor 901 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles issuing instructions. It will be appreciated that the modem processor may alternatively not be integrated into the processor 901. In some embodiments, the processor 901 and the memory 902 may be implemented on the same chip; in other embodiments, they may be implemented on separate chips.
The processor 901 may be a general purpose processor such as a CPU, digital signal processor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with an embodiment of a data processing method may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
The memory 902 is a non-volatile computer-readable storage medium that can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 902 may include at least one type of storage medium, for example flash memory, a hard disk, a multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, a magnetic disk, an optical disk, and the like. The memory 902 may be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 902 of embodiments of the present application may also be a circuit or any other device capable of performing a storage function, for storing program instructions and/or data.
In an embodiment of the present application, the memory 902 stores a computer program that, when executed by the processor 901, causes the processor 901 to perform:
Obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
if the broadcast state stream represents the newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format into a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
and, for any available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object complies with each available rule.
In some alternative embodiments, the processor 901 further performs:
for any available rule in the preset rule engine, determining the number of times that the operation data corresponding to each object complies with the available rule within a preset duration;
and if the number of times that an object complies with the available rule is greater than the preset number of times corresponding to the available rule, determining that the object meets the aggregation condition of the available rule.
In some alternative embodiments, the processor 901 further performs:
acquiring a running log from the topic characterizing the logs;
and, based on the identifier of each object in the running log, grouping the data in the running log to obtain the operation data corresponding to each object.
In some alternative embodiments, the processor 901 further performs:
receiving the broadcast state stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting the rule data after receiving it; the rule data is obtained by a reading node in the data processing engine from the topic characterizing the rules.
In some optional embodiments, before matching the running data corresponding to each object with the available rule, the processor 901 further performs:
determining key fields in the operation data corresponding to each object;
the processor 901 specifically performs:
and matching information corresponding to key fields in the operation data corresponding to the objects with the available rules.
In some alternative embodiments, processor 901 further performs:
if the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
And determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
Because the electronic device is the electronic device corresponding to the method in the embodiments of the present application, and the principle by which it solves the problem is similar to that of the method, the implementation of the electronic device may refer to the implementation of the method, and repeated description is omitted.
Based on the same technical idea, the embodiments of the present application also provide a computer-readable storage medium storing a computer program executable by a processor, which when run on the processor, causes the processor to perform the steps of the above-described data processing method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A data processing method for a task node in a data processing engine, the method comprising:
Obtaining data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
if the broadcast state stream represents the newly added rule, converting the rule in the broadcast state stream into a rule in a preset format, and adding the rule in the preset format into a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
and, for any available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object complies with each available rule.
2. The method of claim 1, wherein the method further comprises:
for any available rule in the preset rule engine, determining the number of times that the operation data corresponding to each object complies with the available rule within a preset duration;
and if the number of times that the object accords with the available rule is greater than the preset number of times corresponding to the available rule, determining that the object meets the aggregation condition of the available rule.
3. The method of claim 1, wherein the log data stream is obtained by:
acquiring a running log from the topic characterizing the logs;
and based on the identification of each object in the running log, grouping the data in the running log to obtain the running data corresponding to each object.
4. The method of claim 1, wherein the broadcast status stream is obtained by:
receiving the broadcast status stream broadcast by a broadcast node in the data processing engine;
wherein the broadcast state stream is obtained by the broadcast node converting the rule data after receiving it; the rule data is obtained by a reading node in the data processing engine from the topic characterizing the rules.
5. The method of claim 1, wherein before matching the operation data corresponding to each object with the available rule, respectively, further comprises:
determining key fields in the operation data corresponding to each object;
matching the operation data corresponding to each object with the available rule respectively, wherein the matching comprises the following steps:
and matching information corresponding to key fields in the operation data corresponding to the objects with the available rules.
6. The method of any one of claims 1-5, further comprising:
If the broadcast state stream characterizes the deletion rule, determining a rule identifier in the broadcast state stream;
and determining a corresponding rule from the preset rule engine based on the rule identification, and marking the corresponding rule as an unavailable rule.
7. A data processing apparatus for use in a task node in a data processing engine, the apparatus comprising:
the data acquisition module is used for acquiring data to be processed; the data to be processed comprises a log data stream and a broadcast state stream, wherein the log data stream comprises operation data corresponding to each object;
the broadcast stream processing module is used for converting the rule in the broadcast state stream into a rule in a preset format if the broadcast state stream represents the newly added rule, and adding the rule in the preset format into a preset rule engine; wherein, the preset format is the format corresponding to the preset rule engine;
the log stream processing module is used for, for any available rule in the preset rule engine, matching the operation data corresponding to each object against the available rule, and determining whether the operation data corresponding to each object complies with each available rule.
8. The apparatus of claim 7, further comprising an aggregation determination module to:
for any available rule in the preset rule engine, determining the number of times that the operation data corresponding to each object complies with the available rule within a preset duration;
and if the number of times that the object accords with the available rule is greater than the preset number of times corresponding to the available rule, determining that the object meets the aggregation condition of the available rule.
9. An electronic device comprising at least one processor and at least one memory, wherein the memory stores a computer program that, when executed by the processor, causes the processor to perform the method of any of claims 1-6.
10. A computer readable storage medium, characterized in that it stores a computer program executable by a computer, which when run on the computer causes the computer to perform the method according to any one of claims 1 to 6.
CN202310585124.XA 2023-05-23 2023-05-23 Data processing method and device, electronic equipment and storage medium Pending CN116610736A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310585124.XA CN116610736A (en) 2023-05-23 2023-05-23 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116610736A true CN116610736A (en) 2023-08-18

Family

ID=87683171

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310585124.XA Pending CN116610736A (en) 2023-05-23 2023-05-23 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116610736A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination