CN113806191A

CN113806191A - Data processing method, device, equipment and storage medium

Info

Publication number: CN113806191A
Application number: CN202110911830.XA
Authority: CN
Inventors: 诸葛晓亚
Original assignee: Zhejiang Geely Holding Group Co Ltd; Hangzhou Youxing Technology Co Ltd
Current assignee: Zhejiang Geely Holding Group Co Ltd; Hangzhou Youxing Technology Co Ltd
Priority date: 2021-08-10
Filing date: 2021-08-10
Publication date: 2021-12-17

Abstract

The application relates to a data processing method, a device, equipment and a storage medium, wherein the method comprises the following steps: acquiring a data set to be analyzed in a preset format; acquiring a rule set; matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed; performing aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type; and generating alarm prompt information based on the alarm data. Through the steps, the log data are reasonably analyzed and processed, and the effectiveness of the log data processing process can be improved.

Description

Data processing method, device, equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a data processing method, apparatus, device, and storage medium.

Background

In enterprise-level operation and maintenance and business operation scenarios, logs are playing an increasingly important role. Simply storing service logs locally makes it difficult to mine the real data value behind the logs, so enterprises collect and aggregate the logs from all servers for centralized log management.

After the logs are stored in the centralized server, how to process the logs into indexes for guiding operation and maintenance and operation becomes an increasingly urgent need of enterprises.

At present, the traditional log statistics and analysis system cannot meet the core requirements of enterprise users, and how to efficiently process and use log data is a problem to be solved urgently.

Disclosure of Invention

The embodiment of the application provides a data processing method, a data processing device and a storage medium, and the effectiveness of a log data processing process can be improved by reasonably analyzing and processing log data.

In one aspect, an embodiment of the present application provides a data processing method, including:

acquiring a data set to be analyzed in a preset format;

acquiring a rule set; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules;

matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed;

performing aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type;

and generating alarm prompt information based on the alarm data.

Optionally, the obtaining of the data to be analyzed in the preset format includes:

acquiring log data corresponding to the index type from the message middleware;

and analyzing the log data to obtain the data to be analyzed in a preset format.

Optionally, obtaining the rule set includes:

acquiring broadcast information from a preset storage area; the broadcast information includes a set of rules;

the preset storage area comprises a MySQL database or an Elasticsearch database.

Optionally, the identifier of each piece of data to be analyzed includes an application identifier and a data identifier;

matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set includes:

for each data to be analyzed in the data set to be analyzed: determining the identifier of the rule corresponding to each data to be analyzed from the mapping relation according to the application identifier and the data identifier of each data to be analyzed;

and acquiring a rule corresponding to each data to be analyzed from the rule set according to the identifier of the rule corresponding to each data to be analyzed, and assembling the rule corresponding to each data to be analyzed and each data to be analyzed to obtain a matched data set.

Optionally, performing aggregation processing on the matching data set based on the type of the rule corresponding to each piece of data to be analyzed to obtain alarm data, where the method includes:

for each data to be analyzed: if the type of the rule corresponding to the data to be analyzed is an alarm type, determining the data to be analyzed as the alarm data to be judged to obtain an alarm data set to be judged;

performing first aggregation processing according to the alarm type corresponding to each alarm data to be judged in the alarm data set to be judged to obtain a plurality of alarm data subsets to be judged; each alarm data subset to be judged in the plurality of alarm data subsets to be judged corresponds to different alarm types; each alarm data to be judged in each alarm data subset to be judged corresponds to the same alarm type;

for each alarm data subset to be judged: performing second aggregation processing to obtain alarm data corresponding to the alarm data subset to be judged.

Optionally, generating an alarm prompt message based on the alarm data includes:

acquiring historical reference data;

determining a change rate according to the alarm data and the historical reference data;

and if the change rate does not meet the preset range, generating corresponding alarm prompt information.

Optionally, the aggregating, based on the type of the rule corresponding to each piece of data to be analyzed, of the matching data set, further includes:

for each data to be analyzed: if the type of the rule corresponding to the data to be analyzed is a statistical type, determining the data to be analyzed as the data to be counted to obtain a data set to be counted;

carrying out aggregation processing on the data set to be counted;

and sending the aggregated data to be counted to a preset storage area for storage.

In another aspect, an embodiment of the present application provides a data processing apparatus, including:

the first acquisition module is used for acquiring a data set to be analyzed in a preset format;

the second acquisition module is used for acquiring a rule set; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules;

the first determining module is used for matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed;

the second determining module is used for carrying out aggregation processing on the matched data set based on the type of the rule corresponding to each piece of data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type;

and the generating module is used for generating alarm prompt information based on the alarm data.

In another aspect, an embodiment of the present application provides an apparatus, where the apparatus includes a processor and a memory, where the memory stores at least one instruction or at least one program, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the data processing method.

In another aspect, an embodiment of the present application provides a computer storage medium, where at least one instruction or at least one program is stored in the storage medium, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the data processing method.

The data processing method, the device, the equipment and the storage medium provided by the embodiment of the application have the following beneficial effects:

acquiring a data set to be analyzed in a preset format; acquiring a rule set; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules; matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed; performing aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type; and generating alarm prompt information based on the alarm data. Through the steps, the log data are reasonably analyzed and processed, and the effectiveness of the log data processing process can be improved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description illustrate only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present application;

FIG. 2 is a schematic flowchart of a process for acquiring a data set to be analyzed in a preset format according to an embodiment of the present application;

FIG. 3 is a schematic diagram of a data processing framework provided by an embodiment of the present application;

FIG. 4 is a schematic flowchart of how to obtain a matching data set according to an embodiment of the present application;

FIG. 5 is a schematic flowchart of how to obtain alarm data according to an embodiment of the present application;

FIG. 6 is a schematic flowchart of a data processing method according to an embodiment of the present application;

FIG. 7 is a schematic flowchart of a process of generating alarm prompt information based on alarm data according to an embodiment of the present application;

FIG. 8 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 9 is a hardware block diagram of a server in a data processing method according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

The data processing object in the present application mainly refers to log data, and the log data may include system log data, application log data, and security log data. Through log data, system operation and maintenance personnel and developers can learn the software and hardware information of the server and identify errors in the configuration process and their causes, so that measures can be taken to correct the errors in time; operators can learn various service indexes through log data, so that actual service problems can be solved better.

In the embodiment of the application, a data set to be analyzed and a rule set in a preset format are obtained through a server; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules; then, the server matches the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed; then, the server carries out aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type; and finally, the server generates alarm prompt information based on the alarm data. Optionally, the server may include an independent physical server, or a server cluster or a distributed system formed by a plurality of physical servers, or may be a cloud server that provides basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and artificial intelligence platform, and the like.

Specific embodiments of the data processing method of the present application are described below. FIG. 1 is a schematic flowchart of a data processing method provided in an embodiment of the present application. This specification provides the method operation steps as described in the embodiments or the flowchart, but more or fewer operation steps may be included based on conventional or non-inventive labor. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order of execution. In practice, a system or server product may execute the steps sequentially or in parallel (e.g., in a parallel-processor or multi-threaded environment) according to the embodiments or the methods shown in the figures. Specifically, as shown in FIG. 1, the method may include:

S101: acquiring a data set to be analyzed in a preset format.

In the embodiment of the application, the data set to be analyzed comprises a plurality of pieces of log data, and the structure of each piece of log data in the plurality of pieces of log data is in a preset format.

In an alternative embodiment, the step S101, as shown in fig. 2, may include:

S1011: acquiring log data corresponding to the index type from the message middleware;

S1012: analyzing the log data to obtain the data to be analyzed in a preset format.

Specifically, FIG. 3 is a schematic diagram of a data processing framework provided in an embodiment of the present application. The message middleware may be Kafka, a high-throughput distributed publish-subscribe messaging system that can process large amounts of data in real time to meet various demand scenarios. Using a centralized log management mode, log data on all servers is collected in advance through Filebeat and then sent to the corresponding type (topic) in Kafka. For example, index data appearing in an actual service scenario is stored in an index type, and when the index data needs to be analyzed, the corresponding log data can be obtained according to the index type. The preset format may be a key-value format, and analyzing the acquired log data yields data in the key-value format. For example, if the acquired original log data is "2021-05-10 22:29:35.299 Hangzhou | a official | 100", the data to be analyzed obtained after analysis, in the key-value format, is {"city": "Hangzhou", "service line": "a official", "order quantity": "100"}. In this way, the data set to be analyzed in the key-value format is obtained by analyzing the acquired multiple pieces of original log data.
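
The analysis step S1012 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source does not describe how the parser is configured, so the field order in `FIELDS`, the `|` delimiter handling, and the function name `parse_log_line` are hypothetical, and the sample line simply follows the shape of the example above.

```python
from typing import Dict

# Hypothetical field order for the '|'-separated payload; the source shows
# only the resulting keys, not the parser configuration.
FIELDS = ("city", "service line", "order quantity")


def parse_log_line(line: str) -> Dict[str, str]:
    """Turn 'DATE TIME v1 | v2 | v3' into a key-value record."""
    date, time, payload = line.split(" ", 2)
    values = [part.strip() for part in payload.split("|")]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    # Keep the timestamp so later steps can aggregate by time window.
    record["timestamp"] = f"{date} {time}"
    return record


record = parse_log_line("2021-05-10 22:29:35.299 Hangzhou | a official | 100")
```

Values are kept as strings here, as in the key-value example; converting "order quantity" to a number would be a choice for the later aggregation step.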

S103: acquiring a rule set; the rule set includes a plurality of rules and an identification, a type of each rule in the plurality of rules.

In the embodiment of the present application, the rule set refers to rule flow data. As shown in FIG. 3, the present application processes streaming data based on the Flink framework; that is, rule flow data is obtained from an external system, and a plurality of pieces of log data are then matched and assembled with the rule flow data, so that the rule stream and the data stream are combined. The specific processing logic may be defined in a processing function.

In an optional implementation manner, the step S103 may include:

S1031: acquiring broadcast information from a preset storage area; the broadcast information includes a set of rules.

Specifically, as shown in FIG. 3, a data source is predefined as the source of the rule flow data; for example, a MySQL database or an Elasticsearch database may be used as the data source. The rule flow data is broadcast periodically, and a thread is set to acquire the rule set periodically, so that the rules can be updated in near real time.
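
The periodic acquisition of the rule set can be illustrated with a small plain-Python stand-in. It is not Flink's broadcast mechanism and not the applicant's implementation: `RuleCache`, `loader`, and the in-memory `store` that plays the role of the MySQL or Elasticsearch source are all hypothetical names introduced for this sketch.

```python
import threading
from typing import Callable, Dict, Optional

Rule = Dict[str, str]


class RuleCache:
    """Holds the latest rule set and reloads it from a loader callable."""

    def __init__(self, loader: Callable[[], Dict[str, Rule]]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._rules: Dict[str, Rule] = {}

    def refresh(self) -> None:
        rules = self._loader()          # e.g. a database query in practice
        with self._lock:
            self._rules = rules         # swap in the whole new rule set

    def get(self, rule_id: str) -> Optional[Rule]:
        with self._lock:
            return self._rules.get(rule_id)

    def start(self, interval_seconds: float) -> threading.Event:
        """Refresh in a background thread until the returned event is set."""
        stop = threading.Event()

        def loop() -> None:
            while not stop.is_set():
                self.refresh()
                stop.wait(interval_seconds)

        threading.Thread(target=loop, daemon=True).start()
        return stop


# In-memory stand-in for the preset storage area (assumption for the sketch).
store: Dict[str, Rule] = {"Rule1": {"id": "Rule1", "type": "alarm"}}
cache = RuleCache(lambda: dict(store))
cache.refresh()
```

Replacing the whole dictionary on each refresh means a rule removed from the source also disappears from the cache at the next interval, which is one way to get the near-real-time update behaviour described above.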

S105: matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed.

In the embodiment of the application, a mapping relation is configured in advance, and the mapping relation comprises an identifier of each piece of data to be analyzed in a data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed, so that the identifier of the rule corresponding to the identifier of each piece of data to be analyzed can be determined according to the mapping relation according to the identifier of each piece of data to be analyzed in the data set to be analyzed, and then the rule corresponding to the identifier of the rule is assembled with the data to be analyzed to form new matching data, so that a matching data set is obtained; each rule carries a type field, so that the matching data set can include each data to be analyzed and the type of the rule corresponding to each data to be analyzed.

In an optional embodiment, the identifier of each piece of data to be analyzed includes an application identifier and a data identifier; the application identifier represents the name of a service corresponding to the data to be analyzed, and the data identifier represents the number of fields corresponding to the data to be analyzed; then, the step S105, as shown in fig. 4, may include:

S1051: for each data to be analyzed in the data set to be analyzed: determining the identifier of the rule corresponding to each data to be analyzed from the mapping relation according to the application identifier and the data identifier of each data to be analyzed;

S1052: acquiring a rule corresponding to each data to be analyzed from the rule set according to the identifier of the rule corresponding to each data to be analyzed, and assembling the rule corresponding to each data to be analyzed and each data to be analyzed to obtain a matched data set.

Specifically, the mapping relationship can be shown in the following table:

Here, application name A and application name B are application identifiers, and Rule1, Rule2 and Rule3 are identifiers of different rules. The corresponding rule is determined through the application identifier and the data identifier carried by the data to be analyzed. For example, application name A may be dispatcher; correspondingly, data identifier 1 may be dispatcher-ten-field and data identifier 2 may be dispatcher-nine-field, so that data identifier 1 and data identifier 2 denote log data with different numbers of fields. In the mapping relationship, the data identifier and the rule identifier are in a one-to-many relationship; that is, each log data identifier may correspond to multiple rules of different types, such as an alarm type and a statistical type, and the same log data is matched and assembled with rules of different types to implement different subsequent business logic calculations. It should be noted that, according to the actual application scenario, different applications may freely configure a plurality of statistical and alarm rules.
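
Steps S1051 and S1052 amount to a keyed lookup followed by an assembly step, which the following Python sketch illustrates. The entries of `MAPPING` and the rule types in `RULES` follow the worked example given later in this description (data 1 to data 3, Rule1 to Rule3); the dictionary layout, the key names `app` and `data_id`, and the function `match` are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# (application identifier, data identifier) -> rule identifiers (one-to-many).
MAPPING: Dict[Tuple[str, str], List[str]] = {
    ("A", "1"): ["Rule1", "Rule2"],
    ("A", "2"): ["Rule3"],
    ("B", "1"): ["Rule1"],
}

# Each rule carries a type field, as stated for the rule set.
RULES: Dict[str, Dict[str, str]] = {
    "Rule1": {"id": "Rule1", "type": "alarm"},
    "Rule2": {"id": "Rule2", "type": "alarm"},
    "Rule3": {"id": "Rule3", "type": "statistic"},
}


def match(records: List[dict], mapping=MAPPING, rules=RULES) -> List[dict]:
    """Assemble every record with each rule its identifiers map to."""
    matched = []
    for record in records:
        key = (record["app"], record["data_id"])       # S1051: find rule ids
        for rule_id in mapping.get(key, []):
            rule = rules.get(rule_id)
            if rule is None:                           # rule not loaded yet
                continue
            matched.append(                            # S1052: assemble
                {"data": record, "rule_id": rule_id, "rule_type": rule["type"]}
            )
    return matched


records = [
    {"name": "data 1", "app": "A", "data_id": "1"},
    {"name": "data 2", "app": "A", "data_id": "2"},
    {"name": "data 3", "app": "B", "data_id": "1"},
]
matching_data = match(records)
```

Because the mapping is one-to-many, data 1 appears twice in `matching_data`, once per rule, so four pieces of matching data result from three records.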

S107: performing aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type.

S109: and generating alarm prompt information based on the alarm data.

In the embodiment of the application, as shown in FIG. 3, the window feature of the Flink framework makes it convenient to define a time window and count the events within a certain period. The matching data set is aggregated according to the type of the rule corresponding to each data to be analyzed to obtain alarm data, and when a threshold is reached within the time window, alarm prompt information is generated.
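
The idea of counting events per time window can be shown without Flink. The sketch below is a plain-Python illustration of a fixed-size (tumbling) window, not Flink's window API; the 60-second size, the threshold of 2, the sample timestamps, and all function names are assumptions chosen for the example.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Iterable, List

EPOCH = datetime(1970, 1, 1)


def window_start(timestamp: str, size_seconds: int = 60) -> datetime:
    """Return the start of the fixed-size window containing the timestamp."""
    moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
    elapsed = int((moment - EPOCH).total_seconds())
    return EPOCH + timedelta(seconds=elapsed - elapsed % size_seconds)


def count_per_window(timestamps: Iterable[str], size_seconds: int = 60) -> Counter:
    """Count how many events fall into each window."""
    return Counter(window_start(ts, size_seconds) for ts in timestamps)


def windows_over_threshold(counts: Counter, threshold: int) -> List[datetime]:
    """Windows whose event count has reached the threshold."""
    return sorted(start for start, n in counts.items() if n >= threshold)


timestamps = [
    "2021-05-10 22:29:35.299",
    "2021-05-10 22:29:59.000",
    "2021-05-10 22:30:01.500",
]
counts = count_per_window(timestamps)
hot_windows = windows_over_threshold(counts, threshold=2)
```

A streaming engine evaluates windows incrementally as events arrive; this batch version only shows how events are assigned to windows and compared with a threshold.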

In an alternative embodiment, step S107, as shown in fig. 5, may include:

S1071: for each data to be analyzed: if the type of the rule corresponding to the data to be analyzed is an alarm type, determining the data to be analyzed as the alarm data to be judged to obtain an alarm data set to be judged;

S1072: performing first aggregation processing according to the alarm type corresponding to each alarm data to be judged in the alarm data set to be judged to obtain a plurality of alarm data subsets to be judged; each alarm data subset to be judged in the plurality of alarm data subsets to be judged corresponds to a different alarm type; each alarm data to be judged in each alarm data subset to be judged corresponds to the same alarm type;

S1073: for each alarm data subset to be judged: performing second aggregation processing to obtain alarm data corresponding to the alarm data subset to be judged.

The above alternative embodiment is described below by way of a specific example. Based on the mapping relationship shown in the above table, it is assumed that the data set to be analyzed includes the following data based on the key-value format:

If the identifier of data 1 includes application name A and data identifier 1, matching and assembling according to the mapping relationship yields two pieces of matching data, namely matching data 1: data 1 + Rule1, and matching data 2: data 1 + Rule2.

If the identifier of data 2 includes application name A and data identifier 2, matching and assembling according to the mapping relationship yields matching data 3: data 2 + Rule3.

If the identifier of data 3 includes application name B and data identifier 1, matching and assembling according to the mapping relationship yields matching data 4: data 3 + Rule1.

During assembly, a new data number can be allocated to each piece of matching data for use in subsequent distribution. This avoids the performance bottleneck that arises when the subsequent aggregation operations of one application are excessively concentrated on a single thread.

If the types of Rule1 and Rule2 are alarm types, matching data 1, matching data 2 and matching data 4 are alarm data to be judged, and together they form the alarm data set to be judged. Because matching data 1 and matching data 4 use the same rule, their alarm types are necessarily the same. Assuming that Rule2 has an alarm type different from that of Rule1, two alarm data subsets to be judged are obtained after the first aggregation processing, namely subset 1 {matching data 1, matching data 4} and subset 2 {matching data 2}. During the second aggregation processing, all the matching data in subset 1 is calculated according to the definition of Rule1 to obtain the alarm data corresponding to subset 1, and the matching data in subset 2 is calculated according to the definition of Rule2 to obtain the alarm data corresponding to subset 2.
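
The two aggregation stages of this example can be traced in Python. The sketch is hedged in several ways: the alarm type labels `type-1` and `type-2`, the `order_quantity` values, and the per-type calculations in `compute` (a sum and a count) are invented for illustration, since the source does not state how Rule1 and Rule2 are defined.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Matched = Dict[str, object]


def first_aggregation(matched: List[Matched]) -> Dict[str, List[Matched]]:
    """S1071 + S1072: keep alarm-type data and group it by alarm type."""
    subsets: Dict[str, List[Matched]] = defaultdict(list)
    for item in matched:
        if item["rule_type"] == "alarm":
            subsets[item["alarm_type"]].append(item)
    return dict(subsets)


def second_aggregation(
    subsets: Dict[str, List[Matched]],
    compute: Dict[str, Callable[[List[Matched]], int]],
) -> Dict[str, int]:
    """S1073: apply each alarm type's own calculation to its subset."""
    return {alarm_type: compute[alarm_type](items)
            for alarm_type, items in subsets.items()}


def total_orders(items: List[Matched]) -> int:
    return sum(int(item["order_quantity"]) for item in items)


# Hypothetical values; only the rule assignments follow the example above.
matched = [
    {"name": "matching data 1", "rule_type": "alarm", "alarm_type": "type-1", "order_quantity": 100},
    {"name": "matching data 2", "rule_type": "alarm", "alarm_type": "type-2", "order_quantity": 100},
    {"name": "matching data 3", "rule_type": "statistic", "alarm_type": None, "order_quantity": 40},
    {"name": "matching data 4", "rule_type": "alarm", "alarm_type": "type-1", "order_quantity": 60},
]
compute = {"type-1": total_orders, "type-2": len}

subsets = first_aggregation(matched)
alarm_data = second_aggregation(subsets, compute)
```

Matching data 3 is dropped in the first stage because its rule is of the statistical type, which leaves subset 1 with matching data 1 and 4 and subset 2 with matching data 2.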

In an alternative embodiment, step S107, as shown in fig. 6, may further include:

S1074: for each data to be analyzed: if the type of the rule corresponding to the data to be analyzed is a statistical type, determining the data to be analyzed as the data to be counted to obtain a data set to be counted;

S1075: performing aggregation processing on the data set to be counted;

S1076: sending the aggregated data to be counted to a preset storage area for storage.

Specifically, the description continues with the above example. As shown in FIG. 3, assuming that the type of Rule3 is a statistical type, matching data 3 is determined as data to be counted, and matching data 3 constitutes the data set to be counted. Matching data 3 is calculated according to the definition of Rule3 to obtain the aggregated data to be counted, which is then sent to a preset storage area for storage. The preset storage area may be InfluxDB. InfluxDB is an open-source distributed time-series, event and metrics database that is commonly used for statistics on monitoring data and the like.
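
The statistical branch (S1074 to S1076) can be sketched as a group-and-sum followed by a write to storage. Everything concrete here is an assumption: the source does not define Rule3, so summing one field over configurable dimensions is only one plausible definition, the sample records (including the second city) are invented, and the `storage` list merely stands in for a real store such as InfluxDB.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def aggregate_statistics(
    records: List[Dict[str, str]],
    dimensions: Sequence[str],
    field: str,
) -> List[Dict[str, object]]:
    """Sum `field` for every distinct combination of `dimensions`."""
    totals: Dict[Tuple[str, ...], int] = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dimensions)
        totals[key] += int(record[field])
    rows = []
    for key, total in totals.items():
        row: Dict[str, object] = dict(zip(dimensions, key))
        row[field] = total
        rows.append(row)
    return rows


# Hypothetical data to be counted.
to_count = [
    {"city": "Hangzhou", "business line": "a official business", "order quantity": "100"},
    {"city": "Hangzhou", "business line": "a official business", "order quantity": "50"},
    {"city": "Ningbo", "business line": "a official business", "order quantity": "30"},
]

storage: List[Dict[str, object]] = []   # stand-in for the preset storage area
storage.extend(aggregate_statistics(to_count, ("city", "business line"), "order quantity"))
```

Three input records become two stored rows, which illustrates why storing aggregated results rather than raw logs reduces the volume of data kept.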

In an alternative embodiment, step S109, as shown in fig. 7, may include:

S1091: acquiring historical reference data;

S1092: determining a change rate according to the alarm data and the historical reference data;

S1093: if the change rate does not meet the preset range, generating corresponding alarm prompt information.

Specifically, each piece of data to be analyzed carries timestamp information, and the alarm data is obtained by aggregation over a preset time window. To further determine whether the alarm data is abnormal, historical reference data is acquired; the historical reference data may be the alarm data of the same time period in the past. The change rate is then calculated from the current alarm data and the past alarm data of the same time period and compared with a preset range; if the change rate does not meet the preset range, alarm prompt information is generated to notify the user. For example, for the alarm data {"city": "Hangzhou", "business line": "a official business", "order quantity": "100"}, the order quantity is counted under the alarm rule by city, by business line, and over a 1-minute dimension (the time dimension is configurable); if the order quantity has risen or fallen by more than 50% compared with the same period yesterday, alarm prompt information is generated based on the alarm data so that the user can pay attention in time.
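
Steps S1091 to S1093 reduce to a relative-change calculation and a range check. The sketch below assumes the usual definition of change rate, (current - reference) / reference, which the source does not spell out, and takes the preset range to be -50% to +50% as in the example; the function names and the reference values 250 and 90 are hypothetical.

```python
def change_rate(current: float, reference: float) -> float:
    """Relative change of the current value against the historical reference."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (current - reference) / reference


def should_alert(current: float, reference: float,
                 low: float = -0.5, high: float = 0.5) -> bool:
    """True when the change rate falls outside the preset range [low, high]."""
    rate = change_rate(current, reference)
    return not (low <= rate <= high)


# Order quantity 100 in the current window, compared with two hypothetical
# values for the same window yesterday.
alert_after_drop = should_alert(100, 250)    # rate is -0.6, outside the range
alert_small_move = should_alert(100, 90)     # rate is about +0.11, inside
```

A zero reference value is rejected explicitly because the rate is undefined in that case; how a real deployment should treat a window with no historical orders is a policy decision the source leaves open.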

In an optional embodiment, after step S109, the method may further include:

S111: sending the alarm prompt information to the message middleware to inform the user through the message middleware.

Specifically, as shown in fig. 3, the message middleware is Kafka, and when generating the alarm prompt message, the message middleware may generate a corresponding alarm level, and Kafka may send the alarm prompt message to different terminals or different terminal applications bound by the user according to the priority of the alarm level. For example, when the alarm level priority is higher, the terminal user is notified in a telephone form, and when the alarm level priority is lower, the terminal user is notified in a short message notification or application message push manner.
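
The level-based notification described above can be expressed as a simple lookup. The level names `high` and `low`, the channel identifiers, and the function `channels_for` are hypothetical; the source states only that higher-priority alarms are delivered by telephone and lower-priority ones by short message or application push.

```python
from typing import Dict, List

# Alarm level -> notification channels (names are illustrative assumptions).
CHANNELS_BY_LEVEL: Dict[str, List[str]] = {
    "high": ["phone_call"],
    "low": ["sms", "app_push"],
}


def channels_for(level: str) -> List[str]:
    """Return the channels through which an alarm of this level is delivered."""
    try:
        return CHANNELS_BY_LEVEL[level]
    except KeyError:
        raise ValueError(f"unknown alarm level: {level!r}") from None


high_channels = channels_for("high")
low_channels = channels_for("low")
```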

In an optional embodiment, after step S109, the method may further include:

S113: configuring statistical dimensions and statistical fields for the alarm data and/or statistical data, customized through the corresponding console service, and storing the resulting statistical data in a database; in this way, the amount of data that needs to be stored can be greatly reduced;

S115: configuring, through Grafana, a dashboard that displays continuous curve graphs of the data in the database.

Specifically, Grafana provides visual monitoring and analysis capabilities. Index curves can be configured in Grafana as required, so that index trends can be observed.

According to the data processing method provided by the embodiments of the application, Kafka can effectively shave peaks and fill valleys, which prevents the program from crashing because it cannot keep up when traffic surges; the back-pressure mechanism of Flink can control the inflow of data, which likewise avoids crashes caused by data volumes too large to process. In addition, an alarm can be given when an abnormal index appears, and because data is aggregated before being stored, the amount of stored data is reduced, so that the utilization of machine resources can be reduced.

An embodiment of the present application further provides a data processing apparatus, and fig. 8 is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present application, and as shown in fig. 8, the apparatus includes:

a first obtaining module 801, configured to obtain a data set to be analyzed in a preset format;

a second obtaining module 802, configured to obtain a rule set; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules;

a first determining module 803, configured to match the data set to be analyzed with the rule set according to the mapping relationship, so as to obtain a matching data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed;

a second determining module 804, configured to aggregate the matching data sets based on the type of the rule corresponding to each piece of data to be analyzed, so as to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type;

and a generating module 805 configured to generate an alarm prompt message based on the alarm data.
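The cooperation of the determining modules and the generating module may be sketched as follows, under stated assumptions: records are plain dictionaries with an `id` and a `value`, aggregation is a simple sum per rule, and each rule carries a `threshold`. None of these details is specified in the embodiments; they are introduced only to make the example runnable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    rule_id: str
    rule_type: str     # "alarm" or "statistical"
    threshold: float   # hypothetical field, not part of the described rule set


def match(records, rules, mapping):
    """First determining module: pair each record with its mapped rule."""
    matched = []
    for record in records:
        rule_id = mapping.get(record["id"])
        if rule_id in rules:
            matched.append((record, rules[rule_id]))
    return matched


def aggregate_alarms(matched):
    """Second determining module: sum the values of alarm-type records per rule."""
    totals = {}
    for record, rule in matched:
        if rule.rule_type == "alarm":
            totals[rule.rule_id] = totals.get(rule.rule_id, 0) + record["value"]
    return totals


def generate_prompts(alarm_data, rules):
    """Generating module: emit one prompt per rule whose total exceeds its threshold."""
    return [
        f"rule {rule_id}: aggregated value {total} exceeds {rules[rule_id].threshold}"
        for rule_id, total in alarm_data.items()
        if total > rules[rule_id].threshold
    ]
```

Records whose identifier has no entry in the mapping relation are simply left out of the matching data set in this sketch.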

In an optional implementation manner, the first obtaining module 801 is specifically configured to: acquiring log data corresponding to the index type from the message middleware; and analyzing the log data to obtain the data to be analyzed in a preset format.
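A minimal sketch of the analysis step is given below. It assumes, purely for illustration, that each log entry is a JSON object and that the preset format consists of the keys `app_id`, `data_id` and `value`; the embodiments do not fix either choice, and consuming the log data from the message middleware is not shown.

```python
import json

# Assumed keys of the preset format (hypothetical).
REQUIRED_KEYS = ("app_id", "data_id", "value")


def parse_log_line(line):
    """Convert one raw log line into the preset format, or return None
    when the line is malformed or incomplete."""
    try:
        raw = json.loads(line)
    except json.JSONDecodeError:
        return None
    if not isinstance(raw, dict) or any(k not in raw for k in REQUIRED_KEYS):
        return None
    try:
        value = float(raw["value"])
    except (TypeError, ValueError):
        return None
    return {
        "app_id": str(raw["app_id"]),
        "data_id": str(raw["data_id"]),
        "value": value,
    }
```

Lines that cannot be parsed are dropped rather than raising an error, so that a single malformed log entry does not interrupt the processing of the stream.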

In an optional implementation manner, the second obtaining module 802 is specifically configured to: acquiring broadcast information from a preset storage area; the broadcast information includes a set of rules; the preset storage area comprises a MySQL database or an Elasticsearch database.

In an optional embodiment, the identifier of each piece of data to be analyzed includes an application identifier and a data identifier; the first determining module 803 is specifically configured to: for each data to be analyzed in the data set to be analyzed: determining the identifier of the rule corresponding to each data to be analyzed from the mapping relation according to the application identifier and the data identifier of each data to be analyzed;

In an optional implementation manner, the second determining module 804 is specifically configured to:

In an optional implementation, the generating module 805 is specifically configured to: acquiring historical reference data; determining a change rate according to the alarm data and the historical reference data; and if the change rate does not meet the preset range, generating corresponding alarm prompt information.
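The change-rate check may be illustrated as follows. The relative-change formula, the default range of plus or minus 20 percent and the function names are assumptions made for this sketch; the embodiments do not specify how the change rate or the preset range is defined.

```python
def change_rate(current, reference):
    """Relative change of the aggregated alarm value against the reference."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (current - reference) / reference


def check_alarm(current, reference, low=-0.2, high=0.2):
    """Return an alarm prompt when the change rate falls outside [low, high],
    otherwise return None."""
    rate = change_rate(current, reference)
    if low <= rate <= high:
        return None
    return f"change rate {rate:+.1%} outside [{low:+.0%}, {high:+.0%}]"
```

For example, a current value of 150 against a historical reference of 100 gives a change rate of 50 percent, which lies outside the assumed range and therefore produces a prompt.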

In an optional implementation manner, the second determining module 804 is further specifically configured to: for each data to be analyzed: if the type of the rule corresponding to the data to be analyzed is a statistical type, determining the data to be analyzed as the data to be counted to obtain a data set to be counted; carrying out aggregation processing on the data set to be counted; and sending the aggregated data to be counted to a preset storage area for storage.
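The handling of statistical-type data may be sketched as follows. The type labels `"alarm"` and `"statistical"`, the record layout and the use of a simple count as the aggregation are assumptions; writing the result to the preset storage area is not shown.

```python
from collections import Counter


def split_by_rule_type(matched):
    """Split (record, rule_type) pairs into alarm data to be judged and
    data to be counted."""
    to_judge, to_count = [], []
    for record, rule_type in matched:
        if rule_type == "alarm":
            to_judge.append(record)
        elif rule_type == "statistical":
            to_count.append(record)
    return to_judge, to_count


def count_by_data_id(records):
    """Aggregate the data to be counted: number of records per data identifier."""
    return dict(Counter(record["data_id"] for record in records))
```

Only the aggregated counts, rather than the individual records, would then be sent to storage.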

The device and method embodiments in the embodiments of the present application are based on the same application concept.

The method provided by the embodiment of the application can be executed in a computer terminal, a server or a similar computing device. Taking the example of the data processing method running on a server, fig. 9 is a hardware structure block diagram of a server for the data processing method provided in the embodiment of the present application. As shown in fig. 9, the server 900 may vary considerably depending on its configuration or performance, and may include one or more Central Processing Units (CPUs) 910 (the processor 910 may include but is not limited to a processing device such as a microcontroller unit (MCU) or a field-programmable gate array (FPGA)), a memory 930 for storing data, and one or more storage media 920 (e.g., one or more mass storage devices) for storing applications 923 or data 922. The memory 930 and the storage medium 920 may be transient or persistent storage. The program stored in the storage medium 920 may include one or more modules, each of which may include a series of instruction operations for the server. Still further, the central processor 910 may be configured to communicate with the storage medium 920 and to execute, on the server 900, the series of instruction operations in the storage medium 920. The server 900 may also include one or more power supplies 960, one or more wired or wireless network interfaces 950, one or more input-output interfaces 940, and/or one or more operating systems 921, such as Windows, Mac OS, Unix, Linux, FreeBSD, etc.

The input/output interface 940 may be used to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the server 900. In one example, the input/output interface 940 includes a Network Interface Controller (NIC) that can be connected to other network devices through a base station to communicate with the internet. In one example, the input/output interface 940 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.

It will be understood by those skilled in the art that the structure shown in fig. 9 is only an illustration and is not intended to limit the structure of the electronic device. For example, server 900 may also include more or fewer components than shown in FIG. 9, or have a different configuration than shown in FIG. 9.

Embodiments of the present application further provide a storage medium, which may be disposed in a server to store at least one instruction, at least one program, a code set, or a set of instructions related to implementing a data processing method in the method embodiments, where the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by the processor to implement the data processing method.

Optionally, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media capable of storing program code.

As can be seen from the embodiments of the data processing method, the data processing apparatus, the data processing device, and the storage medium provided in the present application, a data set to be analyzed in a preset format is obtained in the present application; acquiring a rule set; the rule set comprises a plurality of rules and the identification and type of each rule in the plurality of rules; matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each data to be analyzed and the type of the rule corresponding to each data to be analyzed; performing aggregation processing on the matched data set based on the type of the rule corresponding to each data to be analyzed to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type; and generating alarm prompt information based on the alarm data. Through the steps, the log data are reasonably analyzed and processed, and the effectiveness of the log data processing process can be improved.

It should be noted that the sequence of the embodiments of the present application is only for description and does not represent the relative merits of the embodiments. Specific embodiments of the present specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.

The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims

1. A data processing method, comprising:

acquiring a data set to be analyzed in a preset format;

acquiring a rule set; the rule set comprises a plurality of rules and an identification and a type of each rule in the plurality of rules;

matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each piece of data to be analyzed and the type of a rule corresponding to each piece of data to be analyzed;

based on the type of the rule corresponding to each data to be analyzed, carrying out aggregation processing on the matched data set to obtain alarm data; the type of the rule corresponding to each data to be analyzed in the alarm data is an alarm type;

and generating alarm prompt information based on the alarm data.

2. The method according to claim 1, wherein the obtaining of the data to be analyzed in the preset format comprises:

acquiring log data corresponding to the index type from the message middleware;

and analyzing the log data to obtain the data to be analyzed in the preset format.

3. The method of claim 1, wherein obtaining the set of rules comprises:

acquiring broadcast information from a preset storage area; the broadcast information includes the set of rules;

wherein the preset storage area comprises a MySQL database or an Elasticsearch database.

4. The method of claim 1, wherein the identification of each piece of data to be analyzed comprises an application identification and a data identification;

the matching the data set to be analyzed with the rule set according to the mapping relationship to obtain a matched data set, including:

and acquiring a rule corresponding to each piece of data to be analyzed from the rule set according to the identifier of the rule corresponding to each piece of data to be analyzed, and assembling the rule corresponding to each piece of data to be analyzed and each piece of data to be analyzed to obtain the matched data set.

5. The method according to claim 1, wherein the aggregating the matching data set based on the type of the rule corresponding to each piece of data to be analyzed to obtain the alarm data comprises:

for each of the data to be analyzed: if the type of the rule corresponding to the data to be analyzed is an alarm type, determining the data to be analyzed as alarm data to be judged to obtain an alarm data set to be judged;

for each alarm data subset to be judged: performing second aggregation processing to obtain the alarm data corresponding to the alarm data subset to be judged.

6. The method of claim 5, wherein generating alert prompt information based on the alert data comprises:

acquiring historical reference data;

7. The method according to claim 1, wherein the aggregating the matching data sets based on the type of the rule corresponding to each piece of data to be analyzed further comprises:

for each of the data to be analyzed: if the type of the rule corresponding to the data to be analyzed is a statistical type, determining the data to be analyzed as data to be counted, to obtain a data set to be counted;

performing aggregation processing on the data set to be counted;

8. A data processing apparatus, comprising:

the second acquisition module is used for acquiring a rule set; the rule set comprises a plurality of rules and an identification and a type of each rule in the plurality of rules;

the first determining module is used for matching the data set to be analyzed with the rule set according to the mapping relation to obtain a matched data set; the mapping relation comprises an identifier of each piece of data to be analyzed in the data set to be analyzed and an identifier of a rule corresponding to the identifier of each piece of data to be analyzed; the matching data set comprises each piece of data to be analyzed and the type of a rule corresponding to each piece of data to be analyzed;

9. An apparatus comprising a processor and a memory, wherein at least one instruction or at least one program is stored in the memory, and wherein the at least one instruction or the at least one program is loaded by the processor and executes the data processing method according to any one of claims 1 to 7.

10. A computer storage medium, in which at least one instruction or at least one program is stored, which is loaded and executed by a processor to implement the data processing method according to any one of claims 1 to 7.