CN116089454B - Dynamic log analysis method and system - Google Patents

Dynamic log analysis method and system

Info

Publication number
CN116089454B
CN116089454B CN202211667170.6A
Authority
CN
China
Prior art keywords
log
data
log data
analysis
script
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211667170.6A
Other languages
Chinese (zh)
Other versions
CN116089454A (en)
Inventor
林萍萍
王西光
张娇
章云鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory
Boshang Shandong Network Technology Co ltd
Original Assignee
Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory
Boshang Shandong Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory, Boshang Shandong Network Technology Co ltd filed Critical Shandong Future Network Research Institute Industrial Internet Innovation Application Base Of Zijinshan Laboratory
Priority to CN202211667170.6A priority Critical patent/CN116089454B/en
Publication of CN116089454A publication Critical patent/CN116089454A/en
Application granted granted Critical
Publication of CN116089454B publication Critical patent/CN116089454B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2452Query translation
    • G06F16/24528Standardisation; Simplification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462Approximate or statistical queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5022Mechanisms to release resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5077Logical partitioning of resources; Management or configuration of virtualized resources
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/45595Network integration; Enabling network access in virtual machine instances
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a dynamic log analysis method, which comprises the following steps: building a configuration center based on a K8S cluster, and defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center; receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource; invoking the data processing script according to the log data analysis requirement, and processing the log data resource to form standard log data; and invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result. The invention achieves a visual, flexible and dynamically updatable log analysis flow configuration.

Description

Dynamic log analysis method and system
Technical Field
The invention relates to the field of computers, and in particular to a dynamic log analysis method and system.
Background
Logs are ubiquitous in software and hardware systems and are a key source of big data. Logs come in many types, styles and sources, including client logs and server logs; server logs further include the running/operation-and-maintenance logs of a service and the logs generated by the products the service uses. To manage so many kinds of logs, a unified log system is needed to collect, process, store, query, analyze, visualize, alert on and deliver logs for consumption, closing the loop over the log life cycle. Log analysis is a key link in this loop, providing functions such as cleaning, filtering, distribution, desensitization and computation of log data. The continuous development of cloud computing further raises the requirements on a log system: in cloud-native scenarios the collected logs become more dynamic and complex, and the demands on the real-time performance, stability and reliability of the analysis system increase accordingly.
Currently, the mainstream log analysis tools are ELK and Loki. ELK (Elasticsearch, Logstash, Kibana) is built around a near-real-time distributed full-text search and analysis engine: every field is indexed and searchable, documents are highly available and easy to scale, and clusters, shards and replicas are supported. Combined with Kibana, it can visualize log data, including log retrieval and charts over various dimensions. Logstash is a powerful data processing tool that can format data and output it in a standardized format. ELK has a very large ecosystem with abundant tooling for installation, configuration and more. However, the ELK architecture is relatively complex, has many components, cannot use computing resources flexibly, is complicated to deploy and maintain, and carries a high learning cost.
Loki is a lightweight log collection and analysis solution developed in recent years. Like Prometheus, it uses labels as indexes, so log content can be queried through the same labels used to query monitoring data, which lowers the cost of switching between the two kinds of queries and greatly reduces the storage needed for log indexes. Compared with ELK it consumes fewer resources and is more cost-effective. Combined with Grafana, it supports log collection and visualization, including filtering logs and viewing the context lines around a match. However, its functionality is narrow: it performs well only for filtering and viewing logs, while its data processing and cleaning capability is far weaker than that of ELK, which can be combined with various technologies for big-data log processing; Loki has no capability for complex data processing. Furthermore, Loki does not store a full-text index, so all queries depend on labels; it also has no authentication function and must rely on a third-party system for user management, permission authentication and similar functions.
In short, existing log analysis suffers from a fixed architecture that is hard to scale in or out, and from analysis flows and flow parameters that cannot be changed dynamically: every modification of the log analysis flow or its parameters requires a service restart, which interrupts the analysis service and hurts the user experience. In addition, the architecture is complex, the learning cost is high, the utilization of computing resources is low, and analysis tasks cannot be controlled at fine granularity.
For the problems in the related art, no effective solution has been proposed at present.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides a dynamic log analysis method and system to solve the technical problems in the existing related art.
The technical scheme of the invention is realized as follows:
according to an aspect of the present invention, a dynamic log analysis method is provided.
The dynamic log analysis method comprises the following steps:
building a configuration center based on a K8S cluster, and defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center;
receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource;
invoking the data processing script according to the log data analysis requirement, and processing the log data resource to form standard log data;
and invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result.
Wherein the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script.
Invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises the following steps:
invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement;
invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement;
invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement;
invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement;
invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement;
invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions;
and invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams.
In addition, the dynamic log analysis method further comprises: performing statistics on the log data and the log analysis result to form a statistical result.
In addition, the dynamic log analysis method further comprises: storing the log data, the log analysis result and the statistical result in a structured manner, and configuring a connection mode, an authentication mode, a user name and a password.
According to another aspect of the present invention, a dynamic log analysis system is provided.
The dynamic log analysis system comprises:
the configuration center building module is used for building a configuration center based on a K8S cluster, and for defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center;
the request parsing module is used for receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource;
the data processing module is used for invoking the data processing script according to the log data analysis requirement, and for processing the log data resource to form standard log data;
and the log analysis module is used for invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result.
Wherein the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script.
Invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises the following steps:
invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement;
invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement;
invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement;
invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement;
invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement;
invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions;
and invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams.
In addition, the dynamic log analysis system further comprises: a statistics module for performing statistics on the log data and the log analysis result to form a statistical result.
In addition, the dynamic log analysis system further comprises: a storage module for storing the log data, the log analysis result and the statistical result in a structured manner, and for configuring a connection mode, an authentication mode, a user name and a password.
The beneficial effects are that:
according to the invention, the configuration center is deployed, and different data processing scripts are configured in the configuration center, so that a user can call different data processing scripts in a dragging mode to process data to form standard data, after the standard data is obtained, the scripts and resource files of a Flink analysis task can be directly called to analyze the data, a log analysis result is obtained, and further, the visualized, flexible and dynamically updatable log analysis flow configuration is realized; the invention is configured based on the K8S cluster, so that a user can apply the updated log analysis flow and analysis parameters to the log analysis task in real time or in a delayed manner, and the analysis task can be applied in real time without restarting; the analysis task based on K8S can flexibly apply for computing resources, and the resources are released after the task is canceled or ended, so that the utilization rate of the resources is improved; each analysis task has high isolation and does not affect each other.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method of dynamic log analysis according to an embodiment of the present invention;
FIG. 2 is a block diagram of a dynamic log analysis system according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the invention, fall within the scope of protection of the invention.
According to embodiments of the present invention, a dynamic log analysis method and system are provided.
As shown in fig. 1, the dynamic log analysis method according to an embodiment of the present invention comprises:
step S101, building a configuration center based on a K8S cluster, and defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center;
step S103, receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource;
step S105, invoking the data processing script according to the log data analysis requirement, and processing the log data resource to form standard log data;
and step S107, invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result.
In one embodiment, the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script.
In one embodiment, invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises: invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement; invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement; invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement; invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement; invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement; invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions; and invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams. A minimal sketch of the condition judgment and distribution steps follows.
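By way of illustration only, the following is a minimal sketch (assuming the Apache Flink DataStream API; the log format, tag name and condition are illustrative assumptions, not part of the claimed method) of how the log condition judgment and data distribution steps could be realized with Flink side outputs:
```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class ConditionSplitSketch {
    // Side-output tag for records failing the condition; the anonymous
    // subclass preserves the generic type information.
    static final OutputTag<String> NON_ERROR = new OutputTag<String>("non-error") {};

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> logs = env.fromElements(
                "ERROR db timeout", "INFO user login", "ERROR disk full");

        // Condition judgment: matching records stay in the main stream
        // (stream A); the rest go to the side output (stream B).
        SingleOutputStreamOperator<String> errors = logs.process(
                new ProcessFunction<String, String>() {
                    @Override
                    public void processElement(String line, Context ctx, Collector<String> out) {
                        if (line.startsWith("ERROR")) {
                            out.collect(line);           // stream A
                        } else {
                            ctx.output(NON_ERROR, line); // stream B
                        }
                    }
                });

        // Distribution: each branch gets its own downstream processing flow.
        errors.print("error-stream");
        errors.getSideOutput(NON_ERROR).print("other-stream");
        env.execute("condition-judgment-sketch");
    }
}
```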
In one embodiment, the dynamic log analysis method further comprises: performing statistics on the log data and the log analysis result to form a statistical result; and storing the log data, the log analysis result and the statistical result in a structured manner, and configuring a connection mode, an authentication mode, a user name and a password.
As shown in fig. 2, a dynamic log analysis system according to an embodiment of the present invention includes:
the central configuration module 201 is configured to build a configuration center based on the K8S cluster, and define a script and a resource file of a flank parsing task and a data processing script in the configuration center;
the log analysis module 203 is configured to receive a log data analysis request, analyze the log data analysis request, and determine a log data analysis requirement and a log data resource;
the data processing module 205 is configured to invoke the data processing script according to the log data analysis requirement, and process the log data resource to form standard log data;
and the log analysis module 207 is configured to call a script and a resource file of a link analysis task according to the log data analysis requirement, analyze the standard log data, form a log analysis result, and output the log analysis result.
In one embodiment, the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script.
In one embodiment, invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises: invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement; invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement; invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement; invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement; invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement; invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions; and invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams.
In one embodiment, the dynamic log analysis system further comprises: a statistics module (not shown in the figure) for performing statistics on the log data and the log analysis result to form a statistical result; and a storage module (not shown in the figure) for storing the log data, the log analysis result and the statistical result in a structured manner, and for configuring a connection mode, an authentication mode, a user name and a password.
In order to facilitate understanding of the above technical solutions of the present invention, they are described in detail below through a specific application.
The core of the invention can be described in several aspects, specifically including: visual analysis flow configuration and analysis parameter configuration; a highly available configuration center; the Flink analysis flow forming a processing tree; dynamic configuration update; delayed configuration update; and dynamic application for computing resources by analysis tasks.
For visual analysis flow configuration and analysis parameter configuration, the invention provides an easy-to-use, visual log analysis flow configuration: the user forms a complete log processing flow by dragging processing modules at the front end, and the output can be sent to a designated target end according to the configuration. The system provides a number of general-purpose processing function modules, such as a log filtering module, a distribution module, a segmentation module and an aggregation statistics module, and users can also customize processing modules as needed. The general processing modules mainly comprise:
the data source module refers to the source of log data processed by the log analysis system, at present, the data source module defaults to Kafka, a user needs to drag and drop the data source module to an operation area, and configures the address, the authentication mode, the partition number and the like of a target Kafka, and the user can also customize the expected data source module and realize related logic.
The log filtering module implements a simple filtering function, supporting contains, does-not-contain, starts-with, ends-with, regular-expression matching and the like. Only log data meeting the corresponding condition flows into the next processing step; data that does not can be discarded or routed into other processing modules.
The field processing module mainly modifies log fields, including adding, modifying, removing and data type conversion. With it the user can add any field as needed, processing log data flexibly.
The data type conversion module converts the data type of a given field in log data and is generally used to format data. Its main functions are: converting a string to int/float/long and converting int/float/long to a string; int/float/long inter-conversion; keeping n decimal places for a float; and so on.
The date conversion module converts date fields. Its main functions are: converting a timestamp into a standard date string; converting a standard date string into a timestamp; time zone conversion; adding an offset to a time; and so on.
The log segmentation module splits logs according to user-defined separator characters; the name of each resulting segment is customized by the user, and after splitting the user can configure whether to delete the original field. Besides the single-character splitter, the invention also provides a sequential-character splitter, a multi-character splitter and the like.
The condition judgment module is the core module of the distribution function. It performs condition judgment on a given field: data meeting the condition enters data stream A, and data not meeting it enters data stream B. The most typical scenario is splitting log data by log level, so that log streams of different levels are processed separately.
The distribution module splits the stream according to the execution result after the condition judgment module finishes its judgment; a corresponding processing flow is then configured for each branch.
The window statistics module implements the statistics function. The user can configure two window types, a rolling window and a sliding window, with the window size, slide interval and so on defined by the user; the statistical data forms a new data stream for which a corresponding processing flow can be configured.
The output module provides output target ends such as Elasticsearch, Kafka, HBase and MySQL by default; the user configures the corresponding persistence mode as needed, and connection data such as the address and authentication information must be configured in the corresponding processing module. A minimal sketch of several of these modules chained together follows.
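The sketch below (illustrative only; it assumes the Flink DataStream API, and the sample log lines and added field are assumptions) chains a log filtering step and a field processing step as Flink operators, the way the drag-and-drop modules above would compose:
```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ModuleChainSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> source = env.fromElements(        // stand-in data source module
                "2023-01-01 ERROR timeout", "2023-01-01 DEBUG heartbeat");

        DataStream<String> out = source
                // log filtering module: "contains" filter type
                .filter(line -> line.contains("ERROR"))
                // field processing module: add a constant field
                .map(line -> line + " cluster=prod");

        out.print();                                         // stand-in output module
        env.execute("module-chain-sketch");
    }
}
```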
For the highly available configuration center: dynamic update of the log analysis flow and the log analysis parameters is the core content of the invention. After the processing flow is configured at the front end, standard JSON data is generated; the analysis service constructs a processing tree for the log flow from this JSON data, and the parameters of each processing module are likewise configured as corresponding JSON. The JSON data is persisted to the configuration center of the current tenant, and the configuration center drives the update of the analysis flow and analysis parameters inside the analysis service. The invention adopts Nacos, deployed for high availability, as the configuration center, with JSON as the storage format. A sketch of publishing such a configuration follows.
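The following minimal sketch (assuming the official Nacos Java client; the server addresses and the JSON flow schema are illustrative, while the dataId parameter_process matches step 5 of the operation steps below) shows how such a flow configuration could be persisted to the configuration center:
```java
import java.util.Properties;
import com.alibaba.nacos.api.NacosFactory;
import com.alibaba.nacos.api.config.ConfigService;

public class FlowPublishSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Highly available Nacos cluster (addresses illustrative).
        props.put("serverAddr", "nacos-a:8848,nacos-b:8848,nacos-c:8848");
        ConfigService configService = NacosFactory.createConfigService(props);

        // Processing-flow JSON as produced by the front end (schema illustrative).
        String flowJson = "{\"type\":\"source\",\"params\":{\"topic\":\"logs\"},"
                + "\"children\":[{\"type\":\"filter\",\"params\":{\"contains\":\"ERROR\"}}]}";

        // dataId fixed as parameter_process (see step 5 below); group per tenant.
        boolean ok = configService.publishConfig("parameter_process", "DEFAULT_GROUP", flowJson);
        System.out.println("flow config published: " + ok);
    }
}
```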
For the processing tree formed from the Flink analysis flow: a number of processing modules are predefined in Flink, corresponding to the processing modules at the front end. When an analysis task starts, it pulls the JSON-format processing flow data from the configuration center and parses it recursively to form a complete processing tree, as sketched below.
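A minimal sketch of the recursive construction (assuming Jackson for JSON parsing; the node schema with type, params and children fields is an illustrative assumption):
```java
import java.util.ArrayList;
import java.util.List;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class ParseTreeSketch {
    // One node of the processing tree; the JSON schema is an assumption.
    static class ProcessNode {
        final String type;                                   // e.g. "source", "filter"
        final JsonNode params;                               // module parameters
        final List<ProcessNode> children = new ArrayList<>();
        ProcessNode(String type, JsonNode params) { this.type = type; this.params = params; }
    }

    // Recursively turn the configuration-center JSON into a processing tree.
    static ProcessNode build(JsonNode json) {
        ProcessNode node = new ProcessNode(json.get("type").asText(), json.get("params"));
        JsonNode children = json.get("children");
        if (children != null) {
            for (JsonNode child : children) {
                node.children.add(build(child));
            }
        }
        return node;
    }

    public static void main(String[] args) throws Exception {
        String flowJson = "{\"type\":\"source\",\"params\":{\"topic\":\"logs\"},"
                + "\"children\":[{\"type\":\"filter\",\"params\":{\"contains\":\"ERROR\"}}]}";
        ProcessNode root = build(new ObjectMapper().readTree(flowJson));
        System.out.println(root.type + " -> " + root.children.get(0).type);
    }
}
```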
For dynamic configuration update: the log analysis service keeps a long connection with the configuration center, so the analysis service can perceive configuration changes, which fall into two main types. When the analysis flow changes, the analysis task pulls the new configuration and rebuilds the parse tree; when analysis parameters change, the change takes effect immediately. A listener sketch follows.
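A minimal listener sketch (assuming the Nacos Java client; the addresses and the reaction logic are illustrative) of how the analysis service could perceive changes over the held long connection:
```java
import java.util.Properties;
import java.util.concurrent.Executor;
import com.alibaba.nacos.api.NacosFactory;
import com.alibaba.nacos.api.config.ConfigService;
import com.alibaba.nacos.api.config.listener.Listener;

public class FlowWatchSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("serverAddr", "nacos-a:8848");
        ConfigService configService = NacosFactory.createConfigService(props);

        // Long-lived listener: Nacos pushes changes over the held connection.
        configService.addListener("parameter_process", "DEFAULT_GROUP", new Listener() {
            @Override
            public Executor getExecutor() { return null; }   // use the client's own thread

            @Override
            public void receiveConfigInfo(String configInfo) {
                // Flow changed: rebuild the parse tree (see the recursive
                // construction sketch above); a pure parameter change could
                // instead be applied in place, taking effect immediately.
                System.out.println("new flow config received: " + configInfo);
            }
        });
        Thread.sleep(Long.MAX_VALUE);                        // keep the sketch alive
    }
}
```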
For delayed configuration update: analysis tasks perceive updates pushed to the configuration center in real time, so the invention provides a delayed-task function at the back end, letting the user choose to have a modification take effect with a delay. When delayed effect is configured, the system pushes the update to the configuration center once the designated time arrives, thereby completing the update of the processing logic. A sketch follows.
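A minimal sketch of the delayed push (assuming a plain JDK scheduler together with the Nacos client; the delay value and the configuration content are placeholders):
```java
import java.util.Properties;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import com.alibaba.nacos.api.NacosFactory;
import com.alibaba.nacos.api.config.ConfigService;

public class DelayedUpdateSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("serverAddr", "nacos-a:8848");
        ConfigService configService = NacosFactory.createConfigService(props);

        String newFlowJson = "{\"type\":\"source\",\"children\":[]}"; // placeholder flow
        long delaySeconds = 3600;                            // "take effect in one hour"

        // Hold the change at the back end; push it to the configuration
        // center only when the designated time arrives, so listening
        // analysis tasks apply it at that moment.
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.schedule(() -> {
            try {
                configService.publishConfig("parameter_process", "DEFAULT_GROUP", newFlowJson);
            } catch (Exception e) {
                e.printStackTrace();                         // retry/alert in a real service
            }
        }, delaySeconds, TimeUnit.SECONDS);
    }
}
```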
For dynamic application of computing resources by analysis tasks: the invention is based on the native K8S mode of Flink on K8S, which amounts to each task monopolizing its own Flink cluster. When a task starts, container resources are applied for according to the parallelism the user configured at the front end, and computing slots are obtained; when configuration such as the parallelism is modified, resources are re-applied for. When a task ends or is canceled, its containers are automatically deleted and the computing resources are released. A configuration sketch follows.
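A minimal sketch of the per-task configuration (all values are illustrative; the keys are Flink's documented native-Kubernetes options, and actual submission would typically go through Flink's run-application client in kubernetes-application mode):
```java
import org.apache.flink.configuration.Configuration;

public class NativeK8sConfigSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setString("kubernetes.cluster-id", "log-parse-task-42"); // one Flink cluster per task
        conf.setString("kubernetes.namespace", "log-analysis");
        conf.setString("kubernetes.service-account", "flink-sa");     // may create/delete pods
        conf.setString("taskmanager.numberOfTaskSlots", "2");         // computing slots per TM
        conf.setString("parallelism.default", "4");                   // user-configured parallelism
        // Resources are applied for at start, re-applied when parallelism
        // changes, and released (pods deleted) when the task ends or is canceled.
        System.out.println(conf);
    }
}
```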
In concrete operation, the system can be configured and operated according to the following steps:
1. Prepare at least 3 hosts and build a highly available Nacos configuration center; set env and groupId;
2. Prepare a K8S cluster; define the namespace, serviceaccount and rolebinding, and grant the serviceaccount the authority to create and delete pods; define the script and resource files for starting the Flink analysis task;
3. Define the processing modules:
the definition analysis rule data source module can be multiple, for example, the kafka module needs information such as kafka connection address, authentication mode, user name and password and the like, and finally a json file can be generated. The data source module must be the topmost module.
a. Define the filtering module. Its filter type options are: contains; does not contain (non-contains); starts with a string; ends with a string; regular matching (reg). Only log data meeting the condition enters the next processing module.
b. Define the field processing module for modifying log fields, including: add, specifying the added field name and value; change, specifying the target field name and the new value; reject (delete), specifying the field name to delete; data type conversion (transform), specifying the target field name and target data type for data formatting; and date conversion, which converts a timestamp into a standard date string, converts a standard date string into a timestamp, performs time zone conversion, and adds an offset to a time.
c. Define the log segmentation module. The main segmentation functions are: single-character segmentation, specifying the separator and the name of each segment after splitting; multi-character segmentation, specifying several separators and the name of each segment after splitting; sequential-character segmentation, specifying separators in order and the name of each segment after splitting.
d. Define the condition judgment module and specify the conditional expression: data meeting the condition enters data stream A, and data not meeting it enters data stream B. The most typical scenario is splitting log data by log level so that log streams of different levels are processed separately;
e. Define the aggregation module to implement the statistics function. In the aggregation module, a statistical result is computed according to a specified rule, for example counting the number of error logs over 5 min, computed once every 1 min, i.e. a sliding window with a window size of 5 min and a slide of 1 min, with the statistical result output to a designated target end (see the sliding-window sketch after these steps). Configurable options are: window type, including rolling window and sliding window; window size and unit, in seconds, minutes or hours; and the slide size of a sliding window;
f. Define the output persistence module and the final log storage mode, such as Elasticsearch, Kafka, HBase or MySQL; similar to the data source module, a connection address, authentication information and the like must be configured;
4. On the front-end console, drag and drop the processing modules defined in step 3 to assemble a complete analysis flow, then submit it to generate the JSON configuration; each module can define its own processing concurrency;
5. Check and test the analysis flow assembled in step 4; if the check fails, prompt for modification, and if it passes, save the flow into the configuration center, with the dataId fixed as parameter_process;
6. After the analysis flow is configured, start the Flink analysis task, pull the JSON analysis flow configuration saved into the configuration center in step 5, and recursively construct the log stream parse tree;
7. After the log parse tree is constructed in the previous step, apply to the K8S cluster for container resources, and start the Pod of the analysis task once the application succeeds;
8. The started Flink processing task keeps a long connection with the configuration center; if the processing flow is modified, the Flink task perceives this and pulls the new configuration for real-time application. Each time the processing flow is modified, the user can specify whether it takes effect with a delay: if not, the modification takes effect in real time; if delayed effect is chosen, an effective time must be designated, and when that time arrives the configuration is pushed to the configuration center. If the size or type of a window is modified in the processing flow, Flink regenerates the parse tree; the whole change is applied dynamically without restarting the task.
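The sliding-window sketch referenced in step 3e (assuming the Flink DataStream API; the socket source stands in for the configured Kafka source), counting ERROR logs over the last 5 minutes, recomputed every minute:
```java
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.SlidingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WindowStatsSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStream<String> logs = env.socketTextStream("localhost", 9999); // stand-in source

        logs.filter(line -> line.contains("ERROR"))          // keep error logs only
            .map(line -> 1L).returns(Types.LONG)             // one count per log line
            // window size 5 min, slide 1 min: last-5-minutes count, every minute
            .windowAll(SlidingProcessingTimeWindows.of(Time.minutes(5), Time.minutes(1)))
            .reduce(Long::sum)
            .print("error-count-5min");                      // stand-in output target

        env.execute("window-stats-sketch");
    }
}
```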
According to the above scheme, by deploying a configuration center and configuring different data processing scripts in it, a user can invoke different data processing scripts by drag-and-drop to process data into standard data; once the standard data is obtained, the script and resource files of the Flink analysis task can be invoked directly to analyze the data and obtain a log analysis result, achieving a visual, flexible and dynamically updatable log analysis flow configuration. Because the invention is configured on a K8S cluster, a user can apply updated log analysis flows and analysis parameters to a log analysis task in real time or with a delay, without restarting the analysis task. Analysis tasks based on K8S can apply for computing resources flexibly, and the resources are released when a task is canceled or ends, improving resource utilization; each analysis task is strongly isolated, and tasks do not affect each other.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the invention.

Claims (2)

1. A method of dynamic log analysis, comprising:
building a configuration center based on a K8S cluster, and defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center;
receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource;
invoking the data processing script according to the log data analysis requirement, and processing the log data resource to form standard log data;
invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result;
wherein the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script;
and wherein invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises the following steps:
invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement;
invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement;
invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement;
invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement;
invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement;
invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions;
invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams;
performing statistics on the log data and the log analysis result to form a statistical result;
and storing the log data, the log analysis result and the statistical result in a structured manner, and configuring a connection mode, an authentication mode, a user name and a password.
2. A dynamic log analysis system, comprising:
the configuration center building module, used for building a configuration center based on a K8S cluster, and for defining the script and resource files of a Flink analysis task, as well as data processing scripts, in the configuration center;
the request parsing module, used for receiving a log data analysis request, parsing the log data analysis request, and determining a log data analysis requirement and a log data resource;
the data processing module, used for invoking the data processing script according to the log data analysis requirement, and for processing the log data resource to form standard log data;
the log analysis module, used for invoking the script and resource files of the Flink analysis task according to the log data analysis requirement, analyzing the standard log data to form a log analysis result, and outputting the log analysis result;
wherein the data processing script comprises: a data filtering script, a field processing script, a data type conversion script, a date conversion script, a log segmentation script, a log condition judgment script and a data distribution script;
and wherein invoking the data processing script according to the log data analysis requirement and processing the log data resource to form standard log data comprises the following steps:
invoking the data filtering script according to the log data analysis requirement to filter out data in the log data that does not meet the log data analysis requirement;
invoking the field processing script according to the log data analysis requirement to modify data fields in the log data so that the log data meets the log data analysis requirement;
invoking the data type conversion script according to the log data analysis requirement to convert data types in the log data so that the log data meets the log data analysis requirement;
invoking the date conversion script according to the log data analysis requirement to convert dates in the log data so that the log data meets the log data analysis requirement;
invoking the log segmentation script according to the log data analysis requirement to perform character segmentation on data in the log data, forming log data corresponding to the log data analysis requirement;
invoking the log condition judgment script according to the log data analysis requirement to perform condition judgment on data in the log data, forming log data under different data conditions;
invoking the data distribution script according to the log data analysis requirement to distribute the log data formed under different data conditions into different data streams;
the statistics module, used for performing statistics on the log data and the log analysis result to form a statistical result;
and the storage module, used for storing the log data, the log analysis result and the statistical result in a structured manner, and for configuring a connection mode, an authentication mode, a user name and a password.
CN202211667170.6A 2022-12-23 2022-12-23 Dynamic log analysis method and system Active CN116089454B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211667170.6A CN116089454B (en) 2022-12-23 2022-12-23 Dynamic log analysis method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211667170.6A CN116089454B (en) 2022-12-23 2022-12-23 Dynamic log analysis method and system

Publications (2)

Publication Number Publication Date
CN116089454A CN116089454A (en) 2023-05-09
CN116089454B true CN116089454B (en) 2023-09-19

Family

ID=86209555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211667170.6A Active CN116089454B (en) 2022-12-23 2022-12-23 Dynamic log analysis method and system

Country Status (1)

Country Link
CN (1) CN116089454B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190108112A1 (en) * 2017-10-05 2019-04-11 Hcl Technologies Limited System and method for generating a log analysis report from a set of data sources
CN112527459A (en) * 2020-12-16 2021-03-19 新浪网技术(中国)有限公司 Log analysis method and device based on Kubernetes cluster
CN113885970A (en) * 2021-09-15 2022-01-04 浪潮金融信息技术有限公司 Method, system and medium for generating report data based on script
CN114780529A (en) * 2022-04-26 2022-07-22 湖南快乐阳光互动娱乐传媒有限公司 Data processing method and device

Also Published As

Publication number Publication date
CN116089454A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN108519914B (en) Big data calculation method and system and computer equipment
CN107506451B (en) Abnormal information monitoring method and device for data interaction
CN110245078A (en) A kind of method for testing pressure of software, device, storage medium and server
CN110647387B (en) Education cloud big data task scheduling method and system
CN104268428A (en) Visual configuration method for index calculation
CN111221831B (en) Computing system for processing advertisement effect data in real time
CN112148578A (en) IT fault defect prediction method based on machine learning
CN110851234A (en) Log processing method and device based on docker container
CN114218218A (en) Data processing method, device and equipment based on data warehouse and storage medium
CN114036183A (en) Data ETL processing method, device, equipment and medium
CN114567633A (en) Cloud platform system supporting full life cycle of multi-stack database and management method
CN117633116A (en) Data synchronization method, device, electronic equipment and storage medium
CN116578585B (en) Data query method, device, electronic equipment and storage medium
CN116089454B (en) Dynamic log analysis method and system
EP3776214A1 (en) User interface optimization for delayed data
CN114510531A (en) Database synchronization method and device, electronic equipment and storage medium
CN114756301B (en) Log processing method, device and system
CN115658635A (en) Log analysis method and device
CN115510139A (en) Data query method and device
US8631391B2 (en) Method and a system for process discovery
Deng et al. Flight test data processing and analysis platform based on new generation information technology Design and Application
US10909079B1 (en) Data-driven reduction of log message data
CN113127549B (en) Incremental data synchronization method, device, computer equipment and storage medium
CN116644039B (en) Automatic acquisition and analysis method for online capacity operation log based on big data
CN117493327A (en) Data quality detection method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant