CN117971606A - Log management system and method based on Elasticsearch

Info

Publication number: CN117971606A
Application number: CN202410372592.3A
Authority: CN
Inventors: 王冠; 陈潜; 汪宋; 李星; 申奥
Original assignee: Yiqiyin Hangzhou Technology Co ltd; China Zheshang Bank Co Ltd
Current assignee: Yiqiyin Hangzhou Technology Co ltd; China Zheshang Bank Co Ltd
Priority date: 2024-03-29
Filing date: 2024-03-29
Publication date: 2024-05-03

Abstract

The invention discloses a log management system and method based on Elasticsearch. The system comprises two components, a data source end and a platform terminal, and eight modules in total: acquisition, filtering, storage, management, treatment, search, clustering and topology. Together they provide a full-life-cycle solution covering log acquisition, storage, management, search, cluster analysis, anomaly detection and visual treatment, and cover functional domains such as fine-grained permission control, abnormal-content analysis, rule-associated alarms and service link tracing. The system addresses the difficulty of storing massive, scattered, unstructured data, inefficient query methods, the lack of management means and the insufficient use of service-related information, which helps improve service perception, troubleshooting, handling and backtracking capabilities as well as the efficiency of daily operation and maintenance work.

Description

Log management system and method based on Elasticsearch

Technical Field

The invention belongs to the technical field of computer information, and in particular relates to a log management system and method based on Elasticsearch.

Background

In the financial technology (fintech) field, fine-grained business operation assurance places higher demands on how well logs support the operation and maintenance system. The traditional log management approach has the following problems: 1) massive log data is difficult to store centrally, and log content is unstructured and mostly scattered; 2) log queries are inefficient and involve a large number of manual operations; 3) secure and controllable means of managing log data are lacking, so permission control and risk problems are prominent; 4) service-related information is underused, so in-depth analysis of service operation cannot be performed on the basis of log content; 5) service indicators cannot be visualized, and alarm information cannot be pushed for abnormal events. To address these problems, a log management system based on Elasticsearch is proposed.

Elasticsearch is an open-source, highly scalable, distributed full-text search engine that stores and retrieves data in near real time. It builds its data indexes on the open-source Lucene library, hides the underlying complexity behind a RESTful API, which simplifies search requests, and exposes external service interfaces that applications can call and extend.

Disclosure of Invention

The invention aims to provide a log management system based on Elasticsearch that solves the problems of the traditional log management approach.

A log management system based on Elasticsearch comprises a data source end and a platform terminal;

the data source end is used for accessing scattered log data, filtering the log data according to preset conditions, and bringing the log data and state information under unified management through an Elasticsearch cluster;

the platform terminal is built on a microservice and B/S (browser/server) architecture; it manages log change operations in the form of approval flows; it reads and parses state information, message elements and monitoring data from the data source end and displays them; it is configured with a plurality of search models, from which different models are selected for log queries according to search requirements; after acquiring log data from the data source end, it applies a cluster analysis algorithm to perform format measurement, rule self-learning and result evaluation on logs with similar data structures or content; and it uses custom topology rules and streaming computation to detect abnormal log content.

Further, the data source end comprises the following modules:

Acquisition module: configured with a source end address, log path, log file name, maximum transmission throughput, performance safety allowance and character set parameters; once started, the acquisition module extracts log data to the filtering module according to the preset configuration parameters; the acquisition module runs a scheduled task in the background to monitor the state of the collected log files in real time, and pushes the updated part of the log content whenever the state information of a log file changes;

Filtering module: configured with a variety of conditions for filtering the log data pushed by the acquisition module, where the filtering conditions support syntax updates in a generalized grok style; by defining standard log content fields, it removes useless redundant fields, orders the log records chronologically, and merges and systematically integrates the log content;

Storage module: configured with different log index generation templates and life cycle policies according to log volume; log indexes with a daily average log volume below 10 GB are matched with a lightweight index template and a long life cycle policy, and log indexes with a daily average log volume above 10 GB are matched with a normal index template and a short life cycle policy;

Further, the platform terminal comprises a management module, a treatment module, a search module, a clustering module and a topology module, provides flow management, state treatment, log search, cluster analysis and anomaly detection functions, and supports direct user access to the log management system from both the production environment and the office environment.

Further, the management module approves flows through preset approval nodes, and each node has approve, supplement and return permissions; the flows comprise an index change flow, an index recovery flow, a user permission change flow, a department permission change flow, a temporary query permission flow and a temporary log desensitization removal flow.

Further, the index change flow comprises newly adding an index to a log management platform, and modifying and deleting registered index information or corresponding departments;

the index recovery flow restores the log content of a specific index from the snapshot backup file of a specific date in the storage module;

the user permission change flow changes the range of indexes that a specific user is permitted to view;

the department permission change flow changes the range of indexes that a specific department is permitted to view;

the temporary query permission flow is used by a user to apply for temporary permission to view certain log content;

the temporary log desensitization removal flow is used to apply for certain log content to be temporarily exempted from desensitization, so that production problems can be investigated using specific information.

Further, the treatment module specifically comprises service inquiry, interface dependence inquiry, link monitoring, index statistics, message summarizing, status overview and topic detail sub-modules;

the service inquiry sub-module is used for querying the microservices registered in the registry; it supports prefix matching in a drop-down box for selecting the service to query, and the query result includes the application, group name and version of the registered microservice together with information on the providers and consumers of the service;

the interface dependence inquiring sub-module is used for querying interface dependencies and displaying the result as a directed graph, in which the upstream and downstream call relationships of the queried interface can be viewed;

the link monitoring sub-module is used for monitoring call links: in a microservice system, one complete transaction involves calls to a plurality of microservices, and the resulting call link is identified by a globally unique link code;

the index statistics sub-module is used for collecting software and hardware metrics from the system and displaying them, querying transactions at different times, and reporting metric information to the topology module;

the message summarizing sub-module is organized by message instance and by topic; it displays the total number of historical messages consumed since the program started and the total numbers of historical messages produced and consumed per topic, and is used for viewing the message bodies of different message IDs by topic and time node;

the status overview sub-module is used for viewing cluster production and consumption rates, totals, cluster status, configuration details, subscription group names, counts, consumption rates, delays, broker data, topics and producer groups;

the topic detail sub-module is used for viewing topic lists, status, routes and consumer information.

Further, the search module is configured with four search models, namely a fuzzy search model, an accurate search model, a regular search model and an SQL-like search model;

the fuzzy search model is built on the Standard Analyzer component of Elasticsearch; it splits the search content into words and lowercases them, applies no special handling to common stop words, and is suitable for scenarios where the search target is not clearly defined;

the accurate search model is built on the Keyword Analyzer component of Elasticsearch; it uses the input directly as the search content for exact matching, and is suitable for scenarios where the search target is clearly defined;

the regular search model is built on the Pattern Analyzer component of Elasticsearch; it matches target content by regular expressions, supports user-defined special regular expression conditions and multiple regular expressions in effect simultaneously, and is suitable for search scenarios that can be expressed as regular expressions;

the SQL-like search model is built on the SQL component of Elasticsearch; it executes queries written as SQL statements, approximately mapping the underlying elements of the non-relational store to relational conditions for association and matching, and is suitable for search scenarios that can be expressed in SQL logic.

Further, the cluster analysis algorithm in the clustering module comprises data cleaning, similarity calculation, cluster grouping and detection, specifically: a cleaning operation is performed on the log data, in which timestamps, log path fields and state information fields in the logs are replaced with null values by regular-expression matching; the text similarity, vector similarity and maximum common subgraph similarity of the log content are then calculated in batches and averaged, and the best results are selected; logs with close similarity are merged into cluster groups; abnormal log information is quantified by the LOF detection algorithm, grouped separately and marked as abnormal, thereby achieving general cluster analysis of log content.

Further, the topology module detects abnormal log content using custom topology rules. The user first sets specific detection indicators, ranges, quantization thresholds and topology rules for specific log content according to specific service requirements. The consumer end, working in a streaming computation mode, references the indicator links newly configured by the user, continuously tracks in real time the log information from the storage module that falls within the preselected range, and abstracts the streaming computation response value together with the log content into a log topology. The topology is matched against the preset quantization thresholds and rules to monitor service continuity and service availability, and once a topology rule is hit, a monitoring alarm carrying the abnormal log topology content is pushed to the service contacts;

the streaming computation mode works as follows: log information with strong temporal correlation is located, and the extracted unstructured data indicators are counted in real time; the time-series factor is calculated from the current time-series influence value and the autocorrelation coefficient; the time-series offset is then obtained by combining it, as a product, with the time-series influence weight; and finally the streaming computation response value is obtained by aggregation.

According to a second aspect of the present invention, there is provided a log management method based on the system, the method comprising:

The data source end collects multiple types of logs, filters the log data according to preset conditions, and then brings the log data and state information under unified management through an Elasticsearch cluster;

The platform terminal manages log change operations in the form of approval flows, and reads and parses state information, message elements and monitoring data from the data source end and displays them;

The platform terminal configures a plurality of search models, and selects different search models to perform log inquiry according to search requirements;

the platform terminal uses a cluster analysis algorithm to acquire log data from a data source terminal, and then carries out format measurement, rule self-learning and result evaluation on logs with similar data structures or contents;

the platform terminal uses custom topology rules and streaming calculations to detect abnormal log content.

The beneficial effects of the invention are as follows: the acquisition, filtering and storage modules of the data source end, together with the management, treatment, search, clustering and topology modules of the platform terminal, form a log management system based on Elasticsearch. The system provides a full-life-cycle solution covering log acquisition, storage, management, search, cluster analysis, anomaly detection and visual treatment, and addresses the problems of the traditional log management approach: massive log data is difficult to store centrally, log queries are inefficient, secure and controllable means of managing log data are lacking, permission control and risk problems are prominent, service-related information is underused, and service indicators cannot be visualized.

Drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention, and a person skilled in the art may derive other drawings from them without inventive effort.

Fig. 1 is an overall architecture diagram of the log management system based on Elasticsearch according to an embodiment of the present application.

Fig. 2 is a schematic diagram of the flow classification of the management module of the log management system based on Elasticsearch according to an embodiment of the present application.

Fig. 3 is a flowchart of the cluster analysis algorithm of the clustering module of the log management system based on Elasticsearch according to an embodiment of the present application.

Fig. 4 is a topology detection flowchart of the topology module of the log management system based on Elasticsearch according to an embodiment of the present application.

Fig. 5 is a schematic diagram of a log management device based on Elasticsearch according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without any inventive effort, are intended to be within the scope of the invention.

Referring to Fig. 1, which is an overall architecture diagram of the log management system based on Elasticsearch, the log management system includes a data source end and a platform terminal: the data source end comprises an acquisition module, a filtering module and a storage module, and the platform terminal comprises a management module, a treatment module, a search module, a clustering module and a topology module.

The data source end comprises an acquisition module, a filtering module and a storage module. Scattered log data enters the filtering module through the acquisition module; the filtering module filters the log data according to preset conditions and passes it to the storage module; and the storage module brings the log data and module state information under unified management. The data source end accepts business application logs, system information logs, network session logs, database logs and middleware logs as input, and provides log data to the platform terminal.

The acquisition module is configured with a source end address, log path, log file name, maximum transmission throughput, performance safety limit and character set parameters. Once started, the acquisition module extracts log data to the filtering module according to the preset configuration parameters, and by default routes different log types to different filtering modules. The acquisition module reports its current acquisition state, such as the line count, file encoding and update status of each log file, to the storage module in real time. It also keeps a scheduled task running in the background to monitor the state of the collected log files in real time; whenever the state information of a log file changes, it pushes the updated part of the log content, which provides long-term, synchronized, automatic log data acquisition.
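The "push only the updated part" behaviour described above can be illustrated with a minimal sketch. It assumes that the file state is tracked as a byte offset and that a shrinking file means truncation or rotation; the class name `FileTailer` and its methods are hypothetical and are not taken from the patent.

```python
import os


class FileTailer:
    """Remembers how far a log file has been read and returns only new content.

    Hypothetical helper: the patent does not say how file state is tracked;
    a byte offset is assumed here.
    """

    def __init__(self, path, encoding="utf-8"):
        self.path = path
        self.encoding = encoding  # the configured character set
        self.offset = 0           # number of bytes already pushed

    def poll(self):
        """Return the content appended since the last poll ("" if unchanged)."""
        size = os.path.getsize(self.path)
        if size < self.offset:
            # The file shrank: assume truncation or rotation and start over.
            self.offset = 0
        if size == self.offset:
            return ""
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read(size - self.offset)
        self.offset = size
        return chunk.decode(self.encoding)
```

A scheduled task would call `poll()` at a fixed interval and forward any non-empty result to the filtering stage. A production collector would also need to handle multi-byte characters split across polls and to respect the configured throughput limit, both of which this sketch omits.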

The filtering module is configured with a variety of conditions for filtering the log data pushed by the acquisition module, and the filtering conditions support syntax updates in a generalized grok style. By defining standard log content fields, it removes useless redundant fields, orders the log records chronologically, and merges and systematically integrates the log content, which prevents disordered log content from degrading the cluster analysis function of the clustering module and the anomaly detection function of the topology module in the platform terminal. The filtering module passes the different types of log data, together with its own state information, to the storage module, which stores, quantifies and backs them up in a unified manner.
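As a rough illustration of grok-style filtering, the sketch below uses a Python regular expression with named groups in place of a real grok pattern. The line layout, the field names and the choice of which fields count as "standard" are assumptions made for the example only.

```python
import re

# Hypothetical line layout: "<timestamp> <LEVEL> [<thread>] <message>"
LINE_PATTERN = re.compile(
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"\[(?P<thread>[^\]]+)\] "
    r"(?P<message>.*)"
)

# Assumed set of standard fields; everything else is treated as redundant.
STANDARD_FIELDS = ("timestamp", "level", "message")


def filter_lines(lines):
    """Parse lines, keep only the standard fields, and sort by timestamp."""
    records = []
    for line in lines:
        match = LINE_PATTERN.fullmatch(line.strip())
        if match is None:
            continue  # lines that do not fit the pattern are dropped
        fields = match.groupdict()
        records.append({name: fields[name] for name in STANDARD_FIELDS})
    # Timestamps in this layout sort chronologically as plain strings.
    records.sort(key=lambda record: record["timestamp"])
    return records
```

Here the `thread` field plays the role of a redundant field that is parsed but not kept.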

The storage module stores log data and the state information of the other modules in an Elasticsearch cluster. Different log index generation templates and life cycle policies are configured according to log volume: log indexes with a daily average log volume below 10 GB are matched with a lightweight index template and a long life cycle policy, and log indexes with a daily average log volume above 10 GB are matched with a normal index template and a short life cycle policy. A backup job runs on a schedule every day and saves the log indexes to a storage array as snapshots, providing durable log backup. The storage module also continuously manages the real-time state information of the other modules, aggregates it, and pushes it to the treatment module and topology module of the platform terminal for further analysis and integration.
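The volume-based selection rule can be written down in a few lines. This sketch reads the threshold as 10 GB per day and assumes that a volume exactly at the threshold falls into the "normal" bucket, which the text does not specify; the function name and the returned labels are hypothetical.

```python
def choose_index_profile(daily_volume_gb, threshold_gb=10.0):
    """Pick an index template and life cycle policy from the daily log volume.

    Assumption: a volume equal to the threshold is treated as high volume.
    """
    if daily_volume_gb < threshold_gb:
        return {"template": "lightweight", "lifecycle": "long"}
    return {"template": "normal", "lifecycle": "short"}
```

In a real deployment the two labels would map to concrete index templates and index life cycle policies defined in the cluster.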

The platform terminal is constructed by adopting a micro-service and B/S system and comprises a management module, a treatment module, a search module, a clustering module and a topology module, and provides functions of flow management, state treatment, log searching, cluster analysis and anomaly detection, so that users are supported to directly access the log management system from a production environment and an office environment.

The management module manages log change operations in the form of approval flows, which comprise an index change flow, an index recovery flow, a user permission change flow, a department permission change flow, a temporary query permission flow and a temporary log desensitization removal flow. Referring to Fig. 2, which is a schematic diagram of the flow classification of the management module of the log management system based on Elasticsearch, a flow, once initiated, passes through four approval nodes: the initiator, the head of the initiator's department, the log management platform administrator and the business line owner. Each node has approve, supplement and return permissions. Once a flow is approved, the change takes effect immediately, and users can initiate the approval flow that matches their specific log change requirement. All historical flows are archived to the storage module to support supervision, auditing and traceability.

The index change flow comprises newly adding an index to a log management platform, and modifying and deleting registered index information or corresponding departments;

The treatment module reads and parses state information, message elements and monitoring data from the storage module, and displays the various metrics of the service and message layers in real time on the platform terminal. It comprises service inquiry, interface dependence inquiry, link monitoring, index statistics, message summarizing, status overview and topic detail sub-modules, provides awareness of the running state and the ability to locate operational problems, and improves the log management system's treatment of the service and message layers.

The service inquiry sub-module can inquire the micro-service registered on the registration center, support a prefix matching drop-down box and select the service for inquiry, and the inquiry result comprises the application, group name and version of the registered micro-service and the information of the service provider and consumer;

The interface dependence inquiring sub-module can inquire the interface dependence relationship to display a result in the form of a directed graph, and a user can check the upstream and downstream calling relationship of the searched interface in the graph;

The index statistics sub-module is used for collecting software and hardware metrics from the system (such as JVM garbage collection status, the number of deadlocked threads and database connection pool metrics) and displaying them, and for querying transactions at different times. A transaction is the basic unit of client-side instrumentation; for example, one Dubbo call or one SQL statement execution can be called a transaction. The index statistics sub-module reports metric information to the topology module, which performs anomaly detection in combination with user-defined rules.

The message summarizing sub-module is organized by message instance and by topic; it displays the total number of historical messages consumed since the program started and the total numbers of historical messages produced and consumed per topic, and allows the message bodies of different message IDs to be viewed by topic and time node;

The status overview sub-module is used for viewing cluster production and consumption rates, totals, cluster status, configuration details, subscription group names, counts, consumption rates, delays, broker data, topics and producer groups;

the topic detail submodule is used for checking topic lists, states, routes and consumer information;

The search module is configured with four search models, namely a fuzzy search model, an accurate search model, a regular search model and an SQL-like search model. Each model corresponds to different underlying search logic, and the user selects a model on the log query page of the platform terminal according to the search requirement. Before search results are displayed, sensitive information such as card numbers, addresses, email addresses and passwords is identified and extracted by regular expression rules and, by default, replaced with asterisks on the front-end page, which provides general desensitization of log information. Administrators can add custom desensitization rules to tailor the desensitization of special fields and content.
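A minimal sketch of regex-based desensitization follows. The three patterns are illustrative assumptions (a 16 to 19 digit card number, a simple email shape, and a `password=` key); the patent does not list its actual rules, and the names `RULES` and `mask` are hypothetical.

```python
import re

# Illustrative rules only; real rules would be tuned to the actual log formats.
RULES = {
    "card": re.compile(r"\b\d{16,19}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "password": re.compile(r"(?<=password=)\S+"),
}


def mask(text, rules=RULES):
    """Replace every match of every rule with asterisks of the same length."""
    for pattern in rules.values():
        text = pattern.sub(lambda m: "*" * len(m.group(0)), text)
    return text
```

Passing a different `rules` dictionary corresponds to an administrator adding custom desensitization rules.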

The fuzzy search model is built on the Standard Analyzer component of Elasticsearch. It splits the search content into words and lowercases them, and applies no special handling to common stop words (such as a, an, is and the). It can match more related information, at the cost of some redundant results, and is suitable for scenarios where the search target is not clearly defined and broader related information is wanted;

The accurate search model is built on the Keyword Analyzer component of Elasticsearch. It performs no word segmentation on the search content; that is, the input is used directly as the search content for exact matching, so only strongly related content is hit. It is suitable for scenarios where the search target is clearly defined, and avoids interference from weakly related information;

The regular search model is built on the Pattern Analyzer component of Elasticsearch. It matches target content by regular expressions, supports user-defined special regular expression conditions and multiple regular expressions in effect simultaneously, and is suitable for search scenarios that can be expressed as regular expressions, where the inherent logical relationships can be summarized to achieve efficient associative search;

The SQL-like search model is built on the SQL component of Elasticsearch. It executes queries written as SQL statements, approximately mapping the underlying elements of the non-relational store to relational conditions for association and matching. It is suitable for search scenarios that can be expressed in SQL logic, and allows everyday SQL search logic to be migrated directly and reliably.
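One way to picture the four models is as four kinds of Elasticsearch request. The patent describes the models in terms of analyzers, so the mapping below to the `match`, `term` and `regexp` queries and the SQL endpoint is an assumption for illustration; the index name `logs` and the function `build_request` are hypothetical.

```python
def build_request(model, field, text):
    """Return (endpoint, body) for one of the four search models.

    Assumed mapping, not taken from the patent:
      fuzzy -> match query, exact -> term query,
      regex -> regexp query, sql -> SQL REST endpoint.
    """
    if model == "fuzzy":
        # Full-text match: the query text is analyzed before matching.
        return "/logs/_search", {"query": {"match": {field: text}}}
    if model == "exact":
        # Term query: the input is used verbatim, without analysis.
        return "/logs/_search", {"query": {"term": {field: text}}}
    if model == "regex":
        return "/logs/_search", {"query": {"regexp": {field: {"value": text}}}}
    if model == "sql":
        # Here `text` is a complete SQL statement and `field` is ignored.
        return "/_sql?format=json", {"query": text}
    raise ValueError(f"unknown search model: {model}")
```

For example, `build_request("sql", None, "SELECT message FROM logs WHERE level = 'ERROR'")` yields a body that could be posted to the SQL endpoint, while the other three models share the ordinary search endpoint.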

The clustering module uses a cluster analysis algorithm: after acquiring log data from the storage module, it performs format measurement, rule self-learning and result evaluation on logs with similar data structures or content. Referring to Fig. 3, which is a flowchart of the cluster analysis algorithm of the clustering module of the log management system based on Elasticsearch, the algorithm comprises data cleaning, similarity calculation, cluster grouping and detection.

First, a cleaning operation is performed on the log data: timestamps, log path fields and state information fields in the logs are replaced with null values by regular-expression matching, which reduces the influence of irrelevant data on the subsequent clustering results. Next, the text similarity, vector similarity and maximum common subgraph similarity of the log content are calculated in batches and averaged, the best results are selected, and logs with close similarity are merged into cluster groups. Finally, abnormal log information is quantified by the LOF detection algorithm, grouped separately and marked as abnormal, thereby achieving general cluster analysis of log content.
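The cleaning step can be sketched as a few regular-expression substitutions. The three patterns below (an ISO-like timestamp, a Unix-style path and a `status=...` field) are assumptions about what the variable fields look like; the actual expressions are not given in the text.

```python
import re

# Assumed shapes of the variable fields; adjust to the real log layout.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?")
PATH = re.compile(r"(?:/[\w.-]+)+")
STATUS = re.compile(r"\bstatus=\w+")


def clean(line):
    """Blank out timestamps, paths and status fields, then tidy whitespace."""
    for pattern in (TIMESTAMP, PATH, STATUS):
        line = pattern.sub("", line)
    return " ".join(line.split())
```

After cleaning, two log lines that differ only in their timestamp, file path or status code become identical, which is what lets the later similarity step group them together.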

Edit distance can be used to measure text similarity: for two strings, the edit distance is the minimum number of edit operations required to convert one string into the other, and the calculation expression is:

Strings a and b are split into individual characters and a matrix d is created, where i and j denote the indices into string a and string b respectively, starting from 1. The distances are computed row by row and column by column in a loop until the last character is reached, and the final matrix entry d[i][j] is the edit distance between the two strings.
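The matrix procedure described above matches the standard Levenshtein recurrence. Assuming that is the intended definition, it can be written with the same symbols (d, i, j, a, b) as follows, where the bracket term is 1 when the two characters differ and 0 when they are equal:

```latex
\[
d[i][0] = i, \qquad d[0][j] = j,
\]
\[
d[i][j] = \min
\begin{cases}
d[i-1][j] + 1 & \text{(deletion)} \\
d[i][j-1] + 1 & \text{(insertion)} \\
d[i-1][j-1] + [\,a_i \neq b_j\,] & \text{(substitution or match)}
\end{cases}
\qquad 1 \le i \le |a|,\ 1 \le j \le |b|
\]
```

The edit distance of the two strings is the entry $d[|a|][|b|]$.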

The text similarity calculation expression is:

where |a| and |b| denote the lengths of string a and string b, respectively.
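A small Python sketch of the edit distance and a similarity derived from it is given below. The normalization `1 - distance / max(|a|, |b|)` is an assumption: it is one common way to turn an edit distance into a similarity using the two string lengths, and may differ from the expression intended in the text.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed row by row with two rolling rows."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def text_similarity(a, b):
    """Assumed normalization: 1 - distance / max(|a|, |b|)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - edit_distance(a, b) / longest
```

With this normalization the similarity lies between 0 and 1, and identical strings score exactly 1.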

The cosine of the angle between two vectors in vector space is used as a measure of the difference between two individuals: the closer the value is to 1, the closer the angle is to 0 and the more similar the two vectors are.

The vector similarity calculation expression is:

The IK tokenizer for Elasticsearch is used to segment document A and document B into words, and the frequency of each word is calculated. X and Y then denote the word frequency vectors of the two documents, and p denotes the vector component index.
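The cosine similarity of two word frequency vectors X and Y is the dot product of X and Y divided by the product of their Euclidean norms. The sketch below computes it in plain Python; whitespace splitting stands in for the IK tokenizer, which is a simplification made only for this example.

```python
import math
from collections import Counter


def cosine_similarity(doc_a, doc_b):
    """Cosine of the angle between the word frequency vectors of two texts.

    Whitespace tokenization is a stand-in for the IK tokenizer.
    """
    x = Counter(doc_a.lower().split())
    y = Counter(doc_b.lower().split())
    # Only words present in both documents contribute to the dot product.
    dot = sum(x[word] * y[word] for word in x.keys() & y.keys())
    norm_x = math.sqrt(sum(count * count for count in x.values()))
    norm_y = math.sqrt(sum(count * count for count in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty document has no direction
    return dot / (norm_x * norm_y)
```

Because word frequencies are non-negative, the result always lies between 0 and 1.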

The more similar two graph structures are, the more they have in common, that is, the larger their common subgraphs; the similarity of two graph structures can therefore be measured by their maximum common subgraph. The maximum common subgraph similarity calculation expression is:

where the node counts of the two graphs are used: the node count of the maximum common subgraph of the two graphs and the larger of the two graphs' node counts are found, and the similarity is calculated from them.
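Based on the description above, a plausible form of this measure is the ratio of the node count of the maximum common subgraph to the larger of the two graphs' node counts. The symbols $G_1$, $G_2$ and $\mathrm{mcs}$ are introduced here for illustration and are assumptions:

```latex
\[
\mathrm{sim}(G_1, G_2) = \frac{\lvert \mathrm{mcs}(G_1, G_2) \rvert}{\max\bigl(\lvert G_1 \rvert,\ \lvert G_2 \rvert\bigr)}
\]
```

Here $\lvert G \rvert$ is the number of nodes in graph $G$. For example, if two graphs have 4 and 5 nodes and their maximum common subgraph has 3 nodes, the similarity under this form is $3/5 = 0.6$.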

The LOF (local outlier factor) detection algorithm is a classical density-based algorithm. It assigns each data point an outlier factor, LOF, that depends on its neighborhood density, and uses this factor to judge whether the point is an outlier; the degree of abnormality of each data point can thus be quantified. The calculation expression is:

First, the k-th reachability distance from data point q to point o is calculated; it is the maximum of the k-nearest-neighbor distance of point o and the distance from point q to point o.

Next, the local reachability density of point q is calculated; it is the reciprocal of the average reachability distance from point q to its neighbors.

Finally, the LOF of point q is calculated as the ratio of the average local reachability density of the points in the neighborhood of q to the local reachability density of q itself. A larger LOF value indicates a more abnormal point, and a smaller value indicates a more normal one.
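The three steps above (reachability distance, local reachability density, LOF ratio) can be followed in a compact pure-Python sketch. It assumes Euclidean distance, more than k distinct points, and exactly k neighbours per point with ties broken by input order; the function name `lof_scores` is hypothetical.

```python
import math


def lof_scores(points, k):
    """Local outlier factor of every point, following the three steps above.

    Assumptions: Euclidean distance, len(points) > k, no duplicate points,
    and exactly k neighbours per point (ties broken by input order).
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])  # distance to the k-th neighbour

    def reach_dist(q, o):
        # Step 1: max of o's k-distance and the actual distance from q to o.
        return max(k_distance[o], dist[q][o])

    # Step 2: reciprocal of the mean reachability distance to the neighbours.
    lrd = [k / sum(reach_dist(q, o) for o in neighbours[q]) for q in range(n)]

    # Step 3: mean density of the neighbours divided by the point's own density.
    return [sum(lrd[o] for o in neighbours[q]) / (k * lrd[q]) for q in range(n)]
```

For four points on a unit square plus one far-away point, the square's corners score 1.0 and the distant point scores well above 1, which is how it would be flagged as abnormal.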

The clustering module assists in problem localization and anomaly detection. For example, a server may produce a large number of error logs within a short time, while the few key error logs are drowned out by the many other error logs. With the cluster analysis algorithm, the error logs can be summarized, the overall picture of the logs can be grasped quickly by abstraction, the key error information can be extracted, and the raw log content can be converted into log events, helping the user analyze and locate abnormal, rare and unknown logs. Because the cluster analysis is executed repeatedly on a schedule, newly appearing log categories can be collected and redundant abnormal jitter can be identified.
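How cleaning and grouping summarize a burst of error logs can be illustrated with the sketch below. It is not the system's algorithm: the cleaning patterns, the greedy first-fit grouping, the 0.8 threshold, and the use of `difflib.SequenceMatcher` as a stand-in for the combined similarity measure are all assumptions chosen to keep the example short.

```python
import re
from difflib import SequenceMatcher

# Hypothetical cleaning patterns: ISO-like timestamps and absolute file paths.
CLEANING_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:[.,]\d+)?"),
    re.compile(r"(?:/[\w.-]+)+"),
]


def clean(line):
    """Blank out volatile fields so that only the message skeleton remains."""
    for pattern in CLEANING_PATTERNS:
        line = pattern.sub("", line)
    return " ".join(line.split())


def cluster_logs(lines, threshold=0.8):
    """Greedy grouping: a line joins the first cluster whose representative is similar enough."""
    clusters = []  # list of (cleaned representative, raw member lines)
    for raw in lines:
        cleaned = clean(raw)
        for representative, members in clusters:
            if SequenceMatcher(None, representative, cleaned).ratio() >= threshold:
                members.append(raw)
                break
        else:
            clusters.append((cleaned, [raw]))
    return [members for _, members in clusters]


logs = [
    "2024-03-29 10:00:01 ERROR /app/logs/pay.log connection to db-1 timed out",
    "2024-03-29 10:00:05 ERROR /app/logs/pay.log connection to db-2 timed out",
    "2024-03-29 10:00:09 ERROR /app/logs/order.log full gc pause exceeded limit",
]
groups = cluster_logs(logs)
```

The two timeout messages collapse into one group, which leaves the single garbage-collection message visible as its own category.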

The topology module detects abnormal log content using custom topology rules. The user first sets specific detection metrics, ranges, quantization thresholds and topology rules for specific log content according to specific service requirements. With reference to the metrics and newly added links set by the user, the consumer end continuously tracks, in real time and in streaming calculation mode, the log information in the storage module that falls within the preselected range. The streaming calculation response value and the log content are then integrated and abstracted into a log topology, which is matched against the preset quantization thresholds and rules to monitor service continuity and service availability. Once a topology rule is hit, a monitoring alarm for the abnormal log topology content is pushed to the service contacts through multiple channels such as SMS and e-mail. Fig. 4 shows the topology detection flow chart of the topology module of the Elasticsearch-based log management system.

The streaming calculation mode locates log information with strong time-series correlation and continuously counts the metrics extracted from the unstructured data. A timing factor is calculated from the current timing impact value and the autocorrelation coefficient, the timing offset is then obtained by combining the timing influence weight, and finally the streaming calculation response value is obtained by a summarizing operation. In this way the traffic of a single access across multiple services is captured at minute-level granularity, the health of the link logs can be analyzed in combination with historical data, the parameter metrics and data conditions of the logs are quantified uniformly, and the calculation can be processed in parallel.

The streaming computing expression is:

Wherein the symbols of the expression denote, respectively, the timing factor, the current timing impact value, the autocorrelation coefficient, the timing influence weight, the timing offset, and the streaming calculation response value.

The topology rule templates are:

log_data_source = log_index = xxx_ INDICE AND TYPE (service type) = SERVICE AND content (content key) content 'non work' or 'ook' or 'disconnect' or 'full gc' for fileds (preset range) in [099601] |frequency > 10|time_step = 60 s| HIGH DEGREE (severity) AND solt by time.

With reference to the topology rule template, the user selects the log data source log_data_source, the log index and the service type, and configures the rule attributes according to service requirements, including the frequency, the time interval time_step, the content keywords content, the preset range fileds and the severity. These attributes are combined through logical relations such as AND/OR, nesting, ordering, timestamps, regular matching and exact matching to form a generalizable custom topology rule.
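The frequency and time_step attributes of such a rule can be illustrated with a sliding-window check. The sketch below is an assumption-laden simplification: the `TopologyRule` class, the `detect_hits` function and the event format are hypothetical, only keyword matching is modeled, and the defaults mirror the template values frequency > 10 and time_step = 60 s.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TopologyRule:
    keywords: tuple          # content keywords, any of which counts as a match
    frequency: int = 10      # the rule is hit when more than this many matches ...
    time_step: float = 60.0  # ... fall within this many seconds
    severity: str = "HIGH"


def detect_hits(rule, events):
    """Yield the timestamp of every matching event at which the rule is hit.

    `events` is an iterable of (timestamp_seconds, message) sorted by time.
    """
    window = deque()
    for timestamp, message in events:
        if not any(keyword in message for keyword in rule.keywords):
            continue
        window.append(timestamp)
        # Drop matches that are older than the rule's time step.
        while window and timestamp - window[0] >= rule.time_step:
            window.popleft()
        if len(window) > rule.frequency:
            yield timestamp


rule = TopologyRule(keywords=("disconnect", "full gc"))
burst = [(i * 5.0, "full gc detected") for i in range(12)]    # 12 matches in 55 s
sparse = [(i * 10.0, "full gc detected") for i in range(12)]  # at most 6 per 60 s
burst_hits = list(detect_hits(rule, burst))
sparse_hits = list(detect_hits(rule, sparse))
```

In a real deployment each yielded timestamp would trigger the alarm push described above; the sketch only reports when the threshold is exceeded.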

Multiple alarm response levels can be set within a fixed time range, and they can be flexibly modified and optimized. Abnormal time points, abnormal content and abnormal rules are thus discovered from the log content in time, so that effective information can be prejudged in advance and emergency handling measures can be configured.

According to another aspect of the present application, there is also provided a working process of the Elasticsearch-based log management system, including the following steps:

S1, at the data source end, the user collects various types of logs through the acquisition module, including application logs, system information logs, network session logs, database logs and middleware logs. The source address, log path, log file name, maximum transmission throughput, performance safety margin and character set parameters are configured, and after the acquisition module is started, log data are extracted to the filtering module according to the preset configuration parameters;

S2, at the data source end, the user filters the log data pushed by the acquisition module through the filtering module, which supports updating the filter conditions using general grok pattern syntax, and transmits the log data filtered according to the preset conditions, together with the state information of the filtering module, to the storage module;

S3, at the data source end, the user realizes unified storage, quantification and backup of log data through the storage module. The log data and the state information of the other modules are stored in an Elasticsearch cluster in the storage module, and different log index generation templates and life cycle policies are configured according to the order of magnitude of the log volume;

S4, at the platform terminal, the user manages log change operations through the management module in the form of approval processes, which include the index change process, the index recovery process, the user authority change process, the department authority change process, the temporary query authority process and the temporary log desensitization release process. The user initiates the process that corresponds to the specific log change requirement, and the change takes effect after approval at 4 nodes: the initiator, the head of the initiator's department, the log management platform administrator and the business line owner;

S5, at the platform terminal, the user reads and analyzes state information, message elements and monitoring data through the governance module, and the metric data of the service and message layers are displayed in real time at the platform terminal. This provides awareness of the service running state and the ability to locate system operation problems, and improves the overall governance of the service and message layers by the log management system;

S6, at the platform terminal, the user queries logs through the fuzzy search model, the exact search model, the regular search model and the SQL-like search model of the search module. Different search models correspond to different underlying search logic, and the user can select a search model according to the search requirement. Before the search results are displayed, sensitive information is identified and extracted through regular expressions and replaced with asterisks, so that it is desensitized when shown on the front-end page;

S7, at the platform terminal, the user performs format measurement, rule self-learning and result evaluation on logs with similar data structures or content through the clustering module. The procedure comprises 4 steps, namely data cleaning, similarity calculation, cluster grouping and detection, and realizes problem localization and anomaly detection. With the cluster analysis algorithm, the user can summarize error logs, quickly grasp the overall picture of the logs by abstraction, extract the key error information, convert the raw log content into log events, and analyze and locate abnormal, rare and unknown logs. Because the cluster analysis is executed repeatedly on a schedule, newly appearing log categories can be collected and redundant abnormal jitter can be identified;

S8, at the platform terminal, the user detects abnormal log content through the topology module using custom topology rules. The user sets specific detection metrics, ranges, quantization thresholds and topology rules for specific log content according to specific service requirements. With reference to the metrics and newly added links set by the user, the consumer end continuously tracks, in real time and in streaming calculation mode, the log information that falls within the preselected range. The streaming calculation response value and the log content are abstracted together into a log topology, which is matched against the preset quantization thresholds and rules to monitor service continuity and service availability. Once a topology rule is hit, a monitoring alarm for the abnormal log topology content is pushed to the service contacts through multiple channels such as SMS and e-mail;

Through the above steps, the user can realize the functions and business effects of the Elasticsearch-based log management system.
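As an illustration of the desensitization in step S6, the sketch below replaces sensitive values with asterisks before a search result is displayed. The two regular expressions (an 11-digit mobile number and a 16 to 19 digit card number), the function name `desensitize` and the sample message are assumptions made for the example; an actual deployment would define its own patterns.

```python
import re

# Hypothetical sensitive-data patterns for illustration only.
SENSITIVE_PATTERNS = [
    re.compile(r"\b1\d{10}\b"),    # 11-digit mobile number starting with 1
    re.compile(r"\b\d{16,19}\b"),  # bank card number of 16 to 19 digits
]


def desensitize(text):
    """Replace every match of a sensitive pattern with asterisks of the same length."""
    for pattern in SENSITIVE_PATTERNS:
        text = pattern.sub(lambda match: "*" * len(match.group()), text)
    return text


raw_hit = "transfer from card 6222020200112233445 by user 13800138000 succeeded"
masked_hit = desensitize(raw_hit)
```

Keeping the mask the same length as the original value preserves the layout of the log line, which is a design choice of this sketch rather than a requirement stated in the text.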

Referring to fig. 5, a log management device based on Elasticsearch provided by an embodiment of the present invention includes a memory and one or more processors, where executable code is stored in the memory, and the processors, when executing the executable code, implement the Elasticsearch-based log management method of the above embodiment.

The embodiment of the Elasticsearch-based log management device can be applied to any device with data processing capability, such as a computer. The device embodiment may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the device with data processing capability reading the corresponding computer program instructions from the nonvolatile memory into the memory. In terms of hardware, fig. 5 shows a hardware structure diagram of the device with data processing capability on which the Elasticsearch-based log management device provided by the present invention is located. In addition to the processor, memory, network interface and nonvolatile memory shown in fig. 5, that device generally includes other hardware according to its actual function, which is not described here.

The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.

For the device embodiments, reference is made to the description of the method embodiments for the relevant points, since they essentially correspond to the method embodiments. The apparatus embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

The embodiment of the present invention also provides a computer-readable storage medium having a program stored thereon, which when executed by a processor, implements an elastic search-based log management method in the above embodiment.

The computer-readable storage medium may be an internal storage unit, such as a hard disk or a memory, of any device with data processing capability described in any of the previous embodiments. The computer-readable storage medium may also be an external storage device of such a device, such as a plug-in hard disk, a Smart Media Card (SMC), an SD card or a flash card provided on the device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the device with data processing capability. The computer-readable storage medium is used for storing the computer program and other programs and data required by the device, and may also be used for temporarily storing data that has been output or is to be output.

The invention also provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the method of log management based on elastic search.

The above-described embodiments are intended to illustrate the present invention, not to limit it, and any modifications and variations made thereto are within the spirit of the invention and the scope of the appended claims.

Claims

1. An elastic search based log management system is characterized by comprising a data source end and a platform terminal;

2. The log management system based on elastic search according to claim 1, wherein the data source end comprises the following modules:

and a storage module: different log index generation templates and life cycle policies are configured according to the order of magnitude of the log volume; log indexes with a daily average log volume smaller than 10 GB can be matched with a lightweight index template and a long life cycle policy, and log indexes with a daily average log volume larger than 10 GB can be matched with a normal index template and a short life cycle policy.

3. The log management system based on elastic search according to claim 1, wherein the platform terminal comprises a management module, a search module, a clustering module and a topology module, provides functions of flow management, state management, log search, cluster analysis and anomaly detection, and supports direct access of users from production environments and office environments to the log management system.

4. A log management system based on elastic search according to claim 3, wherein the management module uses set approval nodes to approve the process, and each node has approval, supplement and return rights; the process flow comprises an index changing process flow, an index recovering process flow, a user authority changing process flow, a department authority changing process flow, a temporary inquiring authority process flow and a temporary removing log desensitizing process flow.

5. The log management system based on elastic search according to claim 4, wherein the index change procedure comprises newly adding an index to a log management platform, and modifying and deleting registered index information or corresponding departments;

6. The log management system based on elastic search according to claim 3, wherein the governance module comprises service query, interface dependent query, link monitoring, index statistics, message summary, status overview and topic detail sub-module;

7. The log management system based on elastic search according to claim 3, wherein the search module is configured with 4 search models, namely a fuzzy search model, an accurate search model, a regular search model and an SQL-like search model;

8. The log management system based on elastic search according to claim 3, wherein the clustering algorithm in the cluster analysis module comprises data cleaning, similarity calculation, cluster grouping and detection, specifically: a cleaning operation is performed on the log data, in which the timestamp, the log path field and the state information field in the log are replaced with null values by regular matching; the text similarity, the vector similarity and the maximum common subgraph similarity of the log content are then calculated in batches and averaged, and the optimal result is screened; logs with close similarity are merged into cluster groups; and abnormal log information is quantified by the LOF detection algorithm, grouped separately and marked as abnormal, thereby realizing general cluster analysis of the log content.

9. The log management system based on elastic search according to claim 3, wherein the topology module detects abnormal log content using custom topology rules; a user first sets specific detection metrics, ranges, quantization thresholds and topology rules for specific log content according to specific service requirements; with reference to the metrics and newly added links set by the user, a consumer end continuously tracks, in real time and in streaming calculation mode, log information that falls within a preselected range; the streaming calculation response value and the log content are abstracted together into a log topology, which is matched against the preset quantization thresholds and rules to monitor service continuity and service availability; and once a topology rule is hit, a monitoring alarm for the abnormal log topology content is pushed to service contacts;

10. A method of log management based on the system of any one of claims 1-9, the method comprising: