CN112486789A - Log analysis system, method and device - Google Patents

Publication number
CN112486789A
Authority
CN
China
Prior art keywords
log
log data
analysis
data
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011369258.0A
Other languages
Chinese (zh)
Inventor
何嘉杰
邓玉
胡仲强
谢潇宇
林浪桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CCB Finetech Co Ltd
Original Assignee
CCB Finetech Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CCB Finetech Co Ltd
Priority to CN202011369258.0A
Publication of CN112486789A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G06F11/3476 Data logging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing

Abstract

The embodiments of this specification provide a log analysis system, a log analysis method, and a log analysis device. The method comprises the following steps: acquiring a log variable representing log content; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the time-tagged log data in a time-series database; querying the time-series database to obtain the log data within a preset time period; and storing the log data within the preset time period in an analysis database so that the log data in the analysis database can be analyzed, thereby improving the efficiency of log analysis.

Description

Log analysis system, method and device
Technical Field
The embodiments of this specification relate to the field of computer technology, and in particular to a log analysis system, method, and device.
Background
Nginx is a high-performance, high-concurrency HTTP and reverse-proxy Web server that supports HTTP (HyperText Transfer Protocol) reverse proxying, TCP (Transmission Control Protocol) proxying, load balancing, HTTP caching, Web development, and the like.
While running, Nginx records a running log according to its configuration. The log contains HTTP request and HTTP response information, which is very important for debugging Web programs. For example, the log provides information such as the user's IP address, browser information, and request processing time. Existing Nginx log analysis and management solutions mainly rely on stand-alone analysis tools, such as ngxtop or GoAccess, which are suitable for small single-server websites or ad hoc troubleshooting.
Stand-alone analysis tools are generally simple to install and use, allow the analysis data provided by different tools to be chosen as needed, and do not depend on the Nginx version; the log format only has to meet the tool's requirements.
However, as a website grows, the drawbacks of stand-alone analysis tools become increasingly apparent: customized monitoring data cannot be used, collection from an Nginx cluster is not supported, the log format is inflexible, and log analysis is inefficient.
Disclosure of Invention
An object of the embodiments of this specification is to provide a log analysis system, method, and device that improve the efficiency of log analysis.
To solve the above problem, an embodiment of this specification provides a log analysis system that includes an acquisition module, a storage module, and an analysis module. The acquisition module is used for acquiring a log variable representing log content, formatting the log variable to obtain formatted log data, and transmitting the log data to the storage module. The storage module is used for establishing a time tag for the log data and storing the time-tagged log data in a time-series database. The analysis module is used for querying the time-series database to obtain the log data within a preset time period, and storing the log data within the preset time period in an analysis database so that the log data in the analysis database can be analyzed.
In order to solve the above problem, an embodiment of the present specification further provides a log analysis method, where the method includes: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database; based on the time sequence database, log data in a preset time period are obtained through query; and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
To solve the above problem, an embodiment of the present specification further provides a log analysis apparatus, including: the acquisition module is used for acquiring the log variables representing the log contents; the formatting module is used for formatting the log variable to obtain formatted log data; the storage module is used for establishing a time tag for the log data and storing the log data with the time tag into a time sequence database; the query module is used for querying and obtaining log data in a preset time period based on the time sequence database; and the analysis module is used for storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
In order to solve the above problem, an embodiment of the present specification further provides an electronic device, including: a memory for storing a computer program; a processor for executing the computer program to implement: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database; based on the time sequence database, log data in a preset time period are obtained through query; and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
To solve the above problem, embodiments of the present specification further provide a computer-readable storage medium having stored thereon computer instructions, which when executed, implement: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database; based on the time sequence database, log data in a preset time period are obtained through query; and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
As can be seen from the technical solutions provided in the embodiments of this specification, a log variable representing log content can be acquired; the log variable is formatted to obtain formatted log data; a time tag is established for the log data, and the time-tagged log data is stored in a time-series database; the time-series database is queried to obtain the log data within a preset time period; and the log data within the preset time period is stored in an analysis database so that the log data in the analysis database can be analyzed. The method provided by the embodiments of this specification introduces the ngx_lua module to collect logs, stores the collected logs centrally in a time-series database, and analyzes the logs through the continuous query function provided by the time-series database. It supports collection of access logs from an Nginx cluster, provides unified management and computation of the data, meets the log collection requirement with high performance and low overhead, simplifies the development and testing of data analysis service code, and improves the efficiency of log analysis.
Drawings
To illustrate the embodiments of this specification or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments in this specification, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a functional structure diagram of a log analysis system according to an embodiment of the present disclosure;
Fig. 2 is a flow chart of a log analysis method according to an embodiment of the present disclosure;
Fig. 3 is a functional structure diagram of an electronic device according to an embodiment of the present disclosure;
Fig. 4 is a functional structure diagram of a log analysis device according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without any creative effort shall fall within the protection scope of the present specification.
Nginx logs are useful for statistics and for troubleshooting system services. Nginx writes log entries whenever the service is accessed. There are two types of Nginx log: the access log and the error log. The access log is recorded in an "access.log" file. The error log records error information, such as errors that occur while Nginx processes HTTP requests and runtime errors of the Nginx service; entries are recorded at different severity levels, which helps locate the cause of an error. The access log primarily records client requests: every request that a client sends to the Nginx server is recorded there. Information such as the client IP, browser information, referrer, request processing time, and request URL can be obtained from the access log. Which information is recorded is defined by the log_format directive, which combines Nginx variables into the format written to the access log.
Enterprises that use Nginx often need to collect, manage, and analyze Nginx access logs. Most enterprises deploy a cluster of several Nginx servers, so log collection across the Nginx server cluster needs to be supported, together with centralized management and centralized analysis of the access log data. Collecting, managing, and analyzing the Nginx access log can reduce the development time for most monitoring requirements, such as the average, tp90, and tp99 of URI response times (tp99 is generally used to reflect the responsiveness of a service: 99% of requests are answered within this time).
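The tp90 and tp99 figures mentioned above can be illustrated with a short Python sketch. It uses the nearest-rank definition of a percentile, which is an assumption made here for illustration; the specification does not say which percentile method is used, and the sample response times are hypothetical.

```python
import math


def nearest_rank_percentile(samples, pct):
    """Return the smallest sample such that at least pct% of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank in the sorted list
    return ordered[rank - 1]


# Hypothetical response times in milliseconds for one URI.
response_times_ms = [12, 15, 11, 250, 14, 13, 16, 18, 17, 900]

tp90 = nearest_rank_percentile(response_times_ms, 90)
tp99 = nearest_rank_percentile(response_times_ms, 99)
print(tp90, tp99)
```

With only ten samples the two slowest requests dominate tp90 and tp99, which is why these figures are usually read alongside the average rather than instead of it.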
The common solution of the Nginx log analysis management system mainly comprises two types.
The first solution is to use a stand-alone analysis tool, such as ngxtop or GoAccess, suitable for small stand-alone websites or temporary problem analysis. Taking ngxtop as an example, the access log saved in the default format may be analyzed, and the installation and analysis processes are as follows.
1. Installation. ngxtop is installed using Python's pip command.
# yum install python-pip
# pip install ngxtop
2. Analysis. Access log analysis is performed using the ngxtop command.
# ngxtop --config /usr/local/nginx/conf/nginx.conf -n 10
ngxtop finds the location of access_log in the configuration file; the -n option specifies the number of URI rows shown in the output.
The second scheme deploys a log collection agent on the server and uses a big data stack to store and analyze the log data. For example:
1. Flume collects the access log and writes the formatted data into Kafka;
2. Storm analyzes the data in Kafka in real time and writes the computed results into Elasticsearch;
3. the data in Elasticsearch is used for various real-time and offline statistics.
The first scheme, which uses a stand-alone analysis tool, is generally simple to install and use, allows the analysis data provided by different tools to be chosen as needed, and does not depend on the Nginx version; the log format only has to meet the tool's requirements. However, as a website grows, the drawbacks of this scheme become increasingly apparent: customized monitoring data cannot be used, collection from an Nginx cluster is not supported, and the log format is inflexible.
The second scheme, which deploys a log collection agent, allows the log format to be customized, supports collection from an Nginx cluster, and analyzes and manages access logs centrally. It also has drawbacks: under high concurrency, Flume's collection and formatting operations tend to consume excessive server resources; maintaining the big data components takes up enterprise development and operations resources; and the scheme is not cost-effective if it is used only to support log monitoring.
The analysis efficiency of Nginx logs in the prior art is therefore low. If an access log collection scheme different from the traditional agent scheme is adopted, in which the ngx_lua module is introduced to collect logs, the collected logs are stored centrally in a time-series database, and the logs are analyzed through the continuous query function provided by the time-series database, this problem can be expected to be solved: the log collection requirement is met with high performance and low overhead, the development and testing of data analysis service code is simplified, and the efficiency of log analysis is improved. On this basis, the embodiments of this specification provide a log analysis system, a log analysis method, and a log analysis device.
Referring to fig. 1, a log analysis system is provided in the present specification. The log analysis system may include an acquisition module 110, a storage module 120, and an analysis module 130.
In some embodiments, the acquisition module 110 may be configured to acquire a log variable representing log content, format the log variable to obtain formatted log data, and transmit the log data to the storage module.
In some embodiments, the acquisition module 110 may be deployed on a Web server. To meet the requirements of high performance and low overhead, the acquisition module may format the data directly on Nginx after acquiring the log variables and then transmit the data to the storage module over the network, which omits intermediate steps such as reading log files and performing string operations.
In some embodiments, the collection of log variables may be implemented using the third-party Nginx module lua-nginx-module. Specifically, the content of the Nginx access log is composed of Nginx variables, and the variables corresponding to the access log content are the log variables in the embodiments of this specification. The lua-nginx-module can obtain the log variables directly from memory through the Lua API for use in the collection processing flow. The lua-nginx-module is an Nginx HTTP module that embeds a Lua interpreter into Nginx in order to parse and execute server-side scripts written in Lua. In principle, ngx_lua embeds Lua into Nginx so that Nginx can execute Lua scripts and handle requests with high concurrency and without blocking. Lua has built-in coroutines, which allow asynchronous callbacks to be written in a sequential calling style. ngx_lua delegates all I/O operations performed in Lua to the Nginx event model, which makes the calls non-blocking. The developer can write programs in a serial style: when a blocking I/O operation is reached, ngx_lua automatically suspends execution and saves the context, then delegates the I/O operation to the Nginx event handling mechanism; after the I/O operation completes, ngx_lua restores the context and the program continues, all transparently to the user program. Each Nginx worker process holds one Lua interpreter or LuaJIT instance, which is shared by all requests handled by that worker. The context of each request is separated by Lua's lightweight coroutines to keep the requests independent of one another. ngx_lua adopts a "one-coroutine-per-request" processing model: for each user request, ngx_lua starts a coroutine that executes the user code to process the request, and the coroutine is destroyed when request processing is complete.
Each coroutine has an independent global environment (variable space) that inherits from the globally shared, read-only "common data". Therefore, any variable that user code injects into the global space does not affect the processing of other requests, and these variables are released after request processing is complete. This ensures that all user code runs in a "sandbox" that has the same life cycle as the request. Thanks to Lua coroutines, ngx_lua requires very little memory even when handling 10000 concurrent requests. According to tests, ngx_lua needs only 2KB of memory to process each request, and even less if LuaJIT is used. ngx_lua is therefore well suited to implementing scalable, highly concurrent services.
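The one-coroutine-per-request model described above can be sketched by analogy in Python's asyncio. This is not ngx_lua code and does not reproduce its API; the function names, the shared configuration, and the request identifiers are hypothetical, and the sketch only shows one coroutine per request, a per-request context, and suspension at an I/O point.

```python
import asyncio


async def handle_request(request_id, shared_config):
    # Per-request context: it exists only for the lifetime of this coroutine,
    # so nothing stored here is visible to other requests.
    context = {"request_id": request_id, "prefix": shared_config["prefix"]}
    # Suspend and hand control back to the event loop; this stands in for a
    # non-blocking I/O call whose completion resumes the coroutine.
    await asyncio.sleep(0)
    context["body"] = f'{context["prefix"]}-{request_id}'
    return context["body"]


async def serve(request_ids):
    # Data shared by all requests; handlers only read it.
    shared_config = {"prefix": "resp"}
    # One coroutine per request; the event loop interleaves them.
    tasks = (handle_request(r, shared_config) for r in request_ids)
    return await asyncio.gather(*tasks)


results = asyncio.run(serve([1, 2, 3]))
print(results)
```

The handler reads as sequential code even though it yields to the event loop in the middle, which is the property the paragraph above attributes to coroutines.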
In some embodiments, the acquisition module 110 may format the acquired log variables, for example, format the log variables into transmittable character strings to obtain formatted log data, and then transmit the log data to the storage module 120.
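As a minimal sketch of formatting log variables into a transmittable string, the Python below renders a set of variables in InfluxDB line protocol. The choice of line protocol, the measurement name `nginx_access`, and the split of variables into tags and fields are assumptions for illustration; the specification only states that the variables are formatted into a transmittable character string.

```python
def escape_tag(value):
    """Escape commas, equals signs and spaces, which are special in tags."""
    text = str(value)
    for char in (",", "=", " "):
        text = text.replace(char, "\\" + char)
    return text


def format_field(value):
    """Render a field value: booleans, integers (i suffix), floats, or quoted strings."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def format_log_line(measurement, tags, fields):
    """Build one line: measurement,tag=value,... field=value,..."""
    tag_part = ",".join(
        f"{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f"{escape_tag(k)}={format_field(v)}" for k, v in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part}"


# Hypothetical values of the Nginx variables $uri, $status, $request_time
# and $remote_addr for one request.
line = format_log_line(
    "nginx_access",
    tags={"uri": "/api/orders", "status": 200},
    fields={"request_time": 0.042, "remote_addr": "10.0.0.7"},
)
print(line)
```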
In some embodiments, the log data may be transmitted to the storage module using the lua-resty-logger-socket module. Log data transmission is executed after the response to the request has been completed, so its impact on the service is small; even if the data push fails, the response to the client is not affected. The lua-resty-logger-socket module is one of the modules in OpenResty. OpenResty is a high-performance Web platform based on Nginx and Lua that integrates a large number of high-quality Lua libraries, third-party modules, and most of their dependencies. It is used to conveniently build dynamic Web applications, Web services, and dynamic gateways that can handle very high concurrency and scale well. By aggregating a variety of well-designed Nginx modules, OpenResty effectively turns Nginx into a powerful general-purpose Web application platform. Web developers and system engineers can thus use the Lua scripting language to invoke the various C and Lua modules supported by Nginx and quickly build high-performance Web application systems capable of handling 10K or even more than 1000K concurrent connections on a single machine. OpenResty lets Web services run directly inside the Nginx service and makes full use of Nginx's non-blocking I/O model, providing consistently high-performance responses not only to HTTP client requests but also to remote backends such as MySQL, PostgreSQL, Memcached, and Redis.
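The property that a failed push does not affect the client response can be sketched as a small buffered sender. The class below is a hypothetical Python illustration, not the API of the logging module named above: it batches lines, flushes when a threshold is reached, and counts dropped lines instead of raising when the transport fails.

```python
class BufferedLogSender:
    """Collect log lines and send them in batches; never raise to the caller."""

    def __init__(self, send, flush_limit=3):
        self._send = send              # callable that takes one payload string
        self._flush_limit = flush_limit
        self._buffer = []
        self.dropped = 0               # lines lost because sending failed

    def log(self, line):
        self._buffer.append(line)
        if len(self._buffer) >= self._flush_limit:
            self.flush()

    def flush(self):
        if not self._buffer:
            return
        batch, self._buffer = self._buffer, []
        try:
            self._send("\n".join(batch))
        except OSError:
            # A failed push is recorded but not propagated, so request
            # handling is unaffected.
            self.dropped += len(batch)


received = []
sender = BufferedLogSender(received.append, flush_limit=2)
for text in ("a", "b", "c"):
    sender.log(text)
sender.flush()


def unreachable_backend(payload):
    raise ConnectionError("storage module unreachable")


failing = BufferedLogSender(unreachable_backend, flush_limit=2)
failing.log("x")
failing.flush()
print(received, failing.dropped)
```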
In some embodiments, before transmitting the log data, the acquisition module 110 may further select, from the log variables, target log variables that meet a service requirement; correspondingly, the target log variables are formatted to obtain the formatted log data. In access log analysis, the service requirements may include: counting PV (the number of times Nginx is accessed in a day, that is, the number of rows in the access log); the top n most active IPs; the top n most frequently visited URLs; the count of each status code for each page; for each of the top n most active IPs, the top n URLs it visits most frequently; for each of the top n most frequently visited URLs, the top n most active IPs; the percentage share of each of the above statistics; and the like.
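Several of the statistics listed above can be sketched with Python's `collections.Counter`. The record layout `(ip, url, status)` and the sample records are hypothetical; they stand in for rows of an access log.

```python
from collections import Counter

# Hypothetical access-log rows: (client IP, URL, status code).
records = [
    ("10.0.0.1", "/index.html", 200),
    ("10.0.0.2", "/index.html", 200),
    ("10.0.0.1", "/api/orders", 500),
    ("10.0.0.1", "/index.html", 200),
    ("10.0.0.3", "/missing", 404),
]


def summarize(rows, n):
    """Compute PV, top-n IPs and URLs, status counts, and per-IP top URLs."""
    pv = len(rows)  # one row per request
    top_ips = Counter(ip for ip, _, _ in rows).most_common(n)
    top_urls = Counter(url for _, url, _ in rows).most_common(n)
    status_counts = Counter(status for _, _, status in rows)
    urls_per_top_ip = {
        ip: Counter(u for i, u, _ in rows if i == ip).most_common(n)
        for ip, _ in top_ips
    }
    ip_share = {ip: hits / pv for ip, hits in top_ips}  # share of all requests
    return {
        "pv": pv,
        "top_ips": top_ips,
        "top_urls": top_urls,
        "status_counts": dict(status_counts),
        "urls_per_top_ip": urls_per_top_ip,
        "ip_share": ip_share,
    }


summary = summarize(records, n=1)
print(summary)
```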
In some embodiments, the storage module 120 may be configured to establish a time tag for the log data, and store the time-tagged log data in a time-series database.
Analysis of the access log is based on the time dimension: the request volume, the fluctuation of requests, comparisons of average response time, and so on are all computed with respect to time. Accordingly, the storage module 120 may store the log data in a time-series database to support data analysis. A Time Series Database (TSDB) is mainly used for processing data that carries a time tag (data that changes in time order), which is also referred to as time-series data.
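What storing time-tagged data buys for later analysis can be sketched with a tiny in-memory store in Python. The class and its methods are hypothetical and far simpler than a real time-series database; the sketch only shows that keeping rows ordered by timestamp makes a time-range query a pair of binary searches.

```python
import bisect


class TimeSeriesStore:
    """Keep rows ordered by timestamp so that range queries are cheap."""

    def __init__(self):
        self._timestamps = []
        self._rows = []

    def write(self, timestamp, row):
        # Insert at the position that keeps the timestamps sorted.
        index = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(index, timestamp)
        self._rows.insert(index, row)

    def query(self, start, end):
        """Return (timestamp, row) pairs with start <= timestamp < end."""
        low = bisect.bisect_left(self._timestamps, start)
        high = bisect.bisect_left(self._timestamps, end)
        return list(zip(self._timestamps[low:high], self._rows[low:high]))


store = TimeSeriesStore()
store.write(120, {"uri": "/a"})   # timestamps are hypothetical epoch seconds
store.write(60, {"uri": "/b"})
store.write(180, {"uri": "/c"})

window = store.query(60, 180)
print(window)
```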
In some embodiments, the time-series database is implemented with InfluxDB. InfluxDB is an open-source distributed time-series, event, and metrics database developed in Go. It is well suited to handling monitoring data, and its rich query functions can greatly simplify data analysis work, for example:
1. the count function returns the number of values of a field, and can be used in access log analysis to calculate the number of requests to a given URI over a period of time;
2. the top function returns the largest value of a field and supports returning the top N values; in access log analysis it can be used to calculate the maximum response time of a given URI;
3. the percentile function returns the value at a given percentile of the sorted field values, and can be used in access log analysis to calculate the tp90 and tp99 of a given URI's response time;
4. the mean function returns the arithmetic mean of a field, and can be used in access log analysis to calculate the mean response time of a given URI;
5. the derivative function returns the rate of change of a field, and can be used in access log analysis to assess the stability of the service.
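To make the meaning of these aggregations concrete, the following Python sketch gives plain equivalents of count, top, mean, and derivative over a list of `(timestamp, value)` points. These are illustrative re-implementations under simplifying assumptions, not the database's own code, and the sample points are hypothetical.

```python
def count(points):
    """Number of values in the series."""
    return len(points)


def top(points, n):
    """The n points with the largest values, largest first."""
    return sorted(points, key=lambda point: point[1], reverse=True)[:n]


def mean(points):
    """Arithmetic mean of the values."""
    return sum(value for _, value in points) / len(points)


def derivative(points, unit=1.0):
    """Rate of change between consecutive points, per `unit` of time."""
    ordered = sorted(points)
    return [
        (t1, (v1 - v0) / (t1 - t0) * unit)
        for (t0, v0), (t1, v1) in zip(ordered, ordered[1:])
    ]


# Hypothetical (timestamp in seconds, response time in ms) for one URI.
points = [(0, 100.0), (10, 120.0), (20, 90.0), (30, 150.0)]

print(count(points), top(points, 2), mean(points), derivative(points))
```

Large swings in the derivative series indicate that response time is changing quickly, which is one way to read the stability of a service.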
In some embodiments, the analysis module 130 may be configured to query the time-series database for the log data within a preset time period, and to store the log data within the preset time period in an analysis database so that the log data in the analysis database can be analyzed. Specifically, most log analysis is computed over the data within a period of time, for example every minute or every 15 minutes. The log data within the preset time period can be obtained with the continuous query function provided by InfluxDB, which runs a query on a schedule and stores the result in the analysis database. For the small number of complex service requirements, scheduled queries can be executed by service code, with the query results stored in the analysis database.
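The effect of a scheduled query that aggregates each time window and stores the result can be sketched in Python as follows. This is a plain analogue for illustration, not the database's continuous query syntax; the function name, the window size, and the sample points are hypothetical.

```python
from collections import defaultdict


def run_windowed_query(points, window_seconds):
    """Group points into fixed time windows per URI and aggregate each group."""
    buckets = defaultdict(list)
    for timestamp, uri, request_time in points:
        window_start = timestamp - timestamp % window_seconds
        buckets[(window_start, uri)].append(request_time)
    # The returned dict stands in for rows written to the analysis database.
    return {
        key: {
            "requests": len(values),
            "mean_request_time": sum(values) / len(values),
        }
        for key, values in sorted(buckets.items())
    }


# Hypothetical points: (timestamp in seconds, URI, request time in seconds).
points = [
    (0, "/a", 0.25),
    (30, "/a", 0.75),
    (59, "/b", 0.5),
    (60, "/a", 0.125),
]

analysis_db = run_windowed_query(points, window_seconds=60)
print(analysis_db)
```

Because each window is reduced to a few numbers, dashboards and reports can read the small analysis table instead of rescanning the raw log data.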
In some embodiments, the analysis module 130 may also display the log data in the analysis database visually. Specifically, the data visualization tool Grafana may be used. Grafana is a cross-platform, open-source metrics analysis and visualization tool that can query the collected data, display it visually, and send timely notifications. With a data visualization tool, metrics and logs can be displayed in different forms, such as heat maps, line graphs, and charts. Alert rules for the most important metrics can be defined visually; Grafana evaluates them continuously and sends a notification through Slack, PagerDuty, or the like when the data reaches a threshold. This provides a data basis for subsequent log analysis and for tracing the causes of abnormal behavior in the service system.
The log analysis system provided by the embodiments of this specification can acquire a log variable representing log content; format the log variable to obtain formatted log data; establish a time tag for the log data and store the time-tagged log data in a time-series database; query the time-series database to obtain the log data within a preset time period; and store the log data within the preset time period in an analysis database so that the log data in the analysis database can be analyzed. The system introduces the ngx_lua module to collect logs, stores the collected logs centrally in a time-series database, and analyzes the logs through the continuous query function provided by the time-series database. It supports collection of access logs from an Nginx cluster, provides unified management and computation of the data, meets the log collection requirement with high performance and low overhead, simplifies the development and testing of data analysis service code, and improves the efficiency of log analysis.
Please refer to Fig. 2. The embodiments of this specification also provide a log analysis method. In the embodiments of this specification, the method may be performed by an electronic device with logical operation capability, and the electronic device may be a server. The server may be an electronic device with a certain arithmetic processing capability, and may have a network communication unit, a processor, a memory, and so on. The server is not limited to a physical electronic device; it may also be software running in an electronic device. The server may also be a distributed server, that is, a system in which multiple processors, memories, network communication modules, and the like operate in coordination. Alternatively, the server may be a server cluster formed by several servers. The method may include the following steps.
S210: Acquire a log variable representing log content.
In some embodiments, the server may use the third-party Nginx module lua-nginx-module to collect the log variables, thereby obtaining the log variables representing the log content.
Specifically, the content of the Nginx access log is composed of Nginx variables, and the variables corresponding to the access log content are the log variables in the embodiments of this specification. The lua-nginx-module can obtain the log variables directly from memory through the Lua API for use in the collection processing flow. The lua-nginx-module is an Nginx HTTP module that embeds a Lua interpreter into Nginx in order to parse and execute server-side scripts written in Lua. In principle, ngx_lua embeds Lua into Nginx so that Nginx can execute Lua scripts and handle requests with high concurrency and without blocking. Lua has built-in coroutines, which allow asynchronous callbacks to be written in a sequential calling style. ngx_lua delegates all I/O operations performed in Lua to the Nginx event model, which makes the calls non-blocking. The developer can write programs in a serial style: when a blocking I/O operation is reached, ngx_lua automatically suspends execution and saves the context, then delegates the I/O operation to the Nginx event handling mechanism; after the I/O operation completes, ngx_lua restores the context and the program continues, all transparently to the user program. Each Nginx worker process holds one Lua interpreter or LuaJIT instance, which is shared by all requests handled by that worker. The context of each request is separated by Lua's lightweight coroutines to keep the requests independent of one another. ngx_lua adopts a "one-coroutine-per-request" processing model: for each user request, ngx_lua starts a coroutine that executes the user code to process the request, and the coroutine is destroyed when request processing is complete. Each coroutine has an independent global environment (variable space) that inherits from the globally shared, read-only "common data".
Therefore, any variable that user code injects into the global space does not affect the processing of other requests, and these variables are released after request processing is complete. This ensures that all user code runs in a "sandbox" that has the same life cycle as the request. Thanks to Lua coroutines, ngx_lua requires very little memory even when handling 10000 concurrent requests. According to tests, ngx_lua needs only 2KB of memory to process each request, and even less if LuaJIT is used. ngx_lua is therefore well suited to implementing scalable, highly concurrent services.
S220: Format the log variable to obtain formatted log data.
In some embodiments, the server may format the obtained log variable, for example, format the log variable into a transmittable character string, so as to obtain formatted log data.
In some embodiments, before the log variables are formatted, a target log variable meeting a service requirement may be obtained from the log variables according to the service requirement; correspondingly, the target log variable is formatted to obtain the formatted log data. In the analysis of the access log, the service requirement may include: counting the PV (the number of times nginx is accessed on the day, namely the number of lines in the access log); the top n most active IPs; the top n most frequently visited URLs; the count of status codes for each web page; for each of the top n most active IPs, the top n URLs it visits most frequently; for each of the top n most frequently visited URLs, the top n IPs that access it most actively; the proportion that each of the above statistics accounts for; and the like.
S230: establishing a time tag for the log data, and storing the log data with the time tag into a time series database.
In some embodiments, the server may establish a time tag for the log data and transmit the log data to a time series database for storage using the lua-resty-logger-socket module, which is a module in OpenResty. OpenResty is a high-performance Web platform based on Nginx and Lua that integrates a large number of high-quality Lua libraries, third-party modules, and most of their dependencies. It is used to conveniently build dynamic Web applications, Web services, and dynamic gateways that can handle ultra-high concurrency and are highly scalable. By aggregating various well-designed Nginx modules, OpenResty effectively turns Nginx into a powerful general-purpose Web application platform. Thus, Web developers and system engineers can use the Lua scripting language to invoke the various C and Lua modules supported by Nginx and quickly construct a high-performance Web application system capable of handling 10K, or even more than 1000K, concurrent connections on a single machine. OpenResty enables Web services to run directly inside the Nginx service, making full use of the non-blocking I/O model of Nginx to deliver consistently high-performance responses not only to HTTP client requests but also to remote back ends such as MySQL, PostgreSQL, Memcached, and Redis.
The analysis of the access log is based on the time dimension: the request volume, the fluctuation range of requests, the comparison of average response times, and the like are all computed over time. Thus, a time series database may be used to store the log data to better support data analysis. A Time Series Database (TSDB) is mainly used for processing time-tagged data (data that changes in time order, i.e., time-sequenced data), which is also referred to as time-series data.
In some embodiments, the time series database may be implemented by InfluxDB. InfluxDB is an open-source distributed time-series, event, and metrics database developed in the Go language and is suitable for processing monitoring data. The rich query functions it provides can greatly simplify data analysis work, for example:
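The time-tagging and transmission step can be illustrated with a small buffered sender. This sketch does not reproduce the interface of the module named above; the class `BufferedLogSender`, its `max_buffered` parameter, and the injected `send` callable are hypothetical and only show the general idea of appending a timestamp to each formatted record and flushing records to the database in batches.

```python
import time
from typing import Callable, List, Optional


class BufferedLogSender:
    def __init__(self, send: Callable[[bytes], None], max_buffered: int = 3):
        self._send = send                  # e.g. a socket's sendall method
        self._max_buffered = max_buffered  # flush once this many lines are queued
        self._buffer: List[str] = []

    def log(self, line: str, ts_ns: Optional[int] = None) -> None:
        # Establish the time tag: nanoseconds since the Unix epoch.
        if ts_ns is None:
            ts_ns = time.time_ns()
        self._buffer.append(f"{line} {ts_ns}")
        if len(self._buffer) >= self._max_buffered:
            self.flush()

    def flush(self) -> None:
        if not self._buffer:
            return
        payload = "\n".join(self._buffer).encode("utf-8")
        self._buffer.clear()
        self._send(payload)


sent: List[bytes] = []  # stands in for the network connection
sender = BufferedLogSender(sent.append, max_buffered=2)
sender.log("m f=1i", 1000)
sender.log("m f=2i", 2000)  # buffer is full, so the first batch is sent
sender.log("m f=3i", 3000)
sender.flush()              # send whatever is still buffered
```

Batching in this way trades a small delay for fewer writes to the time series database, which is one common reason for buffering log records before transmission.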
1. the function count returns the number of non-null values of a field, and can be used in access log analysis to calculate the number of requests for a certain URI within a period of time;
2. the function top returns the largest values of a field (the top N is supported), and can be used in access log analysis to calculate the maximum response time of a certain URI;
3. the function percentile returns a percentile value of a sorted field, and can be used in access log analysis to calculate the TP90 and TP99 values of the response time of a certain URI;
4. the function mean returns the arithmetic mean of a field, and can be used in access log analysis to calculate the mean response time of a certain URI;
5. the function derivative returns the rate of change of a field, and can be used in access log analysis to assess the stability of the service.
S240: querying the time series database to obtain log data within a preset time period.
S250: storing the log data within the preset time period into an analysis database, so as to analyze the log data in the analysis database.
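To make the meaning of the five functions concrete, the following sketch computes rough in-memory analogues over a list of response times. These are not the database's own implementations: the percentile here uses the nearest-rank definition, which is only one common convention and may differ from the rule the database applies, and the sample data is invented for illustration.

```python
import math
from statistics import mean
from typing import List, Sequence, Tuple


def count(values: Sequence[float]) -> int:
    return len(values)


def top(values: Sequence[float], n: int = 1) -> List[float]:
    # The n largest values, largest first.
    return sorted(values, reverse=True)[:n]


def percentile(values: Sequence[float], p: int) -> float:
    # Nearest-rank percentile: an actual sample value, never interpolated.
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def derivative(points: Sequence[Tuple[float, float]],
               unit: float = 1.0) -> List[float]:
    # Rate of change between consecutive (time_seconds, value) points,
    # expressed per `unit` seconds.
    return [
        (v2 - v1) / (t2 - t1) * unit
        for (t1, v1), (t2, v2) in zip(points, points[1:])
    ]


response_ms = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
requests_per_minute = [(0, 10), (60, 40), (120, 40)]

stats = {
    "requests": count(response_ms),
    "slowest": top(response_ms, 2),
    "tp90": percentile(response_ms, 90),
    "tp99": percentile(response_ms, 99),
    "mean": mean(response_ms),
    "rate": derivative(requests_per_minute, unit=60),
}
```

A large swing in the `rate` series indicates that the request volume is changing quickly, which is the sense in which a rate of change can serve as a stability indicator.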
In some embodiments, the server may query the time series database to obtain log data within a preset time period, and store that log data into the analysis database so that it can be analyzed there. Specifically, most log analysis is computed over data within a period of time, for example every minute or every 15 minutes. The log data within the preset time period can be obtained using the continuous query function provided by InfluxDB, which executes a query on a schedule and stores the result in the analysis database. For a small number of complex service requirements, scheduled queries can be executed by service code, with the query results stored in the analysis database.
In some embodiments, the server may further visually present the log data in the analysis database. Specifically, the data visualization tool Grafana may be used to visually display the log data. Grafana is a cross-platform, open-source metrics analysis and visualization tool that can query collected data, display it visually, and send timely notifications. By using a data visualization tool to display the log data, metrics and logs can be visualized in different ways, such as heat maps, line graphs, and charts, and alert rules for the most important metrics can be defined visually; Grafana continuously evaluates the rules and sends notifications, for example through Slack or PagerDuty, when the data reaches a threshold. This provides a data basis for subsequent log analysis and for tracing the causes of abnormal problems in the service system.
The log analysis method provided by the embodiments of the present specification can acquire a log variable representing the content of the log; format the log variable to obtain formatted log data; establish a time tag for the log data, and store the log data with the time tag into a time series database; query the time series database to obtain log data within a preset time period; and store the log data within the preset time period into an analysis database so as to analyze the log data in the analysis database. According to the method provided by the embodiments of the present specification, the ngx_lua module is introduced to collect the logs, the collected logs are stored centrally in the time series database, and the analysis of the logs is realized through the continuous query function provided by the time series database. The method supports the collection of access logs from clustered Nginx instances, realizes unified management and calculation of data, meets the log data collection requirement with high performance and low overhead, simplifies the development and testing of data analysis service code, and improves the efficiency of log analysis.
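The scheduled aggregation described above can be sketched in service code as follows. This is a simplified stand-in for a continuous query rather than the database feature itself: the in-memory list of points plays the role of the time series database, an in-memory SQLite database plays the role of the analysis database, and the table name `uri_stats` and the function names are hypothetical.

```python
import sqlite3
from collections import defaultdict
from typing import Iterable, List, Tuple

Point = Tuple[int, str, int]            # (unix_seconds, uri, request_time_ms)
WindowRow = Tuple[int, str, int, float]  # (window_start, uri, requests, mean_ms)


def aggregate_windows(points: Iterable[Point], window_s: int = 60) -> List[WindowRow]:
    # Group points into fixed windows per URI, then compute the request
    # count and the mean response time of each group.
    buckets = defaultdict(list)
    for ts, uri, request_ms in points:
        window_start = ts - ts % window_s
        buckets[(window_start, uri)].append(request_ms)
    return [
        (start, uri, len(vals), sum(vals) / len(vals))
        for (start, uri), vals in sorted(buckets.items())
    ]


def store_results(conn: sqlite3.Connection, rows: List[WindowRow]) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS uri_stats ("
        "window_start INTEGER, uri TEXT, requests INTEGER, mean_ms REAL, "
        "PRIMARY KEY (window_start, uri))"
    )
    # INSERT OR REPLACE keeps the job idempotent if a window is recomputed.
    conn.executemany("INSERT OR REPLACE INTO uri_stats VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
points = [(0, "/a", 100), (30, "/a", 300), (59, "/b", 200), (60, "/a", 500)]
store_results(conn, aggregate_windows(points, window_s=60))
stored = conn.execute(
    "SELECT window_start, uri, requests, mean_ms FROM uri_stats "
    "ORDER BY window_start, uri"
).fetchall()
```

In a deployment, a function like `aggregate_windows` would be run on a timer (for example once per minute) over only the most recent window, so that the analysis database holds compact, pre-computed results.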
Fig. 3 is a functional structure diagram of an electronic device according to an embodiment of the present disclosure, where the electronic device may include a memory and a processor.
In some embodiments, the memory may be used to store the computer programs and/or modules, and the processor may implement various functions of the log analysis method by running or executing the computer programs and/or modules stored in the memory and calling data stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required by at least one function, and the data storage area may store data created according to the use of the user terminal. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, an internal memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor. The processor may execute the computer instructions to perform the steps of: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time series database; querying the time series database to obtain log data within a preset time period; and storing the log data within the preset time period into an analysis database so as to analyze the log data in the analysis database.
In the embodiments of the present description, the functions and effects specifically realized by the electronic device may be explained in comparison with other embodiments, and are not described herein again.
Fig. 4 is a functional structure diagram of a log analysis apparatus according to an embodiment of the present disclosure, where the apparatus may specifically include the following structural modules.
An obtaining module 410, configured to obtain a log variable representing log content;
the formatting module 420 is configured to format the log variable to obtain formatted log data;
the storage module 430 is configured to establish a time tag for the log data, and store the log data with the time tag into a time sequence database;
the query module 440 is configured to query log data within a preset time period based on the time sequence database;
the analysis module 450 is configured to store the log data in the preset time period into an analysis database, so as to analyze the log data in the analysis database.
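Purely as an illustrative sketch, the five modules listed above can be wired together as one object. The class name `LogAnalysisApparatus`, the injected callables, and the use of plain Python lists for the time series database and the analysis database are assumptions made for brevity, not a description of the claimed apparatus.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Tuple

Row = Tuple[int, str]  # (time tag, formatted log data)


@dataclass
class LogAnalysisApparatus:
    obtain: Callable[[], Iterable[Dict[str, str]]]     # obtaining module 410
    fmt: Callable[[Dict[str, str]], str]               # formatting module 420
    clock: Callable[[], int]                           # source of time tags
    tsdb: List[Row] = field(default_factory=list)      # target of storage module 430
    analysis_db: List[Row] = field(default_factory=list)  # target of analysis module 450

    def collect(self) -> None:
        # Modules 410-430: obtain variables, format them, time-tag and store.
        for variables in self.obtain():
            self.tsdb.append((self.clock(), self.fmt(variables)))

    def query(self, start: int, end: int) -> List[Row]:
        # Query module 440: log data whose time tag lies in [start, end).
        return [(ts, data) for ts, data in self.tsdb if start <= ts < end]

    def analyze(self, start: int, end: int) -> int:
        # Analysis module 450: copy the queried data into the analysis database.
        rows = self.query(start, end)
        self.analysis_db.extend(rows)
        return len(rows)


ticks = iter([10, 20, 70])  # deterministic time tags for the example
apparatus = LogAnalysisApparatus(
    obtain=lambda: [{"uri": "/a"}, {"uri": "/b"}, {"uri": "/c"}],
    fmt=lambda v: f"uri={v['uri']}",
    clock=lambda: next(ticks),
)
apparatus.collect()
moved = apparatus.analyze(0, 60)
```

Only the two records time-tagged inside the preset period of 0 to 60 are copied into the analysis database; the third remains in the time series store for a later period.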
Embodiments of the present specification further provide a computer-readable storage medium for a log analysis method, where the computer-readable storage medium stores computer program instructions that, when executed, implement: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time series database; querying the time series database to obtain log data within a preset time period; and storing the log data within the preset time period into an analysis database so as to analyze the log data in the analysis database.
In the embodiments of the present specification, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Cache (Cache), a Hard Disk Drive (HDD), or a Memory Card (Memory Card). The memory may be used for storing the computer programs and/or modules, and the memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created according to the use of the user terminal, and the like. In addition, the memory may include high speed random access memory, and may also include non-volatile memory. In the embodiments of the present description, the functions and effects specifically realized by the program instructions stored in the computer-readable storage medium may be explained in contrast to other embodiments, and are not described herein again.
It should be noted that the log analysis method, the log analysis apparatus, and the electronic device provided in the embodiments of the present specification relate to the field of computer technologies, and may be applied in the financial field to process data generated during computer use, and may also be applied in any field other than the financial field.
It should be noted that, in the present specification, each embodiment is described in a progressive manner; the same or similar parts of the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, since the apparatus embodiment and the device embodiment are substantially similar to the method embodiment, their description is relatively simple, and reference may be made to the corresponding parts of the description of the method embodiment.
After reading this specification, persons skilled in the art will appreciate that any combination of some or all of the embodiments set forth herein can be conceived without inventive effort and is within the scope of the disclosure and protection of this specification.
In the 1990s, an improvement to a technology could be clearly classified as either a hardware improvement (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or a software improvement (an improvement to a method flow). However, as technology has advanced, many of today's method flow improvements can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Thus, it cannot be said that an improvement to a method flow cannot be implemented by a hardware entity module. For example, a Programmable Logic Device (PLD), such as a Field Programmable Gate Array (FPGA), is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD through his or her own programming, without asking a chip manufacturer to design and fabricate a dedicated integrated circuit chip. Furthermore, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can be readily obtained simply by performing a little logic programming of the method flow in one of the hardware description languages described above and programming it into an integrated circuit.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
From the above description of the embodiments, it is clear to those skilled in the art that the present specification can be implemented by software plus a necessary general hardware platform. Based on such understanding, the technical solutions of the present specification may be essentially or partially implemented in the form of software products, which may be stored in a storage medium, such as ROM/RAM, magnetic disk, optical disk, etc., and include instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments of the present specification.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The description is operational with numerous general purpose or special purpose computing system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet-type devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like.
This description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
While the specification has been described with examples, those skilled in the art will appreciate that there are numerous variations and permutations of the specification that do not depart from the spirit of the specification, and it is intended that the appended claims include such variations and modifications that do not depart from the spirit of the specification.

Claims (15)

1. A log analysis system, characterized by comprising an acquisition module, a storage module, and an analysis module;
the acquisition module is used for acquiring a log variable representing the content of the log, formatting the log variable to obtain formatted log data, and transmitting the log data to the storage module;
the storage module is used for establishing a time tag for the log data and storing the log data with the time tag into a time sequence database;
the analysis module is used for querying the time series database to obtain log data within a preset time period, and storing the log data within the preset time period into an analysis database so as to analyze the log data in the analysis database.
2. The system of claim 1, wherein the log data is transmitted to the storage module using a lua-resty-logger-socket module.
3. The system of claim 1, wherein the analysis module is further configured to visually present log data in the analysis database.
4. The system according to claim 3, wherein the log data is visually presented using the data visualization tool Grafana.
5. A method of log analysis, the method comprising:
acquiring a log variable representing the content of the log;
formatting the log variable to obtain formatted log data;
establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database;
based on the time sequence database, log data in a preset time period are obtained through query;
and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
6. The method of claim 5, wherein the log variables characterizing log content are obtained through the Lua API using lua-nginx-module, a third-party Nginx module.
7. The method of claim 5, further comprising:
according to the service requirement, acquiring a target log variable meeting the service requirement from the log variables;
correspondingly, the target log variable is formatted, and formatted log data are obtained.
8. The method of claim 7, wherein the service requirement includes at least one of: statistics of the number of times nginx is accessed, the top n most active IPs, the top n most frequently accessed URLs, and status code data of each web page.
9. The method of claim 5, wherein the time series database is implemented by InfluxDB.
10. The method of claim 8, wherein querying log data for a predetermined period of time based on the time-series database comprises: querying to obtain log data within a preset time period by using a continuous query function provided by InfluxDB.
11. The method of claim 5, further comprising:
and visually displaying the log data in the analysis database.
12. The method according to claim 11, wherein the log data is visualized using the data visualization tool Grafana.
13. An apparatus for log analysis, the apparatus comprising:
the acquisition module is used for acquiring the log variables representing the log contents;
the formatting module is used for formatting the log variable to obtain formatted log data;
the storage module is used for establishing a time tag for the log data and storing the log data with the time tag into a time sequence database;
the query module is used for querying and obtaining log data in a preset time period based on the time sequence database;
and the analysis module is used for storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
14. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database; based on the time sequence database, log data in a preset time period are obtained through query; and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
15. A computer readable storage medium having computer instructions stored thereon that when executed perform: acquiring a log variable representing the content of the log; formatting the log variable to obtain formatted log data; establishing a time tag for the log data, and storing the log data with the time tag into a time sequence database; based on the time sequence database, log data in a preset time period are obtained through query; and storing the log data in the preset time period into an analysis database so as to analyze the log data in the analysis database.
CN202011369258.0A 2020-11-30 2020-11-30 Log analysis system, method and device Pending CN112486789A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011369258.0A CN112486789A (en) 2020-11-30 2020-11-30 Log analysis system, method and device

Publications (1)

Publication Number Publication Date
CN112486789A true CN112486789A (en) 2021-03-12

Family

ID=74937001

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011369258.0A Pending CN112486789A (en) 2020-11-30 2020-11-30 Log analysis system, method and device

Country Status (1)

Country Link
CN (1) CN112486789A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109885537A (en) * 2019-02-22 2019-06-14 成都信息工程大学 A kind of journal displaying method, system and computer readable storage medium
CN110110178A (en) * 2019-05-10 2019-08-09 南京硅基智能科技有限公司 Virtual mobile phone log generation method based on cloud
CN110351280A (en) * 2019-07-15 2019-10-18 杭州安恒信息技术股份有限公司 A kind of method, system, equipment and readable storage medium storing program for executing for threatening information to extract
CN110515810A (en) * 2019-09-03 2019-11-29 浪潮软件集团有限公司 A kind of data monitoring log collection and retroactive method based on time series database
CN111400156A (en) * 2020-03-16 2020-07-10 中国联合网络通信集团有限公司 Log analysis method and device
CN111488261A (en) * 2020-03-11 2020-08-04 北京健康之家科技有限公司 User behavior analysis system, method, storage medium and computing device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
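宋金玉 (Song Jinyu) et al.: "A Graphical Tool for Viewing Linux System Audit Logs", Computer Engineering (《计算机工程》) *
陈飞 (Chen Fei) et al.: "Design and Implementation of a Flume-Based Distributed Log Collection and Analysis System", Software (《软件》) *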

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113312376A (en) * 2021-05-21 2021-08-27 福建天泉教育科技有限公司 Method and terminal for real-time processing and analysis of Nginx logs
CN113342623A (en) * 2021-05-28 2021-09-03 福建福诺移动通信技术有限公司 Visual early warning system and method based on dynamic threshold method
CN113342623B (en) * 2021-05-28 2022-05-17 福建福诺移动通信技术有限公司 Visual early warning system and method based on dynamic threshold method
CN113612816A (en) * 2021-07-06 2021-11-05 深圳市酷开网络科技股份有限公司 Data acquisition method, system, terminal and computer readable storage medium
CN113608981A (en) * 2021-07-27 2021-11-05 远景智能国际私人投资有限公司 Time sequence database testing method and device, computer equipment and storage medium
CN113608981B (en) * 2021-07-27 2024-01-05 远景智能国际私人投资有限公司 Time sequence database test method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210312