CN117667636A - Log analysis method, system, equipment and medium based on generalized linear model
- Publication number
- CN117667636A (application CN202311798000.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- module
- log
- analysis
- generalized linear
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
- G06F11/3476—Data logging
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2237—Vectors, bitmaps or matrices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/10—Pre-processing; Data cleansing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The disclosure provides a log analysis method based on a generalized linear model, comprising the following steps: the data collection module acquires a plurality of original log data; the data preprocessing module cleans and preprocesses the original log data to generate trainable data; the data storage module stores a plurality of original log data; the training data storage module stores trainable data by using a vector database; the training module trains the trainable data by using the generalized linear model to generate a target model; the generalized linear model analysis module uses the target model to conduct log real-time analysis to generate an analysis result. The disclosure also provides a log analysis system, electronic equipment and storage medium based on the generalized linear model.
Description
Technical Field
The disclosure may be used in the financial field or the log processing field, and more particularly, to a log analysis method, system, device, and medium based on a generalized linear model.
Background
Currently, most log analysis systems employ preset monitoring indexes, such as response time, error rate, etc., but have certain limitations as follows:
In the first aspect, conventional log analysis typically focuses on a single application without examining the entire service link, and therefore cannot fully analyze and predict problems.
In the second aspect, the analysis indexes are often set manually and cannot accurately reflect the actual business problem. For example, a service request may pass through multiple applications, so log monitoring of a single application may fail to reveal global problems.
In the third aspect, when an existing log analysis system locates abnormal conditions, manual intervention is often required, which is inefficient and error-prone.
Disclosure of Invention
In view of the above, the present disclosure provides a log analysis method, system, apparatus, and storage medium based on a generalized linear model (GLM). The method provides comprehensive, accurate and real-time analysis of logs, supports dynamic learning, early warning and optimization suggestions, and improves the running efficiency and stability of the log analysis system.
According to a first aspect of the present disclosure, there is provided a log analysis method based on a GLM model, including: the data collection module acquires a plurality of original log data; the data preprocessing module cleans and preprocesses the original log data to generate trainable data; the data storage module stores a plurality of original log data; the training data storage module stores trainable data by using a vector database; the training module trains the trainable data by using the GLM to generate a target model; and the GLM analysis module uses the target model to carry out log real-time analysis to generate an analysis result.
According to an embodiment of the present disclosure, the GLM analysis module uses the target model to perform real-time log analysis to generate an analysis result, including: the authority management module sets the authority which can be accessed by the user side; the user interaction module receives the execution requirement of a user side; and the report generation module acquires an analysis result of the GLM analysis module according to the execution requirement of the user side.
According to an embodiment of the present disclosure, the raw log data includes application raw log data, monitoring platform raw log data, and application portrait platform raw log data.
According to an embodiment of the present disclosure, the data collection module includes an application log receiving sub-module, a monitoring platform log receiving sub-module, and an application portrait platform log receiving sub-module, and the data collection module obtains a plurality of original log data, including: the application log receiving submodule acquires application original log data and stores the application original log data in the data storage module; the monitoring platform log receiving sub-module acquires original log data of the monitoring platform and stores the original log data in the data storage module; the application portrait platform log receiving submodule acquires original log data of the application portrait platform and stores the original log data in the data storage module.
According to an embodiment of the present disclosure, a data preprocessing module cleans and preprocesses raw log data to generate trainable data, including: the data cleaning sub-module cleans the original log data acquired by the data acquisition sub-module and transmits the original log data to the data conversion sub-module; the data conversion sub-module preprocesses the cleaned original log data to obtain trainable data and stores the trainable data in a vector database of the training data storage module.
According to an embodiment of the present disclosure, a training module trains trainable data using a GLM model, generating a target model, including: selecting a GLM model and carrying out initialization processing to obtain an intermediate model; and training the intermediate model according to the trainable data in the vector database by an optimization algorithm to obtain a target model.
According to an embodiment of the present disclosure, the training module trains the trainable data using the GLM model to generate a target model, further comprising: verifying the target model by using the test data to obtain a verification result; and evaluating the target model according to the verification result, and storing the target model.
According to an embodiment of the present disclosure, the method further comprises: the data collection module acquires a plurality of original log data according to the execution requirements received by the user interaction module.
A second aspect of the present disclosure provides a log analysis system based on a GLM model, including a data collection module, a data preprocessing module, a data storage module, a training data storage module, a training module, and a GLM analysis module, wherein: the data collection module is used for obtaining a plurality of original log data; the data preprocessing module is used for cleaning and preprocessing the original log data to generate trainable data; the data storage module is used for storing a plurality of original log data; the training data storage module stores trainable data by using a vector database; the training module trains the trainable data by using the GLM model to generate a target model; the GLM analysis module is used for performing real-time log analysis using the target model to generate an analysis result.
According to the embodiment of the disclosure, the data collection module adopts a distributed structure and comprises an application log receiving sub-module, a monitoring platform log receiving sub-module and an application portrait platform log receiving sub-module, wherein: the application log receiving sub-module is used for acquiring application original log data; the monitoring platform log receiving sub-module is used for acquiring original log data of the monitoring platform; the application portrait platform log receiving sub-module is used for obtaining original log data of the application portrait platform.
According to an embodiment of the disclosure, the data preprocessing module includes a data acquisition sub-module, a data cleaning sub-module, and a data conversion sub-module, wherein: the data acquisition sub-module is used for acquiring original log data; the data cleaning sub-module is used for cleaning the original log data and transmitting the cleaned data to the data conversion sub-module; the data conversion sub-module is used for preprocessing the cleaned original log data.
A third aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a storage device for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the GLM model based log analysis method described above.
A fourth aspect of the present disclosure provides a computer-readable storage medium having stored thereon executable instructions that, when executed by a processor, cause the processor to perform the above GLM model-based log analysis method.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a GLM model-based log analysis method according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flow chart of a GLM model-based log analysis method in accordance with an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of generating trainable data in accordance with an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow chart of generating a target model according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a flow chart of generating analysis results according to an embodiment of the disclosure;
FIG. 6 schematically illustrates a block diagram of a GLM model-based log analysis system, in accordance with an embodiment of the present disclosure;
FIG. 7 schematically illustrates a block diagram of an electronic device for GLM model-based log analysis according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where expressions like "at least one of A, B and C, etc." are used, the expressions should generally be interpreted in accordance with the meaning as commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, C together, etc.).
Some of the block diagrams and/or flowchart illustrations are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, when executed by the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart. The techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer-readable storage medium having stored thereon instructions that, when executed by a processor, cause the processor to perform the methods of this disclosure.
In the technical solution of the present disclosure, the user information involved (including, but not limited to, user personal information, user image information, and user equipment information such as location information) and the data involved (including, but not limited to, data for analysis, stored data, and displayed data) are information and data authorized by the user or sufficiently authorized by each party. The collection, storage, use, processing, transmission, provision, disclosure, and application of the related data all comply with the relevant laws, regulations and standards of the relevant countries and regions; necessary security measures are taken; public order is not prejudiced; and corresponding operation entries are provided for the user to grant or refuse authorization.
The embodiment of the disclosure provides a log analysis method based on a GLM model, which comprises the following steps: the data collection module acquires a plurality of original log data; the data preprocessing module cleans and preprocesses the original log data to generate trainable data; the data storage module stores a plurality of original log data; the training data storage module stores trainable data by using a vector database; the training module trains the trainable data by using the GLM (generalized linear model) to generate a target model; and the GLM analysis module uses the target model to perform real-time log analysis to generate an analysis result.
According to the embodiment of the disclosure, the log analysis method based on the GLM large model can improve the efficiency of locating problems: the log data of each application is analyzed automatically and in real time to quickly find possible problem sources, which saves human resources and reduces the time required to locate a problem. Potential problems can be found in advance: predictive analysis identifies possible problems early, so that precautionary measures can be taken and potential risks reduced. Service operation efficiency can be optimized: analyzing the log data reveals bottlenecks and optimization points in service operation and yields data-driven optimization suggestions. User experience can be improved: shortening the time during which problems affect an application and resolving potential problems promptly improves the service quality of the enterprise and, in turn, the user experience. Data decision support can be enhanced: the real-time reports and visual dashboards of the system provide key business insight for business decisions.
Fig. 1 schematically illustrates an application scenario diagram of a GLM model-based log analysis method according to an embodiment of the present disclosure. It should be noted that fig. 1 illustrates only an example of an application scenario in which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, but it does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments, or scenarios.
As shown in fig. 1, an application scenario 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as shopping class applications, web browser applications, search class applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only) may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the log analysis method based on the GLM model provided in the embodiments of the present disclosure may be generally performed by the server 105. Accordingly, the GLM model-based log analysis system provided by the embodiments of the present disclosure may be generally provided in the server 105. The GLM model-based log analysis method provided by the embodiments of the present disclosure may also be performed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the GLM model-based log analysis system provided by the embodiments of the present disclosure may also be provided in a server or server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The GLM model-based log analysis method of the disclosed embodiment will be described in detail below, based on the scenario described in fig. 1 and with reference to figs. 2 to 5.
Fig. 2 schematically shows a flowchart of a GLM model-based log analysis method according to an embodiment of the present disclosure.
In operation S210, the data collection module acquires a plurality of raw log data.
In operation S220, the data preprocessing module cleans and preprocesses the original log data to generate trainable data.
In operation S230, the data storage module stores a plurality of original log data.
In operation S240, the training data storage module stores trainable data using the vector database.
In operation S250, the training module trains the trainable data using the GLM to generate a target model.
In operation S260, the GLM analysis module performs log real-time analysis using the target model to generate an analysis result.
Specifically, the data collection module adopts a distributed system to process high-concurrency input original log data, wherein the data collection module comprises an application log receiving sub-module, a monitoring platform log receiving sub-module and an application portrait platform log receiving sub-module.
It should be noted that, the original log data includes application original log data, monitoring platform original log data and application portrait platform original log data.
The application log receiving submodule acquires application original log data and stores the application original log data in the data storage module. The monitoring platform log receiving sub-module acquires the original log data of the monitoring platform and stores the original log data in the data storage module. The application portrait platform log receiving submodule acquires original log data of the application portrait platform and stores the original log data in the data storage module.
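The relationship between the three receiving sub-modules and the data storage module can be sketched as follows. This is a minimal in-memory illustration only: the class names, the source labels in `SOURCES`, and the sample log lines are hypothetical, and the disclosure does not specify record formats or a storage backend.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical source labels; the disclosure names the three log sources
# but does not define identifiers for them.
SOURCES = ("application", "monitoring_platform", "application_portrait_platform")


@dataclass
class DataStorageModule:
    """In-memory stand-in for the data storage module (raw logs only)."""
    raw_logs: Dict[str, List[str]] = field(
        default_factory=lambda: {source: [] for source in SOURCES}
    )

    def store(self, source: str, line: str) -> None:
        self.raw_logs[source].append(line)


@dataclass
class LogReceivingSubmodule:
    """One receiving sub-module: acquires raw lines for a single source."""
    source: str
    storage: DataStorageModule

    def receive(self, lines: List[str]) -> int:
        # Store every received line unchanged; cleaning happens later.
        for line in lines:
            self.storage.store(self.source, line)
        return len(lines)


@dataclass
class DataCollectionModule:
    """Owns one receiving sub-module per log source."""
    storage: DataStorageModule
    receivers: Dict[str, LogReceivingSubmodule] = field(init=False)

    def __post_init__(self) -> None:
        self.receivers = {s: LogReceivingSubmodule(s, self.storage) for s in SOURCES}

    def collect(self, source: str, lines: List[str]) -> int:
        return self.receivers[source].receive(lines)


storage = DataStorageModule()
collector = DataCollectionModule(storage)
collector.collect("application", ["2024-01-01 12:00:00 ERROR payment timeout"])
collector.collect("monitoring_platform", ["cpu=0.93 host=node-1"])
```

In a distributed deployment each receiver would presumably run as its own service behind a queue; the single-process version above only shows the data flow.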
After the original log data is obtained, the data preprocessing module cleans and preprocesses it. The training data storage module stores the preprocessed trainable data in the vector database, which facilitates efficient data retrieval and similarity comparison. The vector database also supports distributed storage.
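To illustrate what "similarity comparison" over stored vectors means, the sketch below implements a brute-force cosine-similarity lookup. It is an assumption-laden toy: `TinyVectorStore`, the keys, and the three-dimensional vectors are hypothetical, and a real vector database would use an approximate index rather than a linear scan.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical stand-in for the vector database of the training data storage module."""

    def __init__(self) -> None:
        self._items: Dict[str, Vector] = {}

    def add(self, key: str, vector: Vector) -> None:
        self._items[key] = vector

    def most_similar(self, query: Vector, k: int = 1) -> List[Tuple[str, float]]:
        # Linear scan: score every stored vector, then keep the top k.
        scored = [(key, cosine_similarity(query, vec)) for key, vec in self._items.items()]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = TinyVectorStore()
store.add("timeout_error", [1.0, 0.0, 1.0])
store.add("normal_request", [0.0, 1.0, 0.0])
nearest = store.most_similar([0.9, 0.1, 0.8], k=1)
```

Here `nearest` holds the stored entry whose direction is closest to the query vector, which is the kind of lookup that lets a new log record be compared against previously seen ones.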
The training module trains on the trainable data in the vector database using the GLM large model and periodically updates the model to generate the target model. A distributed training strategy may be adopted to improve training speed.
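For reference, a generalized linear model in the standard statistical sense relates the expected response to a linear combination of the features through a link function g. The disclosure does not state which link function or response distribution is used; the second expression below assumes the logistic (logit-link) case, which suits a binary "anomalous or not" label, and the symbols x, y, β and β₀ are introduced here for illustration.

```latex
g\bigl(\mathbb{E}[y \mid \mathbf{x}]\bigr) = \beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x},
\qquad
P(y = 1 \mid \mathbf{x}) = \frac{1}{1 + \exp\bigl(-(\beta_0 + \boldsymbol{\beta}^{\top}\mathbf{x})\bigr)}
```

In this reading, x is the feature vector derived from a log record, and training estimates β₀ and β from the trainable data.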
The GLM analysis module uses the target model to analyze the new log data in real time, predicts possible problems, hidden dangers and the possibility of optimization, and generates early warning and optimization suggestions.
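A minimal sketch of this real-time scoring step is given below, assuming the target model is a logistic GLM with already-trained parameters. The weights, bias, threshold, log identifiers and feature values are all made up for illustration; the disclosure does not specify how early warnings are triggered.

```python
import math
from typing import Iterable, Iterator, List, Tuple


def predict_anomaly_probability(weights: List[float], bias: float, features: List[float]) -> float:
    """Logistic GLM prediction: sigmoid of the linear predictor."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def analyze_stream(
    weights: List[float],
    bias: float,
    stream: Iterable[Tuple[str, List[float]]],
    threshold: float = 0.8,
) -> Iterator[Tuple[str, float]]:
    """Yield (log_id, probability) for each record whose score reaches the threshold."""
    for log_id, features in stream:
        probability = predict_anomaly_probability(weights, bias, features)
        if probability >= threshold:
            yield log_id, probability


# Hypothetical trained parameters and incoming feature vectors.
WEIGHTS = [2.0, 3.0]
BIAS = -4.0
incoming = [("req-1", [0.1, 0.0]), ("req-2", [1.5, 1.0])]
alerts = list(analyze_stream(WEIGHTS, BIAS, incoming))
```

Because `analyze_stream` is a generator, it can consume records one at a time as they arrive instead of waiting for a complete batch.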
By the embodiment of the disclosure, log data of each application can be automatically analyzed in real time, possible problem sources can be rapidly found, potential problems can be found in advance, taking precautionary measures in advance is facilitated, and potential risks are reduced.
Fig. 3 schematically illustrates a flow chart of generating trainable data in accordance with an embodiment of the present disclosure.
As shown in fig. 3, in the embodiment of the present disclosure, the operation in which the data preprocessing module cleans and preprocesses the raw log data to generate trainable data may further include operations S310 to S320.
In operation S310, the data cleansing sub-module cleanses the raw log data acquired by the data acquisition sub-module and transmits the same to the data conversion sub-module.
In operation S320, the data conversion sub-module preprocesses the cleaned raw log data to obtain trainable data and stores the trainable data in the vector database of the training data storage module.
Specifically, the data preprocessing module comprises a data acquisition sub-module, a data cleaning sub-module and a data conversion sub-module. After the data acquisition sub-module acquires the original log data, the data cleaning sub-module cleans the original log data acquired by the data acquisition sub-module and transmits the original log data to the data conversion sub-module. The data conversion sub-module preprocesses the cleaned original log data to obtain trainable data and stores the trainable data in a vector database of the training data storage module.
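The cleaning and conversion sub-modules can be sketched as two functions. The log line format matched by `LOG_PATTERN`, the cleaning rules (drop blank, duplicate and unparseable lines) and the three chosen features are assumptions for illustration; the disclosure does not define them.

```python
import re
from typing import List

# Hypothetical log format: "<LEVEL> latency_ms=<int> <message>".
LOG_PATTERN = re.compile(
    r"^(?P<level>INFO|WARN|ERROR)\s+latency_ms=(?P<latency>\d+)\s+(?P<message>.+)$"
)


def clean(raw_lines: List[str]) -> List[str]:
    """Data cleaning: strip whitespace, drop blank, duplicate and unparseable lines."""
    seen = set()
    cleaned = []
    for line in raw_lines:
        line = line.strip()
        if not line or line in seen or LOG_PATTERN.match(line) is None:
            continue
        seen.add(line)
        cleaned.append(line)
    return cleaned


def convert(line: str) -> List[float]:
    """Data conversion: turn one cleaned line into a numeric feature vector."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"line does not match the expected format: {line!r}")
    is_error = 1.0 if match.group("level") == "ERROR" else 0.0
    latency_seconds = int(match.group("latency")) / 1000.0
    has_timeout = 1.0 if "timeout" in match.group("message").lower() else 0.0
    return [is_error, latency_seconds, has_timeout]


raw = [
    "ERROR latency_ms=2300 payment timeout",
    "",
    "ERROR latency_ms=2300 payment timeout",  # duplicate, removed by clean()
    "garbage",                                 # unparseable, removed by clean()
    "INFO latency_ms=45 request ok  ",
]
trainable = [convert(line) for line in clean(raw)]
```

The resulting vectors are what would be written to the vector database of the training data storage module.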
It should be noted that in the execution environment of this process, all training data supports encrypted storage and access control.
Fig. 4 schematically illustrates a flow chart of generating a target model according to an embodiment of the disclosure.
As shown in fig. 4, in an embodiment of the present disclosure, the operation in which the training module trains the trainable data using the GLM model to generate the target model may further include operations S410 to S420.
In operation S410, a GLM model is selected and an initialization process is performed to obtain an intermediate model.
In operation S420, the intermediate model is trained with an optimization algorithm on the trainable data in the vector database to obtain a target model.
Specifically, in the process of training the trainable data, a GLM model is selected for initialization processing, and an intermediate model is obtained after the model is initialized. The training module trains the trainable data by using the GLM model to generate a target model, then verifies the target model by using the test data to obtain a verification result, evaluates the target model according to the verification result, and stores the target model.
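The initialize, optimize and verify sequence can be illustrated with a small logistic GLM fitted by batch gradient descent. This is a sketch under stated assumptions rather than the disclosed implementation: the disclosure names neither the GLM family nor the optimization algorithm, and the feature values, labels, learning rate and epoch count below are invented.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def initialize(n_features: int) -> Tuple[List[float], float]:
    """Initialization step: zero weights and bias form the intermediate model."""
    return [0.0] * n_features, 0.0


def train(weights, bias, X, y, learning_rate=0.5, epochs=2000):
    """Batch gradient descent on the mean logistic loss."""
    w, b, n = list(weights), bias, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        w = [wj - learning_rate * g / n for wj, g in zip(w, grad_w)]
        b -= learning_rate * grad_b / n
    return w, b


def accuracy(weights, bias, X, y) -> float:
    """Verification step: fraction of held-out records classified correctly."""
    correct = 0
    for xi, yi in zip(X, y):
        predicted = sigmoid(bias + sum(wj * xj for wj, xj in zip(weights, xi))) >= 0.5
        correct += int(predicted == (yi == 1))
    return correct / len(X)


# Invented features: [is_error, latency_seconds]; label 1 means anomalous.
X_train = [[1.0, 2.3], [1.0, 1.9], [0.0, 0.05], [0.0, 0.1], [1.0, 2.0], [0.0, 0.2]]
y_train = [1, 1, 0, 0, 1, 0]
X_test = [[1.0, 2.1], [0.0, 0.08]]
y_test = [1, 0]

w0, b0 = initialize(2)                       # intermediate model
w, b = train(w0, b0, X_train, y_train)       # target model
test_accuracy = accuracy(w, b, X_test, y_test)
```

Keeping the test records separate from the training records mirrors the verification step described above; the evaluated model would then be stored only if the verification result is acceptable.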
By the embodiment of the disclosure, shortcomings and optimization points in service operation can be found by training on the log data and evaluating the resulting target model, and data-driven optimization suggestions are provided for enterprises, thereby improving service operation efficiency.
Fig. 5 schematically illustrates a flow chart of generating an analysis result according to an embodiment of the present disclosure.
As shown in fig. 5, in the embodiment of the present disclosure, the operation in which the GLM analysis module uses the target model to perform real-time log analysis and generate an analysis result may further include operations S510 to S530.
In operation S510, the rights management module sets rights accessible to the user terminal.
In operation S520, the user interaction module receives the execution requirement of the user terminal.
In operation S530, the report generating module obtains the analysis result of the GLM analysis module according to the execution requirement of the user side.
Specifically, by providing access rights management for user end users and roles, access rights of different users to system data and functions are controlled. After the user side obtains the authority, the user can input the execution requirement and obtain the analysis result in real time according to the interface provided by the user interaction module.
After the user interaction module receives the execution requirement input by the user terminal, the system configuration can be adjusted according to the requirement, so that the data collection module obtains a plurality of original log data according to the execution requirement received by the user interaction module, log analysis is started, and finally the report generation module obtains the analysis result of the GLM analysis module according to the execution requirement of the user terminal.
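Operations S510 to S530 can be illustrated with the following sketch, in which a rights check gates access to the analysis result. The disclosure does not specify a permission model, so role-based grants are assumed here; the class and function names (`RightsManager`, `Request`, `generate_report`) and the sample roles and results are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RightsManager:
    """S510: maps each role to the set of actions it may perform."""
    grants: dict[str, set[str]] = field(default_factory=dict)

    def grant(self, role: str, action: str) -> None:
        self.grants.setdefault(role, set()).add(action)

    def allowed(self, role: str, action: str) -> bool:
        return action in self.grants.get(role, set())


@dataclass
class Request:
    """S520: an execution requirement submitted from the user side."""
    role: str
    action: str
    service: str


def generate_report(rights: RightsManager, request: Request,
                    analysis_results: dict[str, str]) -> dict[str, str]:
    """S530: return the analysis result for the requested service, if permitted."""
    if not rights.allowed(request.role, request.action):
        raise PermissionError(f"role {request.role!r} may not {request.action!r}")
    result = analysis_results.get(request.service, "no data")
    return {"service": request.service, "result": result}


rights = RightsManager()
rights.grant("sre", "view_report")
results = {"payment": "error rate elevated"}  # stand-in for GLM analysis output

report = generate_report(rights, Request("sre", "view_report", "payment"), results)
print(report)
```

A request from a role without the `view_report` grant raises `PermissionError` instead of returning a report, which keeps the rights check in a single place.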
Through the embodiment of the disclosure, the user experience is improved, the corresponding analysis result is fed back according to the execution requirement of the user side, and the user experience is further improved. Second, the use of real-time reporting and visualization in this log analysis approach can provide critical business insight.
Based on the above log analysis method based on the GLM model, the present disclosure further provides a log analysis system based on the GLM model, which will be described in detail below with reference to fig. 6.
Fig. 6 schematically illustrates a block diagram of a GLM model based log analysis system according to an embodiment of the present disclosure.
As shown in fig. 6, the system includes a data collection module, a data preprocessing module, a data storage module, a training data storage module, a training module, and a GLM analysis module.
Specifically, the data collection module is used for acquiring a plurality of original log data; the data preprocessing module is used for cleaning and preprocessing the original log data to generate trainable data; the data storage module is used for storing the plurality of original log data; the training data storage module stores the trainable data in a vector database; the training module trains the trainable data using the GLM model to generate a target model; and the GLM analysis module is used for performing real-time analysis using the target model to generate an analysis result.
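As an illustration of the last step, the sketch below scores incoming feature vectors one at a time with an already trained target model. It assumes a logistic (binomial, logit-link) GLM, which the disclosure does not specify; the weights in `TARGET_MODEL`, the two-feature layout, and the 0.5 threshold are hypothetical values chosen only for the example.

```python
import math
from typing import Iterable, Iterator

# Hypothetical target model: bias followed by one weight per feature
# (for example error_count and latency_seconds).
TARGET_MODEL = [-4.0, 1.2, 0.8]


def score(model: list[float], features: list[float]) -> float:
    """Apply the linear predictor and the inverse logit link."""
    z = model[0] + sum(w * x for w, x in zip(model[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def analyze_stream(model: list[float], vectors: Iterable[list[float]],
                   threshold: float = 0.5) -> Iterator[dict]:
    """Yield one analysis result per vector as soon as it arrives."""
    for index, vector in enumerate(vectors):
        probability = score(model, vector)
        yield {"index": index, "score": probability,
               "anomalous": probability >= threshold}


incoming = [[0.0, 0.0], [5.0, 2.0]]  # stand-in for vectors from the vector database
analysis = list(analyze_stream(TARGET_MODEL, incoming))
for item in analysis:
    print(item)
```

Because `analyze_stream` is a generator, each result is available as soon as its vector is read, which matches the real-time character of the analysis module.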
The data collection module adopts a distributed structure and comprises an application log receiving sub-module, a monitoring platform log receiving sub-module and an application portrait platform log receiving sub-module. The data preprocessing module comprises a data acquisition sub-module, a data cleaning sub-module and a data conversion sub-module.
Wherein: the application log receiving sub-module is used for acquiring application original log data; the monitoring platform log receiving sub-module is used for acquiring original log data of the monitoring platform; the application portrait platform log receiving sub-module is used for obtaining original log data of the application portrait platform. The data acquisition sub-module is used for acquiring original log data; the data cleaning sub-module is used for cleaning the original log data and transmitting the cleaned data to the data conversion sub-module; the data conversion sub-module is used for preprocessing the cleaned original log data.
Continuing with fig. 6, the system further comprises a user interaction module, a rights management module, an API interface module, a model management module and a report generation module.
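The cleaning and conversion sub-modules can be sketched as two small functions. The disclosure does not define a log format or a feature encoding, so the line layout (`timestamp LEVEL message`), the cleaning rules (drop blank, malformed, and duplicate lines), and the encoding (one-hot level plus message token count) are assumptions made only for illustration.

```python
import re

# Assumed line layout: "<timestamp> <LEVEL> <message>".
LINE_RE = re.compile(r"^(?P<ts>\S+)\s+(?P<level>DEBUG|INFO|WARN|ERROR)\s+(?P<msg>.+)$")
LEVELS = ["DEBUG", "INFO", "WARN", "ERROR"]


def clean(raw_lines: list[str]) -> list[dict[str, str]]:
    """Cleaning sub-module: drop blank, malformed, and duplicate lines."""
    seen: set[str] = set()
    records = []
    for line in raw_lines:
        line = line.strip()
        match = LINE_RE.match(line)
        if match is None or line in seen:
            continue
        seen.add(line)
        records.append(match.groupdict())
    return records


def convert(records: list[dict[str, str]]) -> list[list[float]]:
    """Conversion sub-module: one-hot log level plus message token count."""
    vectors = []
    for record in records:
        one_hot = [1.0 if record["level"] == level else 0.0 for level in LEVELS]
        vectors.append(one_hot + [float(len(record["msg"].split()))])
    return vectors


raw = [
    "2023-12-25T10:00:00 INFO service started",
    "   ",
    "2023-12-25T10:00:01 ERROR payment timeout after 3000 ms",
    "2023-12-25T10:00:01 ERROR payment timeout after 3000 ms",
    "garbage without a level",
]
vectors = convert(clean(raw))
print(vectors)
```

The resulting vectors are what the training data storage module would write to the vector database.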
Specifically, the user interaction module is used for receiving the execution requirement of the user side and outputting an analysis result; the rights management module is used for setting the access rights of the user; the API interface module provides an API interface that allows the log analysis system to connect with third-party applications; the model management module is used for managing the models in the training module, which comprise the GLM model, the intermediate model and the target model; and the report generation module acquires an analysis result of the GLM analysis module according to the execution requirement of the user side.
Continuing with fig. 6, the system further comprises an automatic optimization module, a security management module, a load balancing module, a data backup and recovery module, a fault transfer module and a monitoring alarm module.
Specifically, the automatic optimization module is used for adjusting parameters of the log analysis system according to the analysis result of the GLM analysis module to optimize system performance; the security management module is used for encrypting data and protecting the system from attack; the load balancing module is used for distributing the tasks of each module in the system; the data backup and recovery module is used for backing up the data in the data storage module and the training data storage module; the fault transfer module is used for monitoring the operation of the system in real time and automatically switching to a standby system when the system fails; the monitoring alarm module is used for monitoring the system for abnormalities and raising an alarm when one occurs.
According to the embodiment of the disclosure, the modules of the log analysis system based on the GLM model keep the system running stably under various conditions, while the distributed design improves its capacity to process large-scale data and highly concurrent requests. This design allows each module to focus on its particular task, thereby improving the overall efficiency and stability of the system. Working together, the modules enable efficient, rapid and comprehensive analysis of full-link application logs across business scenarios, so that problems can be located quickly.
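The interplay between the fault transfer module and the monitoring alarm module can be sketched as follows. The disclosure does not describe how failure is detected or how the switch is made, so this sketch assumes that any exception raised by the primary handler counts as a failure and that the switch is permanent; `FailoverRouter`, `broken_primary`, and `standby_node` are hypothetical names.

```python
from typing import Callable


class FailoverRouter:
    """Routes calls to a primary handler and switches to a standby on failure."""

    def __init__(self, primary: Callable[[str], str], standby: Callable[[str], str],
                 alarm: Callable[[str], None]) -> None:
        self.primary = primary
        self.standby = standby
        self.alarm = alarm
        self.using_standby = False

    def handle(self, payload: str) -> str:
        if not self.using_standby:
            try:
                return self.primary(payload)
            except Exception as exc:  # any failure of the primary triggers failover
                self.alarm(f"primary failed: {exc}; switching to standby")
                self.using_standby = True
        return self.standby(payload)


def broken_primary(payload: str) -> str:
    """Simulates a failed primary analysis node."""
    raise ConnectionError("analysis node unreachable")


def standby_node(payload: str) -> str:
    return f"standby analyzed {payload}"


alarms: list[str] = []
router = FailoverRouter(broken_primary, standby_node, alarms.append)
first = router.handle("log-batch-1")   # primary fails, alarm raised, standby answers
second = router.handle("log-batch-2")  # standby is used directly, no new alarm
print(first, second, alarms)
```

A production system would typically add health checks and a way to switch back to the primary once it recovers; both are omitted here for brevity.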
In an embodiment of the present disclosure, any number of the modules in the GLM model-based log analysis system may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the modules in the system may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or in hardware or firmware by any other reasonable way of integrating or packaging circuits, or in any one of, or a suitable combination of, software, hardware, and firmware. Alternatively, at least one of the modules in the system may be implemented at least in part as a computer program module which, when executed, performs the corresponding functions.
Fig. 7 schematically illustrates a block diagram of a GLM model-based log analysis electronic device according to an embodiment of the present disclosure.
As shown in fig. 7, an electronic device 700 according to an embodiment of the present disclosure includes a processor 701 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The processor 701 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 701 may also include on-board memory for caching purposes. The processor 701 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
The RAM 703 stores various programs and data necessary for the operation of the electronic device 700. The processor 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. The processor 701 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 702 and/or the RAM 703. Note that the program may also be stored in one or more memories other than the ROM 702 and the RAM 703. The processor 701 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the electronic device 700 may further include an input/output (I/O) interface 705, which is also connected to the bus 704. The electronic device 700 may also include one or more of the following components connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 708 including a hard disk or the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read therefrom is installed into the storage section 708 as necessary.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include the ROM 702 and/or the RAM 703 described above and/or one or more memories other than the ROM 702 and the RAM 703.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be based on a tangible storage medium such as an optical storage device, a magnetic storage device, or the like. In another embodiment, the computer program may also be transmitted, distributed over a network medium in the form of signals, downloaded and installed via the communication section 709, and/or installed from the removable medium 711. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to: wireless, wired, etc., or any suitable combination of the foregoing.
In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 709, and/or installed from the removable medium 711. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 701. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
According to embodiments of the present disclosure, program code for executing the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages. In particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. Programming languages include, but are not limited to, Java, C++, Python, C, and similar languages. The program code may execute entirely on the user's computing device, partly on the user's device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, it may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems and methods according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.
Claims (13)
1. A log analysis method based on a generalized linear model, comprising:
the data collection module acquires a plurality of original log data;
the data preprocessing module cleans and preprocesses the original log data to generate trainable data;
the data storage module stores the plurality of original log data;
a training data storage module stores the trainable data using a vector database;
the training module trains the trainable data by using a generalized linear model to generate a target model;
and the generalized linear model analysis module uses the target model to conduct log real-time analysis to generate an analysis result.
2. The log analysis method based on the generalized linear model according to claim 1, wherein the generalized linear model analysis module performs log real-time analysis using the target model to generate an analysis result, further comprising:
the authority management module sets the authority which can be accessed by the user side;
the user interaction module receives the execution requirement of a user side;
and the report generation module acquires an analysis result of the generalized linear model analysis module according to the execution requirement of the user side.
3. The generalized linear model based log analysis method of claim 1, wherein the raw log data comprises application raw log data, monitoring platform raw log data, and application portrait platform raw log data.
4. The log analysis method based on the generalized linear model according to claim 1 or 3, wherein the data collection module comprises an application log receiving sub-module, a monitoring platform log receiving sub-module and an application portrait platform log receiving sub-module, and
the data collection module obtains a plurality of original log data, including:
the application log receiving submodule acquires application original log data and stores the application original log data in the data storage module;
the monitoring platform log receiving sub-module acquires original log data of the monitoring platform and stores the original log data in the data storage module;
and the application portrait platform log receiving submodule acquires original log data of the application portrait platform and stores the original log data in the data storage module.
5. The generalized linear model based log analysis method of claim 1, wherein the data preprocessing module cleans and preprocesses the raw log data to generate trainable data, comprising:
the data cleaning sub-module cleans the original log data acquired by the data acquisition sub-module and transmits the cleaned data to the data conversion sub-module;
the data conversion sub-module preprocesses the cleaned original log data to obtain trainable data and stores the trainable data in a vector database of the training data storage module.
6. The generalized linear model based log analysis method of claim 1 or 5, wherein the training module trains the trainable data using a generalized linear model to generate a target model, comprising:
selecting a generalized linear model and initializing to obtain an intermediate model;
and training the intermediate model with an optimization algorithm on the trainable data in the vector database to obtain a target model.
7. The generalized linear model based log analysis method of claim 6, wherein the training module trains the trainable data using a generalized linear model to generate a target model, further comprising:
verifying the target model by using test data to obtain a verification result;
and evaluating the target model according to the verification result, and storing the target model.
8. The generalized linear model based log analysis method according to claim 1 or 2, further comprising:
and the data collection module acquires a plurality of original log data according to the execution requirements received by the user interaction module.
9. A log analysis system based on a generalized linear model, comprising a data collection module, a data preprocessing module, a data storage module, a training data storage module, a training module and a generalized linear model analysis module, wherein:
the data collection module is used for obtaining a plurality of original log data;
the data preprocessing module is used for cleaning and preprocessing the original log data to generate trainable data;
the data storage module is used for storing the plurality of original log data;
the training data storage module stores the trainable data using a vector database;
the training module trains the trainable data by using a generalized linear model to generate a target model;
the generalized linear model analysis module is used for carrying out real-time analysis on the target model to generate an analysis result.
10. The log analysis system based on the generalized linear model according to claim 9, wherein the data collection module adopts a distributed structure and comprises an application log receiving sub-module, a monitoring platform log receiving sub-module and an application portrait platform log receiving sub-module, wherein:
the application log receiving sub-module is used for acquiring application original log data;
the monitoring platform log receiving sub-module is used for acquiring original log data of the monitoring platform;
the application portrait platform log receiving sub-module is used for obtaining original log data of the application portrait platform.
11. The generalized linear model based log analysis system of claim 9, wherein the data preprocessing module comprises a data acquisition sub-module, a data cleaning sub-module and a data conversion sub-module, wherein:
the data acquisition sub-module is used for acquiring original log data;
the data cleaning sub-module is used for cleaning the original log data and transmitting the cleaned data to the data conversion sub-module;
the data conversion sub-module is used for preprocessing the cleaned original log data.
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-8.
13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311798000.6A CN117667636A (en) | 2023-12-25 | 2023-12-25 | Log analysis method, system, equipment and medium based on generalized linear model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311798000.6A CN117667636A (en) | 2023-12-25 | 2023-12-25 | Log analysis method, system, equipment and medium based on generalized linear model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117667636A true CN117667636A (en) | 2024-03-08 |
Family
ID=90086385
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311798000.6A Pending CN117667636A (en) | 2023-12-25 | 2023-12-25 | Log analysis method, system, equipment and medium based on generalized linear model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117667636A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||