CN112256517B - Log analysis method and device of virtualization platform based on LSTM-DSSM - Google Patents

Log analysis method and device of virtualization platform based on LSTM-DSSM

Info

Publication number
CN112256517B
Authority
CN
China
Prior art keywords
log
log information
error
information
lstm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010888954.6A
Other languages
Chinese (zh)
Other versions
CN112256517A (en)
Inventor
孟令鲁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202010888954.6A priority Critical patent/CN112256517B/en
Publication of CN112256517A publication Critical patent/CN112256517A/en
Application granted granted Critical
Publication of CN112256517B publication Critical patent/CN112256517B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3065Monitoring arrangements determined by the means or processing involved in reporting the monitored data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/18File system types
    • G06F16/1805Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815Journaling file systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Databases & Information Systems (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides an LSTM-DSSM-based log analysis method and device for a virtualization platform, belonging to the technical field of virtualization products, and addresses the technical problem that locating a problem requires handing it between several people before the root cause is found, which is time-consuming and labor-intensive. The method comprises: acquiring log information of all nodes; screening abnormal error log information from the acquired log information; organizing the abnormal error log information to generate a log corpus; training an LSTM-DSSM model with the log corpus; and, when an abnormal error occurs, matching with the trained LSTM-DSSM model to obtain the log information corresponding to the abnormal error. The invention collects the log information of the computing nodes and management nodes in one place and integrates it into the virtualization platform for unified management by calling the Elasticsearch interface, thereby achieving real-time synchronization and unified query of log resources, reducing the time and labor spent searching logs, and improving efficiency in the requirements development and system testing stages.

Description

Log analysis method and device of virtualization platform based on LSTM-DSSM
Technical Field
The invention relates to the technical field of virtualization products, in particular to a virtualization platform log analysis method and device based on LSTM-DSSM.
Background
With the development and large-scale adoption of cloud computing in recent years, virtualization platforms are widely used as one of the foundations of cloud computing. Problems inevitably arise while a virtualization platform is developed, tested, and delivered to customers. When a problem occurs, its cause is usually analyzed by examining logs. Because the ICS platform comprises management nodes and computing nodes, and the computing nodes include computing, storage, and network resources, log analysis is complex. The current problem-log analysis process is as follows: check the management node log and, if the problem is found there, solve it; if the error was thrown by a lower layer, check the computing node logs, starting with the agent log; if no exception appears there, check the IVA error messages; and if the error was thrown by storage, check the storage log. Because the management node and the computing nodes are developed by different teams, locating a problem usually requires handing it between several people before the root cause is found, which is time-consuming and labor-intensive.
To address these problems, the invention provides an LSTM-DSSM-based log analysis method for a virtualization platform. First, the logs of the different nodes are collected through FileBeat and integrated at the management node, so that all log information can be viewed in one place. The error messages of failed tasks, together with all log information, are used to train a log analysis model based on an LSTM-DSSM neural network. The model is then used to extract the error logs of the management node or computing nodes when an error is reported, so that developers can locate the root cause of a problem quickly and effectively.
Disclosure of Invention
The invention aims to provide an LSTM-DSSM-based log analysis method and device for a virtualization platform, so as to solve the technical problem that locating a problem requires handing it between several people before the root cause is found, which is time-consuming and labor-intensive.
In a first aspect, the present invention provides a log analysis method applied to a management node of a virtualization platform, where the method includes:
acquiring log information of all nodes;
screening abnormal error log information from the acquired log information;
organizing the abnormal error log information to generate a log corpus;
training an LSTM-DSSM model with the log corpus (LSTM: Long Short-Term Memory network; DSSM: Deep Structured Semantic Model);
and when the abnormal error occurs, matching by using the trained LSTM-DSSM model to obtain the log information corresponding to the abnormal error.
Further, the log information includes one or more of a management node log, a virtual agent log, a storage log, an agent log, and a feedback log.
Further, after the step of obtaining the log information of all the nodes, the method further includes:
filtering the periodically printed status, capacity, and monitoring log information by time period;
for task execution progress log information, if there is no failed task or alarm in the current time period, retaining part of the progress log information after the task is completed and deleting the rest;
and when the current log information exceeds 80% of the log capacity, backing up the current log information.
Further, the step of screening the log information of the abnormal error from the obtained log information includes:
screening error log information from the acquired log information;
and selecting, from the error log information, the errors that are not expected errors caused by platform limitations, as the abnormal error log information.
Further, the step of organizing the abnormal error log information to generate a log corpus includes:
analyzing the historical feature information and extracting the log features of the abnormal error log information;
based on the log features, organizing all abnormal error log information and the corresponding error codes into an error message base;
and organizing the task types of the abnormal error log information and the corresponding agent interface call lists to generate the log corpus.
Furthermore, the format of the log corpus is [ task type ] [ error description ] [ platform error code ] [ task ID ] [ task time ] [ task call interface list ] [ specific error log ].
Further, the step of training the LSTM-DSSM model by using the log corpus includes:
using a word2Vec word vector model to obtain the word vector data corresponding to each piece of abnormal error log information and its features, training to obtain word vector data for the unprocessed error logs, and taking the word vector data as input;
selecting feature units, compressing them, and decomposing each word token into a reduced-dimension letter n-gram representation;
and using the hidden-layer vector output by the LSTM unit of the query as context, computing its dot product with the hidden-layer vector of each time step of the doc, taking the results as weights, computing the weighted sum of the vectors of the time steps, and calculating the loss between the actual output and the expected output with a softmax cross-entropy function.
In a second aspect, the present invention further provides a log analysis apparatus, applied to a management node of a virtualization platform, where the apparatus includes:
the acquisition module is used for acquiring the log information of all the nodes;
the screening module is used for screening the abnormal error log information from the acquired log information;
the corpus module is used for sorting the abnormal error log information to generate log corpus;
the training module is used for training the LSTM-DSSM model by using the log corpus;
and the matching module is used for matching by using the trained LSTM-DSSM model when the abnormal error occurs to obtain the log information corresponding to the abnormal error.
In a third aspect, the invention also provides a computer readable storage medium having stored thereon machine executable instructions which, when invoked and executed by a processor, cause the processor to perform the method.
According to the LSTM-DSSM-based log analysis method for a virtualization platform provided by the invention, the log information of the computing nodes and management nodes is collected through FileBeat and integrated into the virtualization platform for unified management by calling the Elasticsearch interface. This achieves real-time synchronization and unified query of log resources and reduces the time and labor spent searching logs. On this basis, log filtering, log alarms, and backup processing are added, which addresses the loss of logs that are overwritten when too much log output is written. Using historical log information combined with log features, an LSTM-DSSM network model is trained, which enables matching and retrieval of the specific error logs; developers can analyze the retrieved logs directly, which improves efficiency in the requirements development and system testing stages. The method simplifies the log analysis process, unifies log management, addresses the log loss caused by logs being overwritten too frequently, and saves the manpower and material resources consumed in log analysis.
Accordingly, the log analysis device and the computer-readable storage medium provided by the embodiment of the invention also have the technical effects.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a method for analyzing logs of a virtualization platform based on LSTM-DSSM according to an embodiment of the present invention;
Fig. 2 is a diagram of an LSTM-DSSM network structure according to an embodiment of the present invention;
Fig. 3 is a schematic block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The terms "comprising" and "having," and any variations thereof, as referred to in embodiments of the present invention, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention provides a log analysis method applied to a management node of a virtualization platform. Historical log information is used as the corpus; after preprocessing, a word vector model is constructed, and log-specific features are added on top of the text word vector features. The word vectors are fed into an LSTM-DSSM neural network to train a log matching model. For a newly occurring abnormal log, the error description and related log information are fed into the trained model to find the matching log information quickly. The method includes the following steps:
acquiring log information of all nodes;
screening abnormal error log information from the acquired log information;
organizing the abnormal error log information to generate a log corpus;
training the LSTM-DSSM model by using the log corpus;
and when the abnormal error occurs, matching by using the trained LSTM-DSSM model to obtain the log information corresponding to the abnormal error.
Further, the log information includes one or more of a management node log, a virtual agent log, a storage log, an agent log, and a feedback log.
Further, after the step of obtaining the log information of all the nodes, the method further includes:
filtering the periodically printed status, capacity, and monitoring log information by time period;
for task execution progress log information, if there is no failed task or alarm in the current time period, retaining part of the progress log information after the task is completed and deleting the rest;
and when the current log information exceeds 80% of the log capacity, backing up the current log information.
Further, the step of screening the log information of the abnormal error from the obtained log information includes:
screening error log information from the acquired log information;
and selecting, from the error log information, the errors that are not expected errors caused by platform limitations, as the abnormal error log information.
Further, the step of organizing the abnormal error log information to generate a log corpus includes:
analyzing the historical feature information and extracting the log features of the abnormal error log information;
based on the log features, organizing all abnormal error log information and the corresponding error codes into an error message base;
and organizing the task types of the abnormal error log information and the corresponding agent interface call lists to generate the log corpus.
Furthermore, the format of the log corpus is [ task type ] [ error description ] [ platform error code ] [ task ID ] [ task time ] [ task call interface list ] [ specific error log ].
Further, the step of training the LSTM-DSSM model by using the log corpus includes:
using a word2Vec word vector model to obtain the word vector data corresponding to each piece of abnormal error log information and its features, training to obtain word vector data for the unprocessed error logs, and taking the word vector data as input;
selecting feature units, compressing them, and decomposing each word token into a reduced-dimension letter n-gram representation;
and using the hidden-layer vector output by the LSTM unit of the query as context, computing its dot product with the hidden-layer vector of each time step of the doc, taking the results as weights, computing the weighted sum of the vectors of the time steps, and calculating the loss between the actual output and the expected output with a softmax cross-entropy function.
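The letter n-gram step in the training procedure above can be illustrated with a minimal sketch. It assumes trigrams (n = 3) and a `#` boundary marker, as is common in DSSM-style word hashing; neither the value of n nor the marker is stated in the source, and the function names are hypothetical.

```python
from collections import Counter
from typing import List


def letter_ngrams(token: str, n: int = 3) -> List[str]:
    """Split one word token into letter n-grams, with '#' marking word boundaries."""
    padded = f"#{token.lower()}#"
    if len(padded) <= n:
        return [padded]
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def ngram_counts(tokens: List[str], n: int = 3) -> Counter:
    """Count letter n-grams over a tokenized log line.

    The vocabulary of letter n-grams is much smaller than the vocabulary of
    whole words, which is where the dimensionality reduction comes from.
    """
    counts: Counter = Counter()
    for token in tokens:
        counts.update(letter_ngrams(token, n))
    return counts
```

For example, `letter_ngrams("disk")` splits the padded token `#disk#` into four trigrams, so rare or misspelled log words still share sub-word units with known words.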
The specific implementation process of the invention is as follows:
1) FileBeat is installed on all nodes, and the log content of every node is sent through FileBeat to Elasticsearch for storage. In this way, the log information of the different computing nodes and management nodes of the ICS virtualization platform is collected on the management platform through FileBeat and Elasticsearch. After FileBeat log collection is in place, log filtering, log alarm, and backup operations are added, which unifies the management of log information and addresses log loss on the virtualization platform.
2) The management node of the virtualization platform calls the interface provided by Elasticsearch to manage the log information of all nodes, so that logs only need to be queried in one place, through the management node.
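As an illustration of the unified query in step 2), the sketch below builds a search body in the Elasticsearch Query DSL for error-level logs in a time window. The field names `log.level` and `@timestamp` are assumptions about how the shipped log documents are indexed, and the function name is hypothetical; the source does not describe the index mapping or the exact interface that is called.

```python
import json
from typing import Any, Dict


def build_error_log_query(start: str, end: str, size: int = 100) -> Dict[str, Any]:
    """Build a search body for ERROR-level logs between two timestamps.

    Assumed fields: "log.level" holds the severity and "@timestamp" the
    event time. Adjust both to the real index mapping.
    """
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "asc"}}],
        "query": {
            "bool": {
                "filter": [
                    {"term": {"log.level": "ERROR"}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
    }


if __name__ == "__main__":
    body = build_error_log_query("2020-08-28T10:00:00Z", "2020-08-28T10:05:00Z")
    print(json.dumps(body, indent=2))
```

The resulting body would be sent to the `_search` endpoint of whichever index holds the collected logs; the index name is deployment-specific and is not given in the source.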
3) In the virtualization platform, when a particular log grows too large, earlier entries are overwritten and lost. To avoid this, filtering and log alarm processing are added:
① The periodically printed status, capacity, and monitoring information is filtered by time period.
② For task execution progress log information, if there is no failed task or alarm in the current time period, only part of the progress logs are retained after the task is completed, and the rest are deleted.
③ When the current log exceeds 80% of the log capacity, a log alarm is raised and the current log is backed up to prevent it from being overwritten.
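The three rules of step 3) can be sketched as follows. Only the 80% threshold comes from the text; the `LogEntry` type, the entry kinds, the "one entry per kind per period" policy, and the choice to keep the last progress entries are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

BACKUP_THRESHOLD = 0.80  # from the text: back up above 80% of log capacity
PERIODIC_KINDS = {"status", "capacity", "monitor"}  # hypothetical kind labels


@dataclass
class LogEntry:
    timestamp: float  # seconds
    kind: str         # e.g. "status", "progress", "error" (assumed labels)
    message: str


def thin_periodic(entries: List[LogEntry], period_s: float) -> List[LogEntry]:
    """Rule 1: keep at most one periodic entry per kind per time period."""
    last_kept: Dict[str, float] = {}
    kept: List[LogEntry] = []
    for entry in sorted(entries, key=lambda e: e.timestamp):
        if entry.kind in PERIODIC_KINDS:
            previous = last_kept.get(entry.kind)
            if previous is not None and entry.timestamp - previous < period_s:
                continue  # too close to the last kept entry of this kind
            last_kept[entry.kind] = entry.timestamp
        kept.append(entry)
    return kept


def trim_progress(entries: List[LogEntry], had_failure_or_alarm: bool,
                  keep_last: int = 1) -> List[LogEntry]:
    """Rule 2: with no failure or alarm, keep only the last progress entries."""
    if had_failure_or_alarm:
        return list(entries)
    progress = [e for e in entries if e.kind == "progress"]
    dropped = {id(e) for e in progress[: max(len(progress) - keep_last, 0)]}
    return [e for e in entries if id(e) not in dropped]


def needs_backup(used_bytes: int, capacity_bytes: int) -> bool:
    """Rule 3: signal an alarm and backup once usage exceeds the threshold."""
    return capacity_bytes > 0 and used_bytes / capacity_bytes > BACKUP_THRESHOLD
```

Keeping the rules as separate functions lets each be applied at a different point, for example thinning at collection time and the capacity check on a timer.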
4) Error messages in the ICS platform (ICS: InCloud Sphere, a virtualization platform) are classified manually, according to whether the error is an expected one. For example, the error reported when hot-adding a disk, "IDE disk does not support hot addition", is an expected error caused by a platform limitation and is placed in the normal error set. An error message such as "unknown exception" should not occur and is classified as abnormal information. Expected errors need no analysis and are discarded.
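Once the manual classification in step 4) has produced a list of expected errors, applying it can be automated. The sketch below assumes that the list is stored as a set of message fragments and that a case-insensitive substring match is sufficient; both are assumptions, and the function name is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Result of the manual classification; the single entry is the example
# quoted in the text. A real list would be maintained by the team.
EXPECTED_ERRORS: Set[str] = {"IDE disk does not support hot addition"}


def split_errors(messages: Iterable[str],
                 expected: Set[str] = EXPECTED_ERRORS) -> Tuple[List[str], List[str]]:
    """Split error messages into (expected, abnormal) lists.

    A message is treated as expected if it contains any known expected
    fragment, ignoring case. Everything else is kept for analysis.
    """
    normal: List[str] = []
    abnormal: List[str] = []
    for message in messages:
        lowered = message.lower()
        if any(fragment.lower() in lowered for fragment in expected):
            normal.append(message)
        else:
            abnormal.append(message)
    return normal, abnormal
```

Only the second list would be passed on to corpus construction; the first is discarded, as the text describes.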
5) The historical feature information is analyzed and the features of the virtualization platform's error logs are extracted:
① An error report of the virtualization platform includes the task start time and task end time. The error time in the log usually corresponds to the task end time; because part of the operation may need to be rolled back, the two may differ, but by no more than ten seconds.
② Error logs typically appear before the last occurrence of the task ID.
③ Part of the IVA log contains error codes that are consistent with the error codes in the platform, so the IVA log can be searched by error code (IVA: Inspur Virtualization Agent, the virtual agent).
④ Errors reported by storage and by the IVA correspond to the agent interface called by the manager.
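The first three features in step 5) suggest a simple way to narrow down candidate lines for a failed task. The sketch below is one possible reading, not the patented procedure: the `LogLine` type, the source labels, and the way the features are combined (time window OR error code, restricted to lines up to the last task ID) are assumptions. Only the ten-second window comes from the text.

```python
from typing import List, NamedTuple, Optional

ROLLBACK_WINDOW_S = 10.0  # from the text: the difference is within ten seconds


class LogLine(NamedTuple):
    timestamp: float  # seconds
    source: str       # e.g. "manager", "iva", "agent", "storage" (assumed labels)
    text: str


def candidate_error_lines(lines: List[LogLine], task_id: str, task_end: float,
                          error_code: Optional[str] = None) -> List[LogLine]:
    """Select lines likely to contain the error for one failed task."""
    # Feature 2: the error usually appears before the last task ID occurrence.
    last_index = max(
        (i for i, line in enumerate(lines) if task_id in line.text),
        default=len(lines) - 1,
    )
    selected: List[LogLine] = []
    for line in lines[: last_index + 1]:
        # Feature 1: error time is close to the task end time.
        near_end = abs(line.timestamp - task_end) <= ROLLBACK_WINDOW_S
        # Feature 3: IVA lines can be found by the platform error code.
        has_code = (error_code is not None and line.source == "iva"
                    and error_code in line.text)
        if near_end or has_code:
            selected.append(line)
    return selected
```

If the task ID never appears, the sketch falls back to scanning every line, which keeps the time and error-code features usable on their own.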
6) Based on the log features in 5):
① All error messages and their error codes are organized into an error message base, in which error messages and error codes correspond one to one.
② The task types and their corresponding agent interface call lists are organized.
7) The error log corpus is organized. The log corresponding to an error message may be a management node log, an IVA log, an agent log, a storage log, and so on. Because all errors are ultimately thrown by the platform, the error log corresponding to an error message of the management platform is the combination of the management node log information and the other log information. The log corpus is finally organized into the following format:
[ task type ] [ error description ] [ platform error code ] [ task ID ] [ task time ] [ task call interface list ] [ specific error log ]
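The seven-field corpus format above maps naturally onto a small record type. In the sketch below the class name, field names, the comma separator for the interface list, and the sample values are all hypothetical; only the order of the fields follows the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CorpusRecord:
    task_type: str
    error_description: str
    platform_error_code: str
    task_id: str
    task_time: str
    call_interfaces: List[str]
    error_log: str

    def to_line(self) -> str:
        """Render the record in the bracketed corpus format, one field per bracket."""
        fields = [
            self.task_type,
            self.error_description,
            self.platform_error_code,
            self.task_id,
            self.task_time,
            ",".join(self.call_interfaces),  # separator is an assumption
            self.error_log,
        ]
        return " ".join(f"[{field}]" for field in fields)


if __name__ == "__main__":
    record = CorpusRecord(
        task_type="hot_add_disk",
        error_description="unknown exception",
        platform_error_code="E1001",
        task_id="task-42",
        task_time="2020-08-28 10:00:00",
        call_interfaces=["attachDisk", "refreshVm"],
        error_log="agent: attach failed",
    )
    print(record.to_line())
```

Keeping the record structured until the last moment makes it easy to build the same line both for training data and for a new query at matching time.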
8) The organized log corpus and all error logs are preprocessed into a corpus that is machine-readable, structurally consistent, and free of irrelevant information. Preprocessing mainly consists of:
① Word segmentation: Chinese text is segmented with a word segmentation tool; English text can be split directly on spaces.
② Stop word removal: stop words are removed according to Chinese and English stop word lists.
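The two preprocessing steps can be sketched as a single function. The stop word list here is a tiny illustrative sample, lowercasing is an added assumption, and the `segmenter` hook is hypothetical: the source does not name a Chinese segmentation tool, so one such as `jieba.lcut` could be passed in if that library is available.

```python
from typing import Callable, List, Optional, Set

# Tiny illustrative list; a real deployment would load full Chinese and
# English stop word lists from files.
EN_STOP_WORDS: Set[str] = {"the", "a", "an", "is", "to", "of"}


def preprocess(text: str,
               stop_words: Set[str] = EN_STOP_WORDS,
               segmenter: Optional[Callable[[str], List[str]]] = None) -> List[str]:
    """Segment a log line into tokens and drop stop words.

    English text is split on whitespace. For Chinese text, pass a
    segmentation function as `segmenter`.
    """
    tokens = segmenter(text) if segmenter is not None else text.split()
    lowered = [token.lower() for token in tokens]
    return [token for token in lowered if token and token not in stop_words]
```

For example, `preprocess("Failed to attach the disk")` keeps the three content words and drops "to" and "the".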
9) The LSTM-DSSM model is trained. On the basis of the LSTM-DSSM network structure, combined with the error log feature information of the virtualization platform, a log matching model is generated to match abnormal logs quickly. This mainly consists of:
① A word2Vec word vector model is used to obtain the word vector data corresponding to each piece of error log information and its features; word vector data for the unprocessed error logs is obtained by training at the same time, and the word vector data is used as input.
② Feature units are selected and compressed, and each word token is decomposed into a reduced-dimension letter n-gram representation.
③ As shown in Fig. 2, the hidden-layer vector output by the LSTM unit of the query is used as context, and its dot product with the hidden-layer vector of each time step of the doc is computed. The results are used as weights: the vector of each time step is multiplied by its weight and the products are summed. A softmax cross-entropy function then measures how close the actual output is to the expected output.
All training data is passed through the LSTM-DSSM network structure in multiple batches of size batch_size, finally yielding the trained LSTM-DSSM model.
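The weighting and loss computation in step 9) can be written out in plain Python for a single query. This is a sketch of the arithmetic only, not the trained network: the LSTM hidden states are taken as given inputs, the raw dot products are used as weights exactly as the text describes (many attention variants would normalize them first), and scoring the query against the pooled doc vector by cosine similarity is an assumption borrowed from the usual DSSM formulation.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def attend(query_vec: Vector, doc_states: Sequence[Vector]) -> List[float]:
    """Weight each doc time step by its dot product with the query and sum."""
    weights = [dot(query_vec, state) for state in doc_states]
    dim = len(doc_states[0])
    return [sum(w * state[i] for w, state in zip(weights, doc_states))
            for i in range(dim)]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm else 0.0


def softmax_cross_entropy(scores: Sequence[float], target_index: int) -> float:
    """Negative log-probability of the target under a softmax over the scores."""
    peak = max(scores)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return log_sum - scores[target_index]


def query_loss(query_vec: Vector, candidate_docs: Sequence[Sequence[Vector]],
               positive_index: int) -> float:
    """Loss for one query against several candidate docs, one of them correct."""
    scores = [cosine(query_vec, attend(query_vec, states))
              for states in candidate_docs]
    return softmax_cross_entropy(scores, positive_index)
```

During training, this loss would be averaged over the queries in each batch and minimized by gradient descent in a deep learning framework; the pure-Python version above only makes the computation explicit.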
10) When an abnormality occurs in the current environment, the error code and call interfaces from 6) are first looked up from the error description and arranged into the corpus format given in 7). The arranged log information is then matched against all current log information by the trained LSTM-DSSM model, and the required log information is output.
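The matching in step 10) amounts to scoring every current log entry against the arranged query and returning the best matches. The sketch below shows that ranking loop with a pluggable scoring function. The default scorer is a bag-of-words cosine similarity used purely as a stand-in so the example runs without a trained model; it is not the LSTM-DSSM scorer, and all names are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple


def bow_cosine(query: str, doc: str) -> float:
    """Stand-in scorer: cosine similarity of bag-of-words counts."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    numerator = sum(q[token] * d[token] for token in q)
    denominator = (math.sqrt(sum(v * v for v in q.values()))
                   * math.sqrt(sum(v * v for v in d.values())))
    return numerator / denominator if denominator else 0.0


def match_logs(query: str, candidates: Sequence[str],
               score: Callable[[str, str], float] = bow_cosine,
               top_k: int = 3) -> List[Tuple[float, str]]:
    """Rank candidate log entries by score against the query, best first."""
    ranked = sorted(((score(query, c), c) for c in candidates),
                    key=lambda pair: pair[0], reverse=True)
    return ranked[:top_k]
```

In a real deployment the `score` argument would wrap the trained model's similarity output, and `candidates` would come from the unified log store on the management node.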
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The embodiment of the invention provides a log analysis device, which is applied to a management node of a virtualization platform, and comprises the following components:
the acquisition module is used for acquiring the log information of all the nodes;
the screening module is used for screening the abnormal error log information from the acquired log information;
the corpus module is used for sorting the abnormal error log information to generate log corpus;
the training module is used for training the LSTM-DSSM model by using the log corpus;
and the matching module is used for matching with the trained LSTM-DSSM model when an abnormal error occurs, to obtain the log information corresponding to the abnormal error.
As shown in fig. 3, an electronic device 800 according to an embodiment of the present invention includes a memory 801 and a processor 802, where the memory stores a computer program that is executable on the processor, and the processor executes the computer program to implement the steps of the method according to the foregoing embodiment.
As shown in fig. 3, the electronic device further includes: a bus 803 and a communication interface 804, the processor 802, the communication interface 804 and the memory 801 being connected by the bus 803; the processor 802 is used to execute executable modules, such as computer programs, stored in the memory 801.
The Memory 801 may include a Random Access Memory (RAM), and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 804 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used.
The bus 803 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 3, but this does not indicate only one bus or one type of bus.
The memory 801 is used for storing a program, the processor 802 executes the program after receiving an execution instruction, and the method performed by the apparatus defined by the process disclosed in any of the foregoing embodiments of the present invention may be applied to the processor 802, or implemented by the processor 802.
The processor 802 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated hardware logic circuits in the processor 802 or by instructions in the form of software. The processor 802 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 801; the processor 802 reads the information in the memory 801 and completes the steps of the method in combination with its hardware.
In accordance with the above method, embodiments of the present invention also provide a computer readable storage medium storing machine executable instructions, which when invoked and executed by a processor, cause the processor to perform the steps of the above method.
The apparatus provided by the embodiment of the present invention may be specific hardware on a device, or software or firmware installed on a device. The apparatus has the same implementation principle and technical effects as the method embodiments; for brevity, where the apparatus embodiments omit a detail, refer to the corresponding content in the method embodiments. Those skilled in the art will understand that, for convenience and brevity of description, the specific working processes of the system, apparatus, and units described above may refer to the corresponding processes in the method embodiments and are not repeated here.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As another example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection between devices or units through some communication interfaces, and may be electrical, mechanical or in another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate its technical solutions and not to limit them, and the protection scope of the present invention is not limited to them. Although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that, within the scope of the disclosure, they can still modify or change the technical solutions described in the foregoing embodiments, or make equivalent substitutions for some features. Such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention and are intended to be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A log analysis method, characterized in that the method is applied to a management node of a virtualization platform; historical log information is used as a corpus and, after preprocessing, a word vector model is constructed; log-information features are added to the text word-vector features and the result is input as word vectors into an LSTM-DSSM neural network to train a log matching model; for a newly occurring abnormal log, the error description and related log information are input into the trained log matching model so that the matching log information is found quickly; the method comprises the following steps:
acquiring log information of all nodes;
screening abnormal error log information from the acquired log information;
collating the abnormal error log information to generate a log corpus;
training the LSTM-DSSM model by using the log corpus;
and when an abnormal error occurs, matching with the trained LSTM-DSSM model to obtain the log information corresponding to the abnormal error.
2. The log analysis method of claim 1, wherein the log information comprises one or more of a management node log, a virtual agent log, a storage log, an agent log, and a feedback log.
3. The log analysis method according to claim 1, wherein after the step of obtaining the log information of all the nodes, the method further comprises:
filtering, by time period, the periodically printed status, capacity and monitoring log information;
for task execution progress log information, if there is no failed task or alarm in the current time period, retaining part of the progress log information after the task is completed and deleting the rest;
and when the current log information exceeds 80% of the log capacity, backing up the current log information.
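By way of non-limiting illustration, the filtering described in claim 3 might be sketched as follows. The `LogEntry` type, the `kind` labels, the "one entry per kind per period" policy and the `keep_last` parameter are assumptions made for this sketch; only the 80% threshold comes from the claim.

```python
from dataclasses import dataclass

# Assumed labels for the periodically printed logs named in the claim.
PERIODIC_KINDS = {"status", "capacity", "monitoring"}
BACKUP_THRESHOLD = 0.80  # from the claim: back up above 80% of log capacity


@dataclass
class LogEntry:
    timestamp: float  # seconds since some epoch
    kind: str         # assumed labels: "status", "progress", "error", "alarm", ...
    message: str


def filter_periodic(entries, period_seconds):
    """Keep at most one periodic entry per kind per time period (assumed policy)."""
    seen = set()
    kept = []
    for e in entries:
        if e.kind in PERIODIC_KINDS:
            bucket = (e.kind, int(e.timestamp // period_seconds))
            if bucket in seen:
                continue
            seen.add(bucket)
        kept.append(e)
    return kept


def trim_progress(entries, keep_last=1):
    """With no failure or alarm in the period, keep only the last progress entries."""
    if any(e.kind in {"error", "alarm"} for e in entries):
        return list(entries)
    progress = [e for e in entries if e.kind == "progress"]
    keep = {id(e) for e in progress[-keep_last:]} if keep_last > 0 else set()
    return [e for e in entries if e.kind != "progress" or id(e) in keep]


def needs_backup(used_bytes, capacity_bytes):
    """True when current log usage exceeds 80% of the log capacity."""
    return used_bytes > BACKUP_THRESHOLD * capacity_bytes
```

In this sketch a period containing any error or alarm is left untouched, so that the progress entries leading up to the failure remain available for matching.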
4. The log analysis method according to claim 1, wherein the step of screening abnormal error log information from the acquired log information comprises:
screening error log information from the acquired log information;
and screening, from the error log information, the error log information that is not caused by operations the platform cannot perform because of platform limitations, and taking it as the abnormal error log information.
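As a hedged illustration of the two-stage screening in claim 4, the sketch below first keeps error lines and then discards those whose error code marks a platform limitation. The log line layout, the `E` plus four digits code pattern and the contents of `PLATFORM_LIMIT_CODES` are hypothetical and are not taken from the patent.

```python
import re

# Hypothetical codes for operations the platform does not support.
PLATFORM_LIMIT_CODES = {"E1001", "E1002"}

# Assumed line layout: the word ERROR, optionally followed by a code such as E2040.
ERROR_LINE = re.compile(r"\bERROR\b(?:.*?\b(?P<code>E\d{4})\b)?")


def screen_abnormal(lines):
    """Return error lines that are not explained by a platform limitation."""
    abnormal = []
    for line in lines:
        match = ERROR_LINE.search(line)
        if match is None:
            continue  # stage 1: not an error line
        if match.group("code") in PLATFORM_LIMIT_CODES:
            continue  # stage 2: expected error caused by a platform limitation
        abnormal.append(line)
    return abnormal
```

Error lines that carry no code are treated as abnormal in this sketch, which errs on the side of keeping data for the corpus.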
5. The log analysis method according to claim 1, wherein the step of collating the abnormal error log information to generate the log corpus comprises:
analyzing the historical characteristic information, and extracting the log characteristics of the abnormal error log information;
based on the log characteristics, collating all abnormal error log information and the corresponding error codes to form an error reporting information base;
and collating the task types of the abnormal error log information and the corresponding proxy interface call lists to generate the log corpus.
6. The log analysis method of claim 5, wherein the log corpus has the format [task type] [error description] [platform error code] [task ID] [task time] [task invocation interface list] [specific error log].
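For illustration only, a corpus record in the seven-field bracketed format of claim 6 might be parsed as sketched below. The field names of `CorpusRecord`, the comma-separated interface list and the assumption that no field contains nested square brackets are choices made for this sketch.

```python
import re
from typing import List, NamedTuple


class CorpusRecord(NamedTuple):
    task_type: str
    error_description: str
    platform_error_code: str
    task_id: str
    task_time: str
    interface_list: List[str]
    error_log: str


# Matches one bracketed field; assumes fields contain no nested brackets.
FIELD = re.compile(r"\[([^\[\]]*)\]")


def parse_record(line: str) -> CorpusRecord:
    """Split a corpus line into its seven bracketed fields."""
    fields = [f.strip() for f in FIELD.findall(line)]
    if len(fields) != 7:
        raise ValueError(f"expected 7 bracketed fields, found {len(fields)}")
    # Assumed: interface names inside the sixth field are comma-separated.
    interfaces = [name.strip() for name in fields[5].split(",") if name.strip()]
    return CorpusRecord(fields[0], fields[1], fields[2], fields[3],
                        fields[4], interfaces, fields[6])
```

A real error log frequently contains square brackets of its own, so a production parser would likely need an escaping rule or a different delimiter for the last field.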
7. The log analysis method of claim 1, wherein the step of training the LSTM-DSSM model using the log corpus comprises:
obtaining, with a Word2Vec word vector model, word vector data for each piece of abnormal error log information and its corresponding features, training to obtain word vector data for the unprocessed error logs, and using the word vector data as input;
selecting a feature unit and compressing it by breaking each word token into letter n-grams as a reduced-dimension representation;
and using the hidden-layer vector output by the LSTM unit of the query as context, computing its dot product with the hidden-layer vector at each time step of the doc, using the results as weights for a weighted sum of the vectors at each time step, and computing the loss between the actual output and the expected output with a softmax cross-entropy function.
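The letter n-gram decomposition and the attention-weighted matching of claim 7 can be illustrated with the following minimal pure-Python sketch. It is not the patented implementation: the `#` boundary marker, the softmax normalisation of the dot-product weights and the cosine scoring of query against document are assumptions borrowed from common DSSM and attention practice, and the LSTM itself is omitted, so the hidden-state vectors are simply passed in.

```python
import math


def letter_ngrams(token, n=3):
    """Break a word token into letter n-grams, with '#' marking word boundaries."""
    padded = f"#{token}#"
    return [padded[i:i + n] for i in range(max(1, len(padded) - n + 1))]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query_vec, doc_states):
    """Weight each doc time step by its dot product with the query vector.

    Assumption: the dot products are normalised with softmax before the
    weighted sum; the claim only says the results are used as weights.
    """
    weights = softmax([dot(query_vec, h) for h in doc_states])
    dim = len(doc_states[0])
    return [sum(w * h[i] for w, h in zip(weights, doc_states)) for i in range(dim)]


def cosine(u, v):
    nu, nv = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
    return dot(u, v) / (nu * nv) if nu and nv else 0.0


def softmax_cross_entropy(scores, target_index):
    """Loss of the candidate scores against the index of the expected match."""
    return -math.log(softmax(scores)[target_index])


def match_loss(query_vec, candidate_docs, target_index):
    """Score every candidate doc against the query and compute the loss."""
    scores = [cosine(query_vec, attend(query_vec, states)) for states in candidate_docs]
    return softmax_cross_entropy(scores, target_index)
```

In training, `candidate_docs` would hold the hidden states of one matching log and several non-matching logs, and the loss would be minimised with respect to the LSTM and embedding parameters.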
8. A log analysis device, characterized in that the device is applied to a management node of a virtualization platform; historical log information is used as a corpus and, after preprocessing, a word vector model is constructed; log-information features are added to the text word-vector features and the result is input as word vectors into an LSTM-DSSM neural network to train a log matching model; for a newly occurring abnormal log, the error description and related log information are input into the trained log matching model so that the matching log information is found quickly; the device comprises:
the acquisition module is used for acquiring the log information of all the nodes;
the screening module is used for screening the abnormal error log information from the acquired log information;
the corpus module is used for sorting the abnormal error log information to generate log corpus;
the training module is used for training the LSTM-DSSM model by using the log corpus;
and the matching module is used for matching by using the trained LSTM-DSSM model when the abnormal error occurs to obtain the log information corresponding to the abnormal error.
9. A computer readable storage medium having stored thereon machine executable instructions which, when invoked and executed by a processor, cause the processor to execute the method of any of claims 1 to 7.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010888954.6A CN112256517B (en) 2020-08-28 2020-08-28 Log analysis method and device of virtualization platform based on LSTM-DSSM

Publications (2)

Publication Number Publication Date
CN112256517A CN112256517A (en) 2021-01-22
CN112256517B true CN112256517B (en) 2022-07-08

Family

ID=74224277

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010888954.6A Active CN112256517B (en) 2020-08-28 2020-08-28 Log analysis method and device of virtualization platform based on LSTM-DSSM

Country Status (1)

Country Link
CN (1) CN112256517B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113032175A (en) * 2021-03-17 2021-06-25 中国工商银行股份有限公司 Abnormal program classification method and device
CN113657461A (en) * 2021-07-28 2021-11-16 北京宝兰德软件股份有限公司 Log anomaly detection method, system, device and medium based on text classification

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105786782A (en) * 2016-03-25 2016-07-20 北京搜狗科技发展有限公司 Word vector training method and device
CN105809473A (en) * 2016-02-29 2016-07-27 北京百度网讯科技有限公司 Training method, service recommending method for coupling model parameters and associated apparatus
CN107145445A (en) * 2017-05-05 2017-09-08 携程旅游信息技术(上海)有限公司 Automatic analysis method and system for error logs of software automated testing
US20180270261A1 (en) * 2017-03-17 2018-09-20 Target Brands, Inc. Word embeddings for anomaly classification from event logs

Also Published As

Publication number Publication date
CN112256517A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
CN111241389B (en) Sensitive word filtering method and device based on matrix, electronic equipment and storage medium
CN107870845A (en) Towards the management method and system of micro services framework applications
CN112256517B (en) Log analysis method and device of virtualization platform based on LSTM-DSSM
CN112883190A (en) Text classification method and device, electronic equipment and storage medium
CN112491611A (en) Fault location system, method, apparatus, electronic device and computer readable medium
CN112560453A (en) Voice information verification method and device, electronic equipment and medium
CN112416778A (en) Test case recommendation method and device and electronic equipment
CN112732893B (en) Text information extraction method and device, storage medium and electronic equipment
CN112433874A (en) Fault positioning method, system, electronic equipment and storage medium
CN113778864A (en) Test case generation method and device, electronic equipment and storage medium
CN107871055B (en) Data analysis method and device
CN111831708A (en) Missing data-based sample analysis method and device, electronic equipment and medium
CN112613176A (en) Slow SQL statement prediction method and system
CN112579781A (en) Text classification method and device, electronic equipment and medium
CN110399026B (en) Multi-source single-output reset method and device based on FPGA and related equipment
CN115495587A (en) Alarm analysis method and device based on knowledge graph
CN112882707B (en) Rendering method and device, storage medium and electronic equipment
CN114896418A (en) Knowledge graph construction method and device, electronic equipment and storage medium
CN115048345A (en) Abnormal log detection method and device, electronic equipment and storage medium
CN112948478A (en) Link-based code analysis method and device, electronic equipment and storage medium
CN110647537A (en) Data searching method, device and storage medium
CN111311329B (en) Tag data acquisition method, device, equipment and readable storage medium
CN116483735B (en) Method, device, storage medium and equipment for analyzing influence of code change
CN113535594B (en) Method, device, equipment and storage medium for generating service scene test case
CN111177501B (en) Label processing method, device and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant