CN112463933A - Online extraction method and device for system log template - Google Patents

Online extraction method and device for system log template

Info

Publication number
CN112463933A
Authority
CN
China
Prior art keywords
log
template
word
log template
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011476333.3A
Other languages
Chinese (zh)
Inventor
孟伟彬
刘莹
裴丹
菲德利阁·扎特·特里尼达
何林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202011476333.3A
Publication of CN112463933A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/186 Templates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application provides an online extraction method and device for a system log template, and relates to the technical field of data processing. The method comprises the following steps: acquiring a log to be processed, and matching the log to be processed against a preset log template library; when no log template is matched, classifying each word in the log to be processed with a trained word classifier to obtain template words and variable words; and replacing the variable words with a target identifier, combining the target identifier and the template words into text to generate a new log template, and storing the new log template in the log template library. The log template can thus be extracted automatically, which facilitates subsequent log analysis tasks such as anomaly detection and fault prediction.

Description

Online extraction method and device for system log template
Technical Field
The present application relates to the field of data processing technologies, and in particular, to an online extraction method and an online extraction device for a system log template.
Background
The system log plays an important role in service management. Log template extraction is the first step in performing automated log analysis. To achieve the goal of automatic template extraction, many data-driven methods have been proposed.
In the related art, template extraction methods fall into several categories. The first category is clustering-based: a log template forms the natural pattern of a group of log messages. The second category is based on the longest common subsequence, for example using the longest common subsequence algorithm to parse logs in a stream. The third category is heuristic log parsing: unlike general text data, log messages have some unique characteristics, and heuristic methods exploit them. The last category is frequent item mining, in which a log template is regarded as a group of constant tokens that frequently appear in the logs.
However, the existing log template extraction method cannot be applied to online extraction and update of templates.
Summary of the Application
The present application is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, a first objective of the present application is to provide an online extraction method for a system log template, which solves the problem of online extraction of mass system log templates, and can extract a log template online to support incremental learning of software and hardware systems, so as to update a template set in increments, and thus a new type of log can be matched to the updated template set.
The second purpose of the present application is to provide an online extraction apparatus for a system log template.
In order to achieve the above object, an embodiment of a first aspect of the present application provides an online extraction method of a system log template, including:
acquiring a log to be processed, and matching the log to be processed in a preset log template library;
under the condition that a log template is not matched, classifying each word in the log to be processed by using a trained word classifier to obtain a template word and a variable word;
and replacing the variable word with a target identifier, combining the target identifier and the template word into text to generate a new log template, and storing the new log template in the log template library.
The online extraction method of the system log template obtains the log to be processed, and matches the log to be processed in a preset log template library; under the condition that the log template is not matched, classifying each word in the log to be processed by using a trained word classifier to obtain a template word and a variable word; and replacing the variable words with the target identifiers, combining the target identifiers and the template words into texts to generate new log templates, and storing the new log templates in a log template library. Therefore, the log template can be automatically extracted, and subsequent log analysis tasks such as abnormality detection and fault prediction can be conveniently carried out.
In an embodiment of the present application, the online extraction method of the system log template further includes:
acquiring a plurality of historical logs and extracting a plurality of log templates from the plurality of historical logs;
and storing the plurality of log templates to construct the log template library.
In an embodiment of the present application, the online extraction method of the system log template further includes:
classifying each word of the plurality of log templates to obtain a template vocabulary sample and a variable vocabulary sample;
and inputting the template vocabulary sample and the variable vocabulary sample into a neural network as labels for training, and obtaining the word classifier.
In an embodiment of the present application, the online extraction method of the system log template further includes:
marking the misclassified words in the new log template with their correct classification results;
and feeding the correct classification results back to the neural network.
In an embodiment of the present application, the online extraction method of the system log template further includes:
and under the condition of matching the log template, acquiring the matched log template as the log template of the log to be processed.
In order to achieve the above object, an embodiment of a second aspect of the present application provides an online extraction apparatus of a system log template, including:
the first acquisition module is used for acquiring a log to be processed and matching the log to be processed in a preset log template library;
the first classification module is used for classifying each word in the log to be processed by using a trained word classifier under the condition that the log template is not matched, and obtaining a template word and a variable word;
and the processing module is used for replacing the variable words with target identifiers, combining the target identifiers and the template words into texts to generate a new log template, and storing the new log template in the log template library.
The online extraction device of the system log template obtains the logs to be processed, and matches the logs to be processed in a preset log template library; under the condition that the log template is not matched, classifying each word in the log to be processed by using a trained word classifier to obtain a template word and a variable word; and replacing the variable words with the target identifiers, combining the target identifiers and the template words into texts to generate new log templates, and storing the new log templates in a log template library. Therefore, the log template can be automatically extracted, and subsequent log analysis tasks such as abnormality detection and fault prediction can be conveniently carried out.
In an embodiment of the present application, the apparatus for online extracting a system log template further includes:
the second acquisition module is used for acquiring a plurality of historical logs and extracting a plurality of log templates from the plurality of historical logs;
and the building module is used for storing the plurality of log templates and building the log template library.
In an embodiment of the present application, the online extracting apparatus of the system log template further includes:
the second classification module is used for classifying each word of the plurality of log templates to obtain a template vocabulary sample and a variable vocabulary sample;
and the training module is used for inputting the template vocabulary sample and the variable vocabulary sample into a neural network as labels for training to obtain the word classifier.
In an embodiment of the present application, the online extracting apparatus of the system log template further includes:
the marking module is used for marking the misclassified words in the new log template with their correct classification results;
and the feedback module is used for feeding back the correct classification result to the neural network.
In an embodiment of the present application, the online extracting apparatus of the system log template further includes:
and the processing module is further used for acquiring a matched log template as the log template of the log to be processed under the condition that the log template is matched.
Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a diagram illustrating an example of a log template extraction method according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an example of an online extraction method for a log template according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of an online extraction method of a system log template according to an embodiment of the present application;
FIG. 4 is an exemplary diagram of online extraction of a system log template according to an embodiment of the present application;
FIG. 5 is an exemplary diagram of word tags in an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an online extraction apparatus for a system log template according to an embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
The following describes an online extraction method and apparatus for a system log template according to an embodiment of the present application with reference to the drawings.
Specifically, a log is unstructured text generated by a "printf" function in the software and hardware systems of a data center. A log typically has a fixed structure containing fields such as a timestamp, a software and hardware system ID (identity), a message type, and a detailed message. The timestamp field represents the specific time at which the log was generated; the system ID field identifies the software or hardware system that generated the log; the message type field describes the summary characteristics of the log; and the detailed message field describes the specific event of the log. Generally, the syntax of the message type field and the detailed message field varies with the type, manufacturer, and model of the software and hardware system and has no uniform format, but it generally consists of a fixed part and a parameter part. The fixed part is information predefined by developers to represent certain events, and the parameter part is generated dynamically at run time from specific information such as the time sequence and the interacting devices. The message type field is the part of interest for the log template extraction method. In the field of natural language processing, there are methods related to document summarization, sentence compression, and clustering. However, natural language processing methods cannot parse logs accurately.
The technical problem solved by the application is the online extraction of log templates from massive system logs, with dynamic adjustment according to manual feedback from operation and maintenance engineers. The purpose of template extraction is to use a log template to represent log messages of the same type and to represent the original parameters in the logs with x. A schematic diagram of a log template extraction method is shown in FIG. 1.
The method aims to extract the log template on line so as to support incremental learning of software and hardware systems, and update the template set in increments, so that new types of logs can be matched to the updated template set.
Developers continuously update the software and hardware of online services and the underlying hardware devices to add new features, fix bugs, or improve performance, and these operations generate new log types (as shown in FIG. 2). Such logs cannot match any existing template, so new templates must be learned online and modified dynamically according to engineer feedback.
Fig. 3 is a flowchart illustrating an online extraction method of a system log template according to an embodiment of the present application.
As shown in FIG. 3, the online extraction method of the system log template includes the following steps:
Step 101, obtaining a log to be processed, and matching the log to be processed in a preset log template library.
Step 102, under the condition that the log template is not matched, classifying each word in the log to be processed by using the trained word classifier to obtain a template word and a variable word.
Step 103, replacing the variable words with the target identifiers, combining the target identifiers and the template words into texts to generate new log templates, and storing the new log templates in a log template library.
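Steps 101 to 103 can be sketched roughly as follows. This is a minimal illustration and not the implementation of the application: the placeholder symbol `*`, the function names, the whitespace tokenization, and the digit-based `toy_is_variable` rule (a stand-in for the trained word classifier) are all assumptions made for the example.

```python
from typing import Callable, List, Optional

PLACEHOLDER = "*"  # assumed target identifier; the text does not fix a symbol


def matches(template: str, log: str) -> bool:
    """Token-wise match: a placeholder token matches any single word."""
    t_tokens, l_tokens = template.split(), log.split()
    if len(t_tokens) != len(l_tokens):
        return False
    return all(t == PLACEHOLDER or t == w for t, w in zip(t_tokens, l_tokens))


def find_template(library: List[str], log: str) -> Optional[str]:
    """Step 101: look the log up in the template library."""
    for template in library:
        if matches(template, log):
            return template
    return None


def process_log(
    log: str,
    library: List[str],
    is_variable: Callable[[str], bool],
) -> str:
    """Return the template for `log`, extending `library` if none matches."""
    matched = find_template(library, log)
    if matched is not None:
        return matched
    # Step 102: classify each word as a template word or a variable word.
    # Step 103: replace variable words, join into text, store the new template.
    new_template = " ".join(
        PLACEHOLDER if is_variable(word) else word for word in log.split()
    )
    library.append(new_template)
    return new_template


def toy_is_variable(word: str) -> bool:
    # Hypothetical stand-in for the trained word classifier:
    # treat any word containing a digit as a variable word.
    return any(ch.isdigit() for ch in word)
```

Calling `process_log` on a hypothetical line such as "Interface ae3 changed state to down" with an empty library stores one new template; a later line that differs only in the variable word is then matched in step 101 and the library does not grow.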
In the embodiment of the application, a plurality of historical logs are obtained, and a plurality of log templates are extracted from the plurality of historical logs; and storing the plurality of log templates to construct the log template library.
In the embodiment of the application, each word of the plurality of log templates is classified, and a template vocabulary sample and a variable vocabulary sample are obtained; and inputting the template vocabulary sample and the variable vocabulary sample into a neural network as labels for training, and obtaining the word classifier.
In the embodiment of the application, misclassified words in the new log template are marked with their correct classification results, and the correct classification results are fed back to the neural network.
In the embodiment of the application, under the condition that the log template is matched, the matched log template is obtained and used as the log template of the log to be processed.
Specifically, the present application converts the template extraction problem into a word classification problem. In the offline part, templates are first extracted from the historical logs, and the words are then divided into template vocabulary and variable vocabulary based on those templates. The two word classes are used as labels to train a word classifier, and the trained binary classifier is then used to classify words as template words or variable words. When real-time logs are generated, the application matches them against the templates. If a real-time log cannot match any existing template, the method constructs a vector for each word in the log, marks each word as a template word or a variable word with the trained word classifier, combines the template words to obtain a new template, and adds the new template to the template set. Since the model that the present application contemplates implementing is an unsupervised model, the only data required is unstructured text logs, and no additional data labeling is required.
The application solves the log template extraction problem with a word classification method for the first time, converting the online template extraction problem into a word classification problem, and provides a user feedback function for the first time, so that the model can be updated online. A word-classifier-based online template extraction scheme is designed: a machine-learned text classification model is trained offline by learning the word features and their classes from the historical logs. In the online stage, only logs that cannot be matched against the template library need to be processed; this supplements traditional template extraction methods and requires no change to the existing log processing method and architecture.
Observation of the logs shows that the distinction between template vocabulary (e.g., "Interface") and variable vocabulary (e.g., "ae3") is readily apparent. The present application therefore converts the template generation problem into a word classification problem, namely determining whether a word belongs to the template vocabulary. The labels for the words are obtained with any existing conventional template extraction method. Next, all words are represented as vectors, and the features of the template words and the variable words are learned. Finally, the application generates a new template by combining the template words in the new type of log. The present application addresses the following three problems: 1) The number of distinct words in the logs is very large, and many words may appear only once; accurately representing and classifying these words is challenging. 2) Training the word classifier requires word labels, but raw logs do not carry a label for every word. 3) The log template extraction model cannot be updated online. FIG. 4 shows the design of the present application.
In particular, although conventional template extraction schemes cannot extract templates online, their accuracy in extracting templates offline is very high. Templates can therefore be extracted from offline data with a conventional template extraction method, and each word can then be labeled as a template word or a variable word by comparing the template file with the log file, as shown in FIG. 5.
Specifically, after the word labels are obtained, a word classifier is trained for online template extraction. Unlike traditional machine learning methods, the method adopts a Transformer model and does not need manually extracted features to represent each word. Instead, each word is embedded directly through deep-learning word embedding and represented as a high-dimensional semantic vector, and these vectors are used to train a deep-learning word classifier.
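The labeling step, comparing a template with a log it covers, might look like the sketch below. It assumes that the offline extractor marks variables with `*`, that logs and templates are whitespace-tokenized, and that each placeholder covers exactly one word; the function name `label_words` and the label strings are hypothetical.

```python
from typing import List, Tuple

PLACEHOLDER = "*"  # assumed wildcard symbol used by the offline extractor


def label_words(log: str, template: str) -> List[Tuple[str, str]]:
    """Label each word of `log` by aligning it with its template.

    Assumes the log and its template have the same number of
    whitespace-separated tokens and each placeholder covers one word.
    """
    log_tokens = log.split()
    template_tokens = template.split()
    if len(log_tokens) != len(template_tokens):
        raise ValueError("log and template must align token by token")
    labels = []
    for word, slot in zip(log_tokens, template_tokens):
        # A word that sits in a placeholder position is a variable word;
        # every other word is a template word.
        labels.append((word, "variable" if slot == PLACEHOLDER else "template"))
    return labels
```

The resulting (word, label) pairs are the kind of training samples the word classifier needs; real extractors may emit templates whose placeholders span several words, which this sketch does not handle.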
In particular, online updates are divided into two parts, respectively updates of new types of templates and updates of the classification model itself.
Specifically, to update the templates, each log is matched against the existing template library. If a log cannot be matched, each word in the new log is classified with the trained word classifier; the template words are retained and the variable words are replaced with 'x'. The resulting text is then appended to the end of the template library as a new template. For example, for the last log in FIG. 1, the word classifier classifies each word in "Vlan-interface Vlan20, changed state to up" separately and finds that all words are template words except "Vlan20", which is a variable word. "Vlan20" is then replaced with the same placeholder, and the result is combined with the remaining words into a template message.
Specifically, for the classification model itself, the application trains a Transformer neural network model offline, and the model supports updating its trained parameters from feedback information. In the application scenario of the present application, an operation and maintenance engineer can review the templates extracted online using expert knowledge. If a word is found to be classified incorrectly, its correct classification is fed back to the Transformer model, so that the model applies the engineer's correction the next time it classifies words in online logs. The whole log template extraction method thereby becomes an online-updatable model that accepts user feedback.
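One possible way to match logs against the template library is to compile each template into a regular expression. The sketch below assumes `*` as the placeholder and lets it stand for one whitespace-free token; the helper names are hypothetical, and the sample line is simplified from the example above with punctuation dropped.

```python
import re
from typing import List, Optional

PLACEHOLDER = "*"  # assumed placeholder symbol


def template_to_regex(template: str) -> re.Pattern:
    """Compile a template so each placeholder matches one non-space token."""
    parts = [
        r"\S+" if tok == PLACEHOLDER else re.escape(tok)
        for tok in template.split()
    ]
    return re.compile(r"\s+".join(parts))


def match_library(library: List[str], log: str) -> Optional[str]:
    """Return the first template that fully matches the log, else None."""
    for template in library:
        if template_to_regex(template).fullmatch(log.strip()):
            return template
    return None
```

With a library containing "Vlan-interface * changed state to up", the line "Vlan-interface Vlan20 changed state to up" is matched, while a line ending in "down" is not and would be passed on to the word classifier. In practice the compiled patterns would likely be cached rather than rebuilt for every log.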
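The feedback loop can be illustrated with a much simpler model than the one described. The sketch below swaps the Transformer for a tiny perceptron with hand-made features, purely to keep the example dependency-free; the class name, the `feedback` method, and the features are assumptions and are not part of the application, which learns word embeddings instead of hand-made features.

```python
from typing import Dict


def features(word: str) -> Dict[str, float]:
    # Hand-made features, used only to keep the sketch dependency-free.
    return {
        "bias": 1.0,
        "has_digit": float(any(c.isdigit() for c in word)),
        "has_alpha": float(any(c.isalpha() for c in word)),
        "word=" + word.lower(): 1.0,
    }


class OnlineWordClassifier:
    """Perceptron stand-in for the neural word classifier."""

    def __init__(self) -> None:
        self.weights: Dict[str, float] = {}

    def score(self, word: str) -> float:
        return sum(self.weights.get(k, 0.0) * v for k, v in features(word).items())

    def is_variable(self, word: str) -> bool:
        return self.score(word) > 0.0

    def feedback(self, word: str, is_variable: bool) -> None:
        """Apply one perceptron update if the current prediction is wrong."""
        if self.is_variable(word) == is_variable:
            return
        step = 1.0 if is_variable else -1.0
        for k, v in features(word).items():
            self.weights[k] = self.weights.get(k, 0.0) + step * v
```

An engineer's correction is passed in as `feedback(word, is_variable)`, and the changed weights affect how later words are classified. A neural model would instead take a gradient step on the corrected example, but the control flow (predict, compare with the engineer's label, update parameters) is the same.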
According to the online extraction method of the system log template, the logs to be processed are obtained, and the logs to be processed are matched in a preset log template library; under the condition that the log template is not matched, classifying each word in the log to be processed by using a trained word classifier to obtain a template word and a variable word; and replacing the variable words with the target identifiers, combining the target identifiers and the template words into texts to generate new log templates, and storing the new log templates in a log template library. Therefore, the log template can be automatically extracted, and subsequent log analysis tasks such as abnormality detection and fault prediction can be conveniently carried out.
In order to implement the above embodiments, the present application further provides an online extraction device for a system log template.
Fig. 6 is a schematic structural diagram of an online extraction apparatus for a system log template according to an embodiment of the present application.
As shown in fig. 6, the online extraction device of the system log template includes: a first obtaining module 610, a first classifying module 620 and a processing module 630.
The first obtaining module 610 is configured to obtain a log to be processed, and match the log to be processed in a preset log template library.
And a first classification module 620, configured to classify each word in the log to be processed by using a trained word classifier under the condition that the log template is not matched, so as to obtain a template word and a variable word.
And a processing module 630, configured to replace the variable word with a target identifier, combine the target identifier and the template word into text to generate a new log template, and store the new log template in the log template library.
In the embodiment of the application, the second obtaining module is configured to obtain a plurality of history logs and extract a plurality of log templates from the plurality of history logs; and the building module is used for storing the plurality of log templates and building the log template library.
In the embodiment of the application, the second classification module is configured to classify each word from the plurality of log templates, and obtain a template vocabulary sample and a variable vocabulary sample; and the training module is used for inputting the template vocabulary sample and the variable vocabulary sample into a neural network as labels for training to obtain the word classifier.
In the embodiment of the present application, the marking module is configured to mark the misclassified words in the new log template with their correct classification results; and the feedback module is used for feeding back the correct classification results to the neural network.
In this embodiment of the application, the processing module is further configured to, when the log template is matched, obtain a matched log template as the log template of the log to be processed.
The online extraction device of the system log template of the embodiment of the application matches the log to be processed in a preset log template library by acquiring the log to be processed; under the condition that the log template is not matched, classifying each word in the log to be processed by using a trained word classifier to obtain a template word and a variable word; and replacing the variable words with the target identifiers, combining the target identifiers and the template words into texts to generate new log templates, and storing the new log templates in a log template library. Therefore, the log template can be automatically extracted, and subsequent log analysis tasks such as abnormality detection and fault prediction can be conveniently carried out.
It should be noted that the explanation of the embodiment of the online extraction method for the system log template is also applicable to the online extraction device for the system log template of the embodiment, and details are not repeated here.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist physically on its own, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module. If implemented as a software functional module and sold or used as a stand-alone product, the integrated module may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A method for online extraction of a system log template, comprising:
acquiring a log to be processed, and matching the log to be processed against a preset log template library;
when no log template is matched, classifying each word in the log to be processed with a trained word classifier to obtain template words and variable words;
and replacing each variable word with a target identifier, combining the target identifiers and the template words into text to generate a new log template, and storing the new log template in the log template library.
2. The method for online extraction of a system log template as recited in claim 1, further comprising:
acquiring a plurality of historical logs and extracting a plurality of log templates from the plurality of historical logs;
and storing the plurality of log templates to construct the log template library.
3. The method for online extraction of a system log template as recited in claim 2, further comprising:
classifying each word of the plurality of log templates to obtain template vocabulary samples and variable vocabulary samples;
and inputting the template vocabulary samples and the variable vocabulary samples into a neural network as labels for training, to obtain the word classifier.
4. The method for online extraction of a system log template as recited in claim 3, further comprising:
labeling each misclassification in the new log template with the correct classification result;
and feeding the correct classification result back to the neural network.
5. The method for online extraction of a system log template as recited in claim 1, further comprising:
when a log template is matched, taking the matched log template as the log template of the log to be processed.
6. An apparatus for online extraction of a system log template, comprising:
a first acquisition module, configured to acquire a log to be processed and to match the log to be processed against a preset log template library;
a first classification module, configured to classify each word in the log to be processed with a trained word classifier when no log template is matched, to obtain template words and variable words;
and a processing module, configured to replace each variable word with a target identifier, combine the target identifiers and the template words into text to generate a new log template, and store the new log template in the log template library.
7. The apparatus for online extraction of a system log template of claim 6, further comprising:
a second acquisition module, configured to acquire a plurality of historical logs and to extract a plurality of log templates from the plurality of historical logs;
and a building module, configured to store the plurality of log templates to build the log template library.
8. The apparatus for online extraction of a system log template as defined in claim 7, further comprising:
a second classification module, configured to classify each word of the plurality of log templates to obtain template vocabulary samples and variable vocabulary samples;
and a training module, configured to input the template vocabulary samples and the variable vocabulary samples into a neural network as labels for training, to obtain the word classifier.
9. The apparatus for online extraction of a system log template as defined in claim 8, further comprising:
a marking module, configured to label each misclassification in the new log template with the correct classification result;
and a feedback module, configured to feed the correct classification result back to the neural network.
10. The apparatus for online extraction of a system log template of claim 6, wherein:
the processing module is further configured to take the matched log template as the log template of the log to be processed when a log template is matched.
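The flow of claims 1 and 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the `<*>` target identifier, the whitespace tokenization, the word-by-word matching rule, and the names `matches`, `toy_is_variable` and `extract_template` are all hypothetical, and the digit-based rule merely stands in for the trained neural-network word classifier.

```python
from typing import Callable, List

# Hypothetical target identifier used to replace variable words.
VARIABLE_TOKEN = "<*>"


def matches(template: str, log: str) -> bool:
    """Return True if the log fits the template word by word.

    Assumption: logs and templates are split on whitespace, and a
    VARIABLE_TOKEN in the template matches exactly one word.
    """
    t_words, l_words = template.split(), log.split()
    return len(t_words) == len(l_words) and all(
        t == VARIABLE_TOKEN or t == w for t, w in zip(t_words, l_words)
    )


def toy_is_variable(word: str) -> bool:
    """Stand-in for the trained word classifier (an assumption):
    any word containing a digit is treated as a variable word."""
    return any(ch.isdigit() for ch in word)


def extract_template(
    log: str,
    library: List[str],
    is_variable: Callable[[str], bool] = toy_is_variable,
) -> str:
    """Return the template for a log, extending the library if needed."""
    # Try the existing template library first (claim 5).
    for template in library:
        if matches(template, log):
            return template
    # No match: classify each word, replace variable words with the
    # target identifier, and store the new template (claim 1).
    words = [VARIABLE_TOKEN if is_variable(w) else w for w in log.split()]
    new_template = " ".join(words)
    library.append(new_template)
    return new_template
```

With an empty library, `extract_template("Connection from 10.0.0.1 closed", library)` would produce `Connection from <*> closed` and store it; a later log that differs only in the address then matches the stored template and adds nothing new.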

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011476333.3A CN112463933A (en) 2020-12-14 2020-12-14 Online extraction method and device for system log template

Publications (1)

Publication Number Publication Date
CN112463933A (en) 2021-03-09

Family

ID=74804276

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011476333.3A Pending CN112463933A (en) 2020-12-14 2020-12-14 Online extraction method and device for system log template

Country Status (1)

Country Link
CN (1) CN112463933A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113032226A (en) * 2021-05-28 2021-06-25 北京宝兰德软件股份有限公司 Method and device for detecting abnormal log, electronic equipment and storage medium
CN113239190A (en) * 2021-04-27 2021-08-10 天九共享网络科技集团有限公司 Document classification method and device, storage medium and electronic equipment
CN113590421A (en) * 2021-07-27 2021-11-02 招商银行股份有限公司 Log template extraction method, program product, and storage medium
CN114785606A (en) * 2022-04-27 2022-07-22 哈尔滨工业大学 Log anomaly detection method based on pre-training LogXLNET model, electronic device and storage medium
WO2022244106A1 (en) * 2021-05-18 2022-11-24 日本電信電話株式会社 Data conversion device, data conversion method, and data conversion program

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170185576A1 (en) * 2015-12-28 2017-06-29 International Business Machines Corporation Categorizing Log Records at Run-Time
CN110175158A (en) * 2019-05-23 2019-08-27 湖南大学 A kind of log template extraction method and system based on vectorization
CN110377576A (en) * 2019-07-24 2019-10-25 中国工商银行股份有限公司 Create method and apparatus, the log analysis method of log template
CN110659175A (en) * 2018-06-30 2020-01-07 中兴通讯股份有限公司 Log trunk extraction method, log trunk classification method, log trunk extraction equipment and log trunk storage medium
CN111708860A (en) * 2020-06-15 2020-09-25 北京优特捷信息技术有限公司 Information extraction method, device, equipment and storage medium

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113239190A (en) * 2021-04-27 2021-08-10 天九共享网络科技集团有限公司 Document classification method and device, storage medium and electronic equipment
CN113239190B (en) * 2021-04-27 2024-02-20 天九共享网络科技集团有限公司 Document classification method, device, storage medium and electronic equipment
WO2022244106A1 (en) * 2021-05-18 2022-11-24 日本電信電話株式会社 Data conversion device, data conversion method, and data conversion program
CN113032226A (en) * 2021-05-28 2021-06-25 北京宝兰德软件股份有限公司 Method and device for detecting abnormal log, electronic equipment and storage medium
CN113590421A (en) * 2021-07-27 2021-11-02 招商银行股份有限公司 Log template extraction method, program product, and storage medium
CN113590421B (en) * 2021-07-27 2024-04-26 招商银行股份有限公司 Log template extraction method, program product and storage medium
CN114785606A (en) * 2022-04-27 2022-07-22 哈尔滨工业大学 Log anomaly detection method based on pre-training LogXLNET model, electronic device and storage medium
CN114785606B (en) * 2022-04-27 2024-02-02 哈尔滨工业大学 Log anomaly detection method based on pretrained LogXLnet model, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210309
