CN112433874A - Fault positioning method, system, electronic equipment and storage medium - Google Patents

Fault positioning method, system, electronic equipment and storage medium

Info

Publication number
CN112433874A
Authority
CN
China
Prior art keywords
fault
log data
target
vocabulary
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011224701.5A
Other languages
Chinese (zh)
Inventor
孙雅伦
牟洪洋
孔涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Inspur Data Technology Co Ltd
Original Assignee
Beijing Inspur Data Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Inspur Data Technology Co Ltd filed Critical Beijing Inspur Data Technology Co Ltd
Priority to CN202011224701.5A priority Critical patent/CN112433874A/en
Publication of CN112433874A publication Critical patent/CN112433874A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/0703 Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation
    • G06F11/079 Root cause analysis, i.e. error or fault diagnosis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Quality & Reliability (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application discloses a fault positioning method comprising: acquiring log data to be detected from a target device, calculating a weight for each word according to how often the word occurs in the log data to be detected, and setting the N highest-weighted words as target keywords; generating an N-dimensional target feature vector from the word weights of all the target keywords, and comparing the target feature vector with preset feature vectors for similarity; judging whether there is a preset feature vector whose similarity to the target feature vector is greater than a preset value; and if so, determining the fault root cause corresponding to the preset feature vector most similar to the target feature vector, and generating a fault positioning result for the target device according to that fault root cause. Because the keywords are not defined manually, the errors that manual definition introduces are avoided and the accuracy of device fault positioning can be improved. The application also discloses a fault positioning system, an electronic device and a storage medium, which have the same beneficial effects.

Description

Fault positioning method, system, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of server hardware device management technologies, and in particular, to a fault location method, a fault location system, an electronic device, and a storage medium.
Background
With the development of machine learning and deep learning, intelligent operation and maintenance management of servers has entered a period of rapid development. Under the trend toward AIOps (Artificial Intelligence for IT Operations), intelligent fault monitoring, diagnosis and tracing technology, driven by multi-source server data, has attracted wide attention.
In the related art, fault diagnosis relies mainly on log data. During diagnosis, personnel with operation and maintenance experience must maintain a table of fault-log keywords, and when new log data arrives, faults are located by comparing it against those keywords. However, manually defined log keywords are highly subjective, and the volume of server operation logs and hardware logs keeps growing as the technology develops, so this approach has low diagnostic accuracy and does not scale with business expansion.
Therefore, how to improve the accuracy of device fault location is a technical problem that needs to be solved by those skilled in the art.
Disclosure of Invention
The application aims to provide a fault positioning method, a fault positioning system, electronic equipment and a storage medium, and the fault positioning accuracy of the equipment can be improved.
In order to solve the above technical problem, the present application provides a fault location method, including:
acquiring log data to be detected from a target device, calculating a weight for each word according to how often the word occurs in the log data to be detected, and setting the N highest-weighted words as target keywords;
generating an N-dimensional target feature vector from the word weights of all the target keywords, and comparing the target feature vector with the preset feature vectors in a rule base for similarity; the rule base comprises the correspondence between preset feature vectors and fault root causes, and each preset feature vector is the feature vector corresponding to fault log data;
judging whether the rule base contains a preset feature vector whose similarity to the target feature vector is greater than a preset value;
and if so, determining the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector, and generating a fault positioning result for the target device according to that fault root cause.
Optionally, calculating the weight of each vocabulary according to the occurrence frequency of each vocabulary in the to-be-detected log data includes:
performing preprocessing operation on the log data to be detected; the preprocessing operation comprises a format unifying operation, a vocabulary deleting operation and a vocabulary converting operation;
and calculating the weight of the vocabulary according to the occurrence frequency of each vocabulary in the preprocessed log data to be detected.
Optionally, calculating the weight of each vocabulary according to the occurrence frequency of each vocabulary in the to-be-detected log data includes:
and calculating the vocabulary weight of each vocabulary in the log data to be detected based on a TF-IDF algorithm.
Optionally, the process of constructing the rule base includes:
acquiring the fault log data and a fault root factor corresponding to each fault log data;
generating a corpus comprising all the fault log data, calculating through a TF-IDF algorithm the word weight of each word in the corpus within the corresponding fault log data, and generating a bag-of-words model; the bag-of-words model is a two-dimensional table in which each row represents a word in the corpus, each column represents one piece of fault log data, and each element is the word weight of that word in that fault log data;
taking, according to the bag-of-words model, the N highest-weighted words in each piece of fault log data as sample words, and generating the N-dimensional preset feature vector corresponding to that fault log data from the word weights of all the sample words;
and storing the corresponding relation between the preset feature vector of the same fault log data and the fault root factor into the rule base.
Optionally, the fault location result includes a fault location, a fault category, and a fault occurrence time.
Optionally, before determining whether a preset feature vector having a similarity greater than a preset value with the target feature vector exists in the rule base, the method further includes:
and calculating cosine values of the target characteristic vector and preset characteristic vectors in the rule base, and taking the cosine values as the similarity of the target characteristic vector and the preset characteristic vectors.
Optionally, after generating the fault location result of the target device according to the fault root cause, the method further includes:
sending the fault positioning result to a man-machine interaction interface so that a user can input evaluation information conveniently;
and when the evaluation information is a diagnosis error, receiving an actual fault root factor input by a user, and updating the rule base by using the actual fault root factor and the target characteristic vector.
The present application further provides a fault location system, comprising:
a keyword determining module, used for acquiring log data to be detected from a target device, calculating a weight for each word according to how often the word occurs in the log data to be detected, and setting the N highest-weighted words as target keywords;
a characteristic comparison module, used for generating an N-dimensional target feature vector from the word weights of all the target keywords and comparing the target feature vector with the preset feature vectors in a rule base for similarity; the rule base comprises the correspondence between preset feature vectors and fault root causes, and each preset feature vector is the feature vector corresponding to fault log data;
a positioning module, used for judging whether the rule base contains a preset feature vector whose similarity to the target feature vector is greater than a preset value; and if so, determining the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector, and generating a fault positioning result for the target device according to that fault root cause.
The present application further provides a storage medium having a computer program stored thereon, which when executed, implements the steps performed by the above-described fault location method.
The application also provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps executed by the fault positioning method when calling the computer program in the memory.
The application provides a fault positioning method comprising: acquiring log data to be detected from a target device, calculating a weight for each word according to how often the word occurs in the log data to be detected, and setting the N highest-weighted words as target keywords; generating an N-dimensional target feature vector from the word weights of all the target keywords, and comparing the target feature vector with the preset feature vectors in a rule base for similarity, where the rule base comprises the correspondence between preset feature vectors and fault root causes and each preset feature vector is the feature vector corresponding to fault log data; judging whether the rule base contains a preset feature vector whose similarity to the target feature vector is greater than a preset value; and if so, determining the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector, and generating a fault positioning result for the target device according to that fault root cause.
In this application, after the log data to be detected of the target device is obtained, the word weights are computed from how often each word occurs in that data, and the N highest-weighted words are used as target keywords. A target feature vector is generated from the target keywords for feature comparison, and a fault positioning result is generated according to the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector. Because the target keywords used to construct the target feature vector are determined from word occurrence frequencies in the log data to be detected, the errors introduced by manually defined keywords are avoided and the accuracy of device fault positioning can be improved. The application also provides a fault positioning system, an electronic device and a storage medium, which have the same beneficial effects and are not described again here.
Drawings
In order to more clearly illustrate the embodiments of the present application, the drawings needed for the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art without inventive effort.
Fig. 1 is a flowchart of a fault location method according to an embodiment of the present application;
Fig. 2 is a flowchart of a method for constructing a rule base according to an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a system for tracing a fault root cause based on device log mining according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of a fault location system according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a flowchart of a fault location method according to an embodiment of the present disclosure.
The specific steps may include:
S101: Acquiring log data to be detected from a target device, calculating a weight for each word according to how often the word occurs in the log data to be detected, and setting the N highest-weighted words as target keywords;
This embodiment may be applied to a fault diagnosis device, which may be connected to a target device and performs fault positioning according to the log data to be detected of that target device. This embodiment does not limit the type or number of target devices.
When this embodiment is applied to a server, the execution subject may be the server's fault diagnosis system, which may use a machine data collection module to collect, in batches, the log data generated by the server hardware devices. This embodiment may also use an open source infrastructure operation and maintenance tool to distribute a toolkit, without an agent, to the managed machines other than the server, so as to maintain the server assets. On the interactive page, a user can select which machines to collect data from, customize the time range of data collection, and customize the combination of modules (for example, selecting only the BMC module, the BIOS module and the hard disk module). The selection is then sent to the managed machines, and once the toolkit has finished collecting, the data is packaged and returned to the fault diagnosis system.
After the log data to be detected of the target device is obtained, the word weight can be calculated according to how often each word occurs in that data, and the N highest-weighted words are set as target keywords. Specifically, this embodiment may calculate the word weight of each word in the log data to be detected based on the TF-IDF algorithm. Selecting log keywords with TF-IDF, a statistical feature extraction algorithm, offers better accuracy and scalability than the current practice of having service personnel screen keywords manually. The log keywords are used as feature items to construct a log feature vector, so that unstructured log data, which is hard for a computer to process, is converted into an N-dimensional vector that facilitates the subsequent similarity comparison.
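The keyword selection described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `tf_idf_weights` and `top_n_keywords` are hypothetical, the logs are assumed to be already tokenized, the reference corpus is assumed to be the set of known fault logs, and the add-one smoothing in the IDF term is an assumption introduced here so that a word absent from the corpus does not cause a division by zero.

```python
import math
from collections import Counter


def tf_idf_weights(doc_tokens, corpus_tokens):
    """Weight each word of doc_tokens by TF * IDF against corpus_tokens.

    doc_tokens: list of words from the log under inspection.
    corpus_tokens: list of token lists, one per reference log.
    """
    counts = Counter(doc_tokens)
    total = len(doc_tokens)
    n_docs = len(corpus_tokens)
    weights = {}
    for word, count in counts.items():
        tf = count / total
        containing = sum(1 for doc in corpus_tokens if word in doc)
        # Assumption: add-one smoothing, not stated in the application.
        idf = math.log((n_docs + 1) / (containing + 1)) + 1
        weights[word] = tf * idf
    return weights


def top_n_keywords(weights, n):
    """Return the n highest-weighted (word, weight) pairs, ties broken by word."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:n]


corpus = [["fan", "speed", "low"], ["disk", "error", "disk"]]
log = ["disk", "error", "disk", "timeout"]
keywords = top_n_keywords(tf_idf_weights(log, corpus), 2)
```

The weights of the selected keywords, taken in ranking order, then serve as the coordinates of the N-dimensional feature vector.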
S102: generating N-dimensional target characteristic vectors according to the vocabulary weights of all target keywords, and comparing the similarity of the target characteristic vectors with preset characteristic vectors in a rule base;
Once the target keywords of the log data to be detected have been obtained, the N target keywords can be regarded as an N-dimensional coordinate system in which the word weight of each target keyword is the corresponding coordinate value. One unit of data is thereby represented as a vector in N-dimensional space, which gives the N-dimensional target feature vector. In this way, unstructured log data can be converted into structured feature vectors.
Before this step, there may also be an operation of constructing a rule base, where the rule base includes preset feature vectors corresponding to the fault log data, and may also include a fault root cause corresponding to each preset feature vector, that is, the rule base includes a corresponding relationship between the preset feature vectors and the fault root causes.
In this embodiment, the target feature vector may be compared with each preset feature vector in the rule base one by one, so as to obtain the similarity between the target feature vector and each preset feature vector. The greater the similarity between the target feature vector and a preset feature vector, the more likely it is that the actual fault root cause of the target device is the fault root cause corresponding to that preset feature vector.
S103: judging whether a preset feature vector with the similarity larger than a preset value with the target feature vector exists in a rule base; if yes, go to step S104; if not, the flow ends.
If no preset feature vector has a similarity to the target feature vector greater than the preset value, the target device does not have a fault root cause corresponding to any preset feature vector. It can then be determined either that the target device has no fault or that it has an unknown fault, and a prompt that this fault diagnosis failed can be output.
As a possible implementation manner, before determining whether there is a preset feature vector in the rule base, whose similarity to the target feature vector is greater than a preset value, cosine values of the target feature vector and the preset feature vector in the rule base are further calculated, and the cosine values are used as the similarity of the target feature vector and the preset feature vector.
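A minimal Python sketch of this similarity step follows. It assumes each feature vector is stored as a mapping from keyword to weight, so that vectors built from different keyword sets can be aligned on their shared words; that storage choice and the function name `cosine_similarity` are illustrative assumptions rather than details given in the application.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (dict: word -> weight)."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty vector is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


target = {"disk": 0.7, "timeout": 0.5}
preset = {"disk": 0.6, "error": 0.2}
similarity = cosine_similarity(target, preset)
```

Because word weights are non-negative, the resulting similarity lies between 0 and 1, which makes a single preset threshold straightforward to apply.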
S104: Determining the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector, and generating a fault positioning result for the target device according to that fault root cause.
Once it has been established that the rule base contains a preset feature vector whose similarity to the target feature vector is greater than the preset value, this step determines the preset feature vector most similar to the target feature vector and generates the fault positioning result of the target device according to the corresponding fault root cause. Specifically, the fault positioning result may include the fault location, the fault category and the fault occurrence time of the target device. The fault location and fault category can be determined from the fault root cause, and the fault occurrence time can be determined from the log generation time of the log data to be detected.
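Steps S103 and S104 can be sketched as below. The sketch assumes the similarities have already been computed, and every name in it (`RootCause`, `FaultLocationResult`, `locate_fault`, the field names and the sample values) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RootCause:
    location: str  # hypothetical example: "hard disk 3"
    category: str  # hypothetical example: "media error"


@dataclass(frozen=True)
class FaultLocationResult:
    location: str
    category: str
    occurred_at: str  # taken from the generation time of the inspected log


def locate_fault(scored, threshold, log_time) -> Optional[FaultLocationResult]:
    """scored: list of (similarity, RootCause) pairs from the comparison step."""
    candidates = [(sim, cause) for sim, cause in scored if sim > threshold]
    if not candidates:
        # No known fault matched; the caller may report a failed diagnosis.
        return None
    _, best = max(candidates, key=lambda pair: pair[0])
    return FaultLocationResult(best.location, best.category, log_time)


scored = [
    (0.40, RootCause("fan 1", "speed low")),
    (0.92, RootCause("hard disk 3", "media error")),
    (0.85, RootCause("memory slot 2", "ECC error")),
]
result = locate_fault(scored, threshold=0.8, log_time="2020-11-05 10:00:00")
```

Returning `None` when nothing exceeds the threshold keeps the "no fault or unknown fault" case separate from a successful match.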
As a possible implementation manner, after the fault positioning result of the target device is generated according to the fault root cause, the result may be sent to a human-computer interaction interface so that a user can input evaluation information; when the evaluation information indicates a diagnosis error, the actual fault root cause input by the user is received, and the rule base is updated with the actual fault root cause and the target feature vector. In this embodiment, hardware device logs are collected and the statistics-based feature extraction scheme TF-IDF is applied, with the algorithm adapted to the characteristics of device logs: phrases are extracted as features to build a feature vector model of the logs, abnormal logs are distinguished, and fault root causes are looked up according to the rule base, thereby tracing and locating the root causes of server faults.
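The feedback loop described above might look like the following sketch. The rule base is assumed to be a plain list of (feature vector, root cause) pairs, and the function name `apply_feedback` and the evaluation string `"diagnosis error"` are hypothetical placeholders for whatever the interaction interface actually returns.

```python
def apply_feedback(rule_base, target_vector, evaluation, actual_root_cause=None):
    """Add a corrected entry to rule_base when the user reports a wrong diagnosis.

    Returns True if the rule base was updated, False otherwise.
    """
    if evaluation != "diagnosis error":
        return False  # the diagnosis was accepted, nothing to learn
    if actual_root_cause is None:
        raise ValueError("a diagnosis error must come with the actual root cause")
    # Store the vector as a tuple so later edits to the caller's list cannot alter it.
    rule_base.append((tuple(target_vector), actual_root_cause))
    return True


rule_base = []
updated = apply_feedback(rule_base, [0.5, 0.2], "diagnosis error", "fan failure")
```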
In this embodiment, after the log data to be detected of the target device is acquired, the word weights are computed from how often each word occurs in that data, and the N highest-weighted words are used as target keywords. A target feature vector is generated from the target keywords for feature comparison, and a fault positioning result is generated according to the fault root cause corresponding to the preset feature vector in the rule base that is most similar to the target feature vector. Because the target keywords used to construct the target feature vector are determined from word occurrence frequencies in the log data to be detected, the errors introduced by manually defined keywords are avoided and the accuracy of device fault positioning can be improved.
As a further description of the corresponding embodiment of fig. 1, the process of calculating the vocabulary weight according to the occurrence frequency of each vocabulary in the log data to be checked in S101 may include the following steps: performing preprocessing operation on the log data to be detected; the preprocessing operation comprises a format unifying operation, a vocabulary deleting operation and a vocabulary converting operation; and calculating the weight of the vocabulary according to the occurrence frequency of each vocabulary in the preprocessed log data to be detected.
As a possible implementation manner, the present embodiment may utilize a data preprocessing module to implement preprocessing on the log data to be checked. Specifically, the pretreatment operation may include the following three steps:
Step 1: Directly acquired log data to be detected usually arrives as compressed data packets, so the packets are first decompressed to obtain the log data. The log data is then classified by time, machine and device (different hardware devices output data in different formats and types), and a format unifying operation is applied so that all the log data to be detected shares the same log format.
Classifying the log data in this way and unifying the log formats collected by the modules makes the data easier for subsequent modules to process.
Step 2: Processing the actual content of the log data to be detected, which may include expanding abbreviations, deleting special characters, deleting stop words, and restoring or extracting word roots.
Expanding an abbreviation means restoring it to its full form; for example, the abbreviation CPU can be expanded to Central Processing Unit. Deleting special characters means removing special characters, which may include punctuation marks, from the log data to be detected. Deleting stop words means removing stop words from the log data to be detected; this embodiment can maintain a stop word list and delete every word on that list from the log data. Restoring or extracting the root means reducing an inflected word (such as a past-tense or progressive form) to its root; for example, watch, watched and watching share the same meaning, and removing the affixes to keep the root watch yields the basic form of the word, which reduces the amount of data to process.
Step 3: This embodiment can maintain a useless-word lexicon containing words that belong to the basic log format (such as on, Info, Index and 2019) and words with a high inverse word frequency (IDF). The lexicon is used to remove useless noise from the log data, reducing the amount of data for subsequent feature extraction (current standard device logs are in English, so word segmentation is relatively simple) and preparing for subsequent text feature extraction and anomaly detection. Each entry of the lexicon is represented as $[I_i, Iw_i]$, where $I_i$ denotes the word and $Iw_i$ denotes its inverse word frequency. By maintaining this lexicon and using the inverse word frequency to screen out highly generic words, this embodiment greatly reduces the subsequent data processing load.
In the preprocessing process, the vocabulary deleting operation comprises the following steps: deleting special characters, stop words and useless words; the vocabulary conversion operation includes: expanding abbreviations, restoring or extracting roots. By the method, the data processing amount can be reduced, and the target keyword recognition rate can be improved.
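The word deleting and word converting operations can be illustrated with a short Python sketch. The abbreviation table, the stop word set and the suffix list below are small hypothetical samples, and the suffix stripping is a deliberately crude stand-in for a real stemmer; none of these specifics come from the application.

```python
import re

ABBREVIATIONS = {"cpu": "central processing unit"}      # hypothetical sample
STOP_WORDS = {"the", "a", "is", "on", "info", "index"}  # hypothetical sample
SUFFIXES = ("ing", "ed", "s")  # crude suffix stripping, not a real stemmer


def strip_suffix(word):
    """Remove the first matching suffix if at least three letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(line):
    """Turn one raw log line into a list of normalized words."""
    # Delete special characters (anything that is not a letter, digit or space).
    text = re.sub(r"[^a-z0-9\s]", " ", line.lower())
    tokens = []
    for token in text.split():
        # Delete stop words and bare numbers such as years.
        if token in STOP_WORDS or token.isdigit():
            continue
        # Expand abbreviations, then reduce each resulting word to its root.
        expanded = ABBREVIATIONS.get(token, token)
        tokens.extend(strip_suffix(part) for part in expanded.split())
    return tokens


words = preprocess("Info: CPU watched the fan, watching on 2019!")
```

In this sample, both "watched" and "watching" collapse to "watch", which is the data reduction effect the root extraction step is meant to achieve.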
Referring to fig. 2, fig. 2 is a flowchart of a rule base construction method provided in an embodiment of the present application, where this embodiment is a construction process of a rule base mentioned in an embodiment corresponding to fig. 1, and a further embodiment can be obtained by combining this embodiment with the embodiment corresponding to fig. 1, where this embodiment may include the following steps:
S201: Acquiring the fault log data and the fault root cause corresponding to each piece of fault log data;
S202: Generating a corpus comprising all the fault log data, calculating through a TF-IDF algorithm the word weight of each word in the corpus within the corresponding fault log data, and generating a bag-of-words model;
The bag-of-words model is a two-dimensional table in which each row represents a word in the corpus, each column represents one piece of fault log data, and each element is the word weight of that word in that fault log data;
S203: Taking, according to the bag-of-words model, the N highest-weighted words in each piece of fault log data as sample words, and generating the N-dimensional preset feature vector corresponding to that fault log data from the word weights of all the sample words;
S204: Storing the correspondence between the preset feature vector of each piece of fault log data and its fault root cause in the rule base.
The rule base in this embodiment may include a corresponding relationship between a plurality of preset feature vectors and a fault root cause, and in the fault diagnosis process, the target feature vector of the log data to be detected may be compared with the preset feature vectors, and then a fault location result may be output according to the similarity comparison result.
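Steps S201 to S204 can be sketched as follows. This is an illustrative sketch only: `build_rule_base` is a hypothetical name, the fault logs are assumed to be non-empty token lists, IDF is assumed to take the common form log(N / N(x)), and preset feature vectors are stored as keyword-to-weight mappings so that words with zero weight can be left out.

```python
import math
from collections import Counter


def build_rule_base(fault_logs, n):
    """fault_logs: list of (tokens, root_cause). Returns (bag_of_words, rule_base)."""
    docs = [tokens for tokens, _ in fault_logs]
    vocabulary = sorted({word for doc in docs for word in doc})
    # Number of fault logs that contain each word.
    doc_freq = Counter(word for doc in docs for word in set(doc))
    # Bag-of-words table: one row per word, one column per fault log.
    bag = {
        word: [
            (doc.count(word) / len(doc)) * math.log(len(docs) / doc_freq[word])
            for doc in docs
        ]
        for word in vocabulary
    }
    rule_base = []
    for col, (_, root_cause) in enumerate(fault_logs):
        column = {word: row[col] for word, row in bag.items() if row[col] > 0}
        top = sorted(column.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        rule_base.append((dict(top), root_cause))
    return bag, rule_base


fault_logs = [
    (["disk", "error", "disk"], "disk failure"),
    (["fan", "error", "slow"], "fan failure"),
]
bag, rule_base = build_rule_base(fault_logs, n=2)
```

In this toy corpus the word "error" appears in every fault log, so its IDF and therefore its weight are zero and it is never selected as a sample word. A log with fewer than N positively weighted words yields a vector with fewer than N entries in this sketch.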
The flow described in the above embodiment is explained below by an embodiment in practical use. Referring to fig. 3, fig. 3 is a schematic structural diagram of a system for tracing a fault root cause based on device log mining according to an embodiment of the present disclosure. The embodiment provides a system for tracing fault root cause based on equipment log mining, which may include the following functional modules: the system comprises a machine data collection module, a data preprocessing module, a characteristic module, an exception filtering and positioning module and a user interaction module.
The machine data collection module collects, in batches, the log data generated by the server hardware devices.
The data preprocessing module preprocesses the log data collected by the machine data collection module.
The characteristic module extracts feature words from text data by means of an algorithm and converts the text data into structured data that a computer can process easily. This embodiment can maintain a basic log corpus and a bag-of-words model, where the corpus contains all the basic log data of the hardware devices (in this embodiment, one log is referred to as one unit of data). The bag-of-words model is a two-dimensional table generated from the corpus: its rows represent all the words contained in the corpus, its columns represent the unit logs of the corpus, and each element is the weight of a given word in a given unit of data. This embodiment can calculate the weight of each word in the unit data through the statistics-based TF-IDF algorithm.
The word frequency (term frequency, TF) is the ratio of the number of occurrences of a word to the total number of words in the document, and is calculated as follows:
$$TF = \frac{C(x)}{C}$$
where TF is the word frequency, C(x) is the number of occurrences of the word x, and C is the total number of words in the document.
The inverse document frequency is calculated as follows:
$$IDF(x) = \log\frac{N}{N(x)}$$
where IDF(x) is the inverse document frequency of the word x, N is the total number of texts in the corpus, and N(x) is the number of texts in the corpus that contain the word x.
In this embodiment, the weight of each word in the unit data can be calculated from the word frequency TF and the inverse document frequency IDF(x); specifically, the product of TF and IDF(x) can be used as the weight of the word.
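As a worked illustration with assumed numbers (and assuming the common form IDF(x) = ln(N / N(x)) with the natural logarithm, which the text does not fix), suppose the word x occurs 3 times in a unit log of 100 words, and 10 of the 1000 texts in the corpus contain x:

$$TF = \frac{3}{100} = 0.03, \qquad IDF(x) = \ln\frac{1000}{10} = \ln 100 \approx 4.605, \qquad w(x) = TF \cdot IDF(x) \approx 0.138$$

Under this form, a word that appears in every text gets IDF(x) = ln 1 = 0, so its weight vanishes however often it occurs in a single log.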
In this embodiment, the top N words ranked by vocabulary weight can be selected as the keywords of the log data. The keywords $t_1, t_2, t_3, \ldots, t_N$ can be regarded as the axes of an N-dimensional coordinate system, with the weights as the corresponding coordinate values, so that one unit of data is represented as a vector in N-dimensional space. The feature vector data structure in this embodiment is represented as $\{D_N[w_1, w_2, \ldots, w_N]\}$, where $w_i$ is the weight of the i-th keyword and $1 \le i \le N$.
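The selection of the top N keywords and the resulting vector may be sketched as follows. This is an illustration only: the function name `top_n_feature_vector`, the alphabetical tie-breaking rule, and the sample weights are assumptions that the embodiment does not specify.

```python
def top_n_feature_vector(weights, n):
    """Return (keywords, vector) for the n highest-weighted words.

    `weights` maps each word of one unit log to its vocabulary weight.
    """
    # Sort by weight, highest first; ties are broken alphabetically
    # (an assumption) so that the result is deterministic.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    top = ranked[:n]
    keywords = [word for word, _ in top]
    vector = [weight for _, weight in top]
    return keywords, vector

# Hypothetical weights for one unit log.
weights = {"fan": 0.16, "error": 0.37, "disk": 0.37, "speed": 0.05}
keywords, vector = top_n_feature_vector(weights, 3)
```

Here `keywords` plays the role of the axes t_1 ... t_N and `vector` holds the coordinates w_1 ... w_N.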
The anomaly filtering and positioning module screens out normal log data from the input unit log data by classification, and retains and classifies the abnormal log data. It can calculate cosine values from the vector data generated by the characteristic module to obtain similarities and thereby complete the classification.
Specifically, the cosine value may be calculated by the following formula:
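$$sim_{ij} = \cos(D_i, D_j) = \frac{\sum_{k=1}^{N} w_{ik}\, w_{jk}}{\sqrt{\sum_{k=1}^{N} w_{ik}^{2}}\;\sqrt{\sum_{k=1}^{N} w_{jk}^{2}}}$$
where $sim_{ij}$ is the similarity between the target feature vector $D_i$ and the preset feature vector $D_j$, and $w_{ik}$ and $w_{jk}$ are the weights of the k-th keyword in $D_i$ and $D_j$ respectively.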
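A minimal sketch of the cosine similarity between two feature vectors follows. It assumes both vectors have the same dimension N and share the same keyword order, and it treats a zero vector as having similarity 0 to everything; that convention and the function name `cosine_similarity` are assumptions for illustration.

```python
import math

def cosine_similarity(d_i, d_j):
    """Cosine of the angle between two equal-length weight vectors."""
    if len(d_i) != len(d_j):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(d_i, d_j))
    norm_i = math.sqrt(sum(a * a for a in d_i))
    norm_j = math.sqrt(sum(b * b for b in d_j))
    if norm_i == 0.0 or norm_j == 0.0:
        return 0.0  # assumption: a zero vector matches nothing
    return dot / (norm_i * norm_j)

# Parallel vectors score close to 1, orthogonal vectors score 0.
parallel = cosine_similarity([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Because TF-IDF weights are non-negative, the value falls between 0 and 1 for such vectors, which makes a fixed preset threshold straightforward to apply.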
The anomaly filtering and positioning module can locate an input abnormal vector using the preset feature vectors in the rule base, and the positioning result is output through the user interaction module to complete fault monitoring.
The user interaction module implements the following two functions. Function 1: the module allows an expert to enter common fault log information and fault root causes; it then sends a request to the data preprocessing module, synchronously updates the basic log corpus and the bag-of-words model, extracts the features of the fault log information, and numbers and stores the fault root causes to generate the rule base. Function 2: on a page, a user chooses to collect log data from different devices and run fault detection; after diagnosis the system returns the diagnosis result, and the user informs the system whether the diagnosis is correct. If the diagnosis is wrong, the user enters the new fault root cause data and the rule base is updated.
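The feedback path of function 2 may be sketched as follows, assuming the rule base is held in memory as a list of entries that pair a feature vector with a root cause. The entry layout, the function name `apply_feedback`, and the sample values are hypothetical; the embodiment does not describe its storage format.

```python
def apply_feedback(rule_base, target_vector, diagnosis_correct,
                   actual_root_cause=None):
    """Add a corrected rule when the user reports a wrong diagnosis.

    Returns True if the rule base was updated, False otherwise.
    """
    if diagnosis_correct:
        return False  # nothing to learn from a confirmed diagnosis
    if actual_root_cause is None:
        raise ValueError("a wrong diagnosis needs the actual root cause")
    # Store a copy so later changes to the caller's vector do not leak in.
    rule_base.append({"vector": dict(target_vector),
                      "root_cause": actual_root_cause})
    return True

# Hypothetical rule base and a misdiagnosed target vector.
rule_base = [{"vector": {"fan": 0.8}, "root_cause": "fan failure"}]
target = {"psu": 0.7, "voltage": 0.5}
updated = apply_feedback(rule_base, target, diagnosis_correct=False,
                         actual_root_cause="power supply failure")
```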
The characteristic module in this embodiment processes the log data of server hardware devices with a statistics-based feature extraction algorithm and uses feature vectors to distinguish log data, applying this to log processing and fault location services. Extracting keywords by algorithm reduces the manual workload and the deviation caused by subjectivity. This embodiment provides a complete, independent and small system that can be flexibly combined with other server management systems. The user interaction module additionally provides an expert entry through which the corpus can be continuously updated, so that the rule base stays up to date and the accuracy of fault positioning improves.
The fault root cause tracing process based on equipment log mining comprises the following steps:
Stage one: preparation phase
1. Service personnel upload the basic log data and fault error log files, sort out the fault sources and fault categories, and enter them into the system. The system preprocessing module deletes redundant data and updates the corpus and the stop-word list.
2. The characteristic module receives the data, screens out normal log data, calculates the vocabulary weights of the error logs using the basic log corpus of the preprocessing module, extracts the log text features, and constructs a feature vector from all feature items.
3. The anomaly filtering and positioning module receives the data and generates the rule base from the fault sources and fault categories entered by the service personnel together with the feature vectors. In this embodiment, the similarity between logs can be calculated with cosine similarity to classify the logs, and the fault diagnosis rule base is generated from this classification.
Stage two: diagnostic phase
1. The user uses the user interaction module to send a log collection toolkit to the managed machine.
2. The preprocessing module receives the logs captured by the toolkit, decompresses them, and deletes redundant data.
3. The characteristic module receives the data, calculates the vocabulary weights of the log data according to the bag-of-words model, selects the top N words by weight, and generates a feature vector.
4. The anomaly filtering and positioning module receives the data, calculates the similarity between the input data and the vectors in the rule base with the cosine similarity formula, and locates the fault category and the fault root cause.
5. The result is output to the user interaction module, and the fault diagnosis and positioning result is displayed in the interface.
This embodiment provides a fault root cause tracing scheme based on device log mining, which uses statistics-based feature extraction to turn unstructured data such as logs into structured data that a computer can process easily. Logs are handled per server hardware device and processed by algorithm, so that the data of multiple devices is processed uniformly, which reduces the difficulty of troubleshooting and improves the efficiency of fault diagnosis and the accuracy of positioning. Further, the system of this embodiment combines flexibly with the management server client and places lower performance requirements on the device than a scheme that uses machine learning to generate the rule base.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a fault location system according to an embodiment of the present disclosure;
the system may include:
the keyword determining module 100 is configured to obtain log data to be detected of a target device, calculate vocabulary weights according to the occurrence frequency of each word in the log data to be detected, and set the top N words ranked by vocabulary weight as target keywords;
the feature comparison module 200 is configured to generate an N-dimensional target feature vector according to the vocabulary weights of all the target keywords, and compare the similarity between the target feature vector and the preset feature vectors in a rule base; the rule base comprises a corresponding relation between preset feature vectors and fault root causes, and a preset feature vector is the feature vector corresponding to fault log data;
a positioning module 300, configured to determine whether a preset feature vector whose similarity to the target feature vector is greater than a preset value exists in the rule base; if so, determine the fault root cause corresponding to the preset feature vector with the highest similarity to the target feature vector in the rule base, and generate a fault positioning result of the target device according to the fault root cause.
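The threshold check and best-match selection performed by the positioning module may be sketched as follows. As an assumption for illustration, each feature vector is stored as a keyword-to-weight mapping so that vectors built from different top-N keyword sets can be compared over shared keywords; the names `cosine` and `locate_fault`, the rule base layout, and the threshold value are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine of two sparse vectors stored as {keyword: weight} dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def locate_fault(target, rule_base, threshold):
    """Return (root_cause, similarity) of the best rule above the
    threshold, or None when no preset vector is similar enough."""
    best = None
    for rule in rule_base:
        sim = cosine(target, rule["vector"])
        if sim > threshold and (best is None or sim > best[1]):
            best = (rule["root_cause"], sim)
    return best

# Hypothetical rule base with two preset feature vectors.
rule_base = [
    {"vector": {"fan": 0.8, "speed": 0.6}, "root_cause": "fan failure"},
    {"vector": {"disk": 0.9, "error": 0.4}, "root_cause": "disk failure"},
]
result = locate_fault({"fan": 0.6, "speed": 0.8}, rule_base, threshold=0.5)
```

Returning `None` when nothing exceeds the threshold leaves room for the caller to report that no known fault matches, a case the text leaves open.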
In this embodiment, after the log data to be detected of the target device is acquired, the vocabulary weights are calculated according to the occurrence frequency of each word in the log data to be detected, and the top N words ranked by vocabulary weight are used as the target keywords. A target feature vector is generated from the target keywords for feature comparison, and a fault positioning result is generated according to the fault root cause corresponding to the preset feature vector in the rule base that has the highest similarity to the target feature vector. Because the target keywords used to construct the target feature vector are determined from the occurrence frequency of the words in the log data to be detected, errors caused by manually defining keywords can be avoided, and the accuracy of device fault positioning can therefore be improved.
Further, the keyword determination module 100 includes:
the preprocessing unit is used for executing preprocessing operation on the log data to be detected; the preprocessing operation comprises a format unifying operation, a vocabulary deleting operation and a vocabulary converting operation;
and the weight calculation unit is used for calculating the weight of each vocabulary according to the occurrence frequency of each vocabulary in the preprocessed log data to be detected.
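The three preprocessing operations may be sketched as follows. This part of the text does not spell out what each operation does, so the sketch assumes that format unification means lowercasing and stripping punctuation, vocabulary deletion means removing stop words, and vocabulary conversion means mapping abbreviations to a canonical word. The stop-word list, the conversion table, and the function name `preprocess` are hypothetical.

```python
import re

STOP_WORDS = {"the", "a", "is", "of", "at"}            # hypothetical list
CONVERSIONS = {"err": "error", "temp": "temperature"}  # hypothetical mapping

def preprocess(line):
    """Return the cleaned token list for one raw log line."""
    # Format unification: lowercase, then replace every run of
    # characters other than letters and digits with a single space.
    unified = re.sub(r"[^a-z0-9]+", " ", line.lower())
    tokens = unified.split()
    # Vocabulary deletion: drop stop words.
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # Vocabulary conversion: map abbreviations to canonical words.
    return [CONVERSIONS.get(t, t) for t in tokens]

tokens = preprocess("The CPU Temp is HIGH; ERR at slot-3")
```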
Further, the process of calculating the vocabulary weight by the keyword determining module 100 according to the occurrence frequency of each vocabulary in the to-be-detected log data includes: and calculating the vocabulary weight of each vocabulary in the log data to be detected based on a TF-IDF algorithm.
Further, the system also includes:
the rule base building module is used for acquiring the fault log data and the fault root cause corresponding to each piece of fault log data; it is also used for generating a corpus comprising all the fault log data, calculating the vocabulary weight of each word in the corpus in the corresponding fault log data through a TF-IDF algorithm, and generating a bag-of-words model; the bag-of-words model is a two-dimensional table, each row of the bag-of-words model represents a word in the corpus, each column of the bag-of-words model represents one piece of fault log data, and the elements of the bag-of-words model are the vocabulary weights of the words in the fault log data; it is also used for taking the top N words ranked by vocabulary weight in each piece of fault log data as sample words according to the bag-of-words model, and generating an N-dimensional preset feature vector corresponding to the fault log data according to the vocabulary weights of all the sample words; and it is also used for storing the corresponding relation between the preset feature vector and the fault root cause of the same fault log data to the rule base.
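The final steps of rule base construction may be sketched as follows, assuming the per-log vocabulary weights (the columns of the bag-of-words model) have already been computed. The function name `build_rule_base`, the entry layout, and the sample data are hypothetical.

```python
def build_rule_base(weighted_logs, root_causes, n):
    """Pair each fault log's top-n weighted words with its root cause.

    weighted_logs: list of {word: weight} dicts, one per fault log.
    root_causes:   list of root cause labels, parallel to weighted_logs.
    """
    if len(weighted_logs) != len(root_causes):
        raise ValueError("each fault log needs exactly one root cause")
    rule_base = []
    for weights, cause in zip(weighted_logs, root_causes):
        # Keep the n highest-weighted sample words of this fault log;
        # ties are broken alphabetically (an assumption).
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        rule_base.append({"vector": dict(ranked), "root_cause": cause})
    return rule_base

# Hypothetical weights for two fault logs.
weighted_logs = [
    {"fan": 0.5, "speed": 0.3, "warning": 0.1},
    {"disk": 0.6, "error": 0.4, "read": 0.2},
]
rules = build_rule_base(weighted_logs, ["fan failure", "disk failure"], n=2)
```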
Further, the fault location result comprises a fault location, a fault category and a fault occurrence time.
Further, the system also includes:
and the similarity calculation module is used for calculating cosine values of the target characteristic vector and preset characteristic vectors in the rule base before judging whether the preset characteristic vectors with the similarity to the target characteristic vectors larger than a preset value exist in the rule base, and taking the cosine values as the similarity of the target characteristic vectors and the preset characteristic vectors.
Further, the system also includes:
the interaction module is used for sending the fault positioning result of the target device to a man-machine interaction interface after the fault positioning result is generated according to the fault root cause, so that a user can input evaluation information; it is also used for receiving, when the evaluation information indicates a diagnosis error, an actual fault root cause input by the user and updating the rule base with the actual fault root cause and the target feature vector.
Since the embodiment of the system part corresponds to the embodiment of the method part, the embodiment of the system part is described with reference to the embodiment of the method part, and is not repeated here.
The present application also provides a storage medium having a computer program stored thereon which, when executed, may implement the steps provided by the above-described embodiments. The storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The application further provides an electronic device, which may include a memory and a processor, where the memory stores a computer program, and the processor may implement the steps provided by the foregoing embodiments when calling the computer program in the memory. Of course, the electronic device may also include various network interfaces, power supplies, and the like.
The embodiments are described in a progressive manner in the specification, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description. It should be noted that, for those skilled in the art, it is possible to make several improvements and modifications to the present application without departing from the principle of the present application, and such improvements and modifications also fall within the scope of the claims of the present application.
It is further noted that, in the present specification, relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.

Claims (10)

1. A method of fault location, comprising:
acquiring to-be-detected log data of a target device, calculating vocabulary weights according to the occurrence frequency of each word in the to-be-detected log data, and setting the top N words ranked by vocabulary weight as target keywords;
generating an N-dimensional target feature vector according to the vocabulary weight of all the target keywords, and comparing the similarity of the target feature vector with a preset feature vector in a rule base; the rule base comprises a corresponding relation between a preset feature vector and a fault root factor, and the preset feature vector is a feature vector corresponding to fault log data;
judging whether a preset feature vector with the similarity degree with the target feature vector larger than a preset value exists in the rule base;
if so, determining the fault root cause corresponding to the preset feature vector with the highest similarity to the target feature vector in the rule base, and generating a fault positioning result of the target device according to the fault root cause.
2. The fault location method according to claim 1, wherein calculating the vocabulary weight according to the occurrence frequency of each vocabulary in the log data to be detected comprises:
performing preprocessing operation on the log data to be detected; the preprocessing operation comprises a format unifying operation, a vocabulary deleting operation and a vocabulary converting operation;
and calculating the weight of the vocabulary according to the occurrence frequency of each vocabulary in the preprocessed log data to be detected.
3. The fault location method according to claim 1, wherein calculating the vocabulary weight according to the occurrence frequency of each vocabulary in the log data to be detected comprises:
and calculating the vocabulary weight of each vocabulary in the log data to be detected based on a TF-IDF algorithm.
4. The fault location method according to claim 1, wherein the construction process of the rule base comprises:
acquiring the fault log data and a fault root factor corresponding to each fault log data;
generating a corpus comprising all the fault log data, calculating the vocabulary weight of each word in the corpus in the corresponding fault log data through a TF-IDF algorithm, and generating a bag-of-words model; the bag-of-words model is a two-dimensional table, each row of the bag-of-words model represents a word in the corpus, each column of the bag-of-words model represents one piece of fault log data, and the elements of the bag-of-words model are the vocabulary weights of the words in the fault log data;
taking the top N words ranked by vocabulary weight in each piece of fault log data as sample words according to the bag-of-words model, and generating an N-dimensional preset feature vector corresponding to the fault log data according to the vocabulary weights of all the sample words;
and storing the corresponding relation between the preset feature vector of the same fault log data and the fault root factor into the rule base.
5. The fault location method according to claim 1, wherein the fault location result comprises a fault location, a fault category and a fault occurrence time.
6. The method according to claim 1, wherein before determining whether a preset feature vector whose similarity to the target feature vector is greater than a preset value exists in the rule base, the method further comprises:
and calculating cosine values of the target characteristic vector and preset characteristic vectors in the rule base, and taking the cosine values as the similarity of the target characteristic vector and the preset characteristic vectors.
7. The method according to any one of claims 1 to 6, further comprising, after generating the fault location result of the target device according to the fault root cause:
sending the fault positioning result to a man-machine interaction interface so that a user can input evaluation information conveniently;
and when the evaluation information is a diagnosis error, receiving an actual fault root factor input by a user, and updating the rule base by using the actual fault root factor and the target characteristic vector.
8. A fault location system, comprising:
the keyword determining module is used for acquiring to-be-detected log data of a target device, calculating vocabulary weights according to the occurrence frequency of each word in the to-be-detected log data, and setting the top N words ranked by vocabulary weight as target keywords;
the characteristic comparison module is used for generating an N-dimensional target characteristic vector according to the vocabulary weight of all the target keywords and comparing the similarity of the target characteristic vector with a preset characteristic vector in a rule base; the rule base comprises a corresponding relation between a preset feature vector and a fault root factor, and the preset feature vector is a feature vector corresponding to fault log data;
the positioning module is used for judging whether a preset feature vector whose similarity to the target feature vector is greater than a preset value exists in the rule base; if so, determining the fault root cause corresponding to the preset feature vector with the highest similarity to the target feature vector in the rule base, and generating a fault positioning result of the target device according to the fault root cause.
9. An electronic device, comprising a memory in which a computer program is stored and a processor which, when calling the computer program in the memory, implements the steps of the fault location method according to any one of claims 1 to 7.
10. A storage medium having stored thereon computer-executable instructions which, when loaded and executed by a processor, carry out the steps of a fault location method as claimed in any one of claims 1 to 7.
CN202011224701.5A 2020-11-05 2020-11-05 Fault positioning method, system, electronic equipment and storage medium Pending CN112433874A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011224701.5A CN112433874A (en) 2020-11-05 2020-11-05 Fault positioning method, system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011224701.5A CN112433874A (en) 2020-11-05 2020-11-05 Fault positioning method, system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112433874A true CN112433874A (en) 2021-03-02

Family

ID=74695537

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011224701.5A Pending CN112433874A (en) 2020-11-05 2020-11-05 Fault positioning method, system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112433874A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360311A (en) * 2021-06-04 2021-09-07 中国工商银行股份有限公司 Method, device, equipment and storage medium for extracting key data in log
CN115329774A (en) * 2022-10-14 2022-11-11 中国建筑科学研究院有限公司 Intelligent building fault diagnosis rule generation method and device based on semantic matching
CN115576735A (en) * 2022-12-06 2023-01-06 苏州浪潮智能科技有限公司 Fault positioning method and device and computer readable storage medium
CN117336852A (en) * 2023-12-01 2024-01-02 广州斯沃德科技有限公司 Distributed co-location method, device, electronic equipment and storage medium
CN117336852B (en) * 2023-12-01 2024-04-02 广州斯沃德科技有限公司 Distributed co-location method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US11562304B2 (en) Preventative diagnosis prediction and solution determination of future event using internet of things and artificial intelligence
CN112433874A (en) Fault positioning method, system, electronic equipment and storage medium
EP3717984B1 (en) Method and apparatus for providing personalized self-help experience
CN110580308B (en) Information auditing method and device, electronic equipment and storage medium
EP3916584A1 (en) Information processing method and apparatus, electronic device and storage medium
CN112445912B (en) Fault log classification method, system, device and medium
US20220019739A1 (en) Item Recall Method and System, Electronic Device and Readable Storage Medium
CN114528845A (en) Abnormal log analysis method and device and electronic equipment
CN113986864A (en) Log data processing method and device, electronic equipment and storage medium
CN112631889B (en) Portrayal method, device, equipment and readable storage medium for application system
An et al. Real-time Statistical Log Anomaly Detection with Continuous AIOps Learning.
CN111950623B (en) Data stability monitoring method, device, computer equipment and medium
CN116402630B (en) Financial risk prediction method and system based on characterization learning
CN116225848A (en) Log monitoring method, device, equipment and medium
CN115495587A (en) Alarm analysis method and device based on knowledge graph
US11822578B2 (en) Matching machine generated data entries to pattern clusters
CN117501275A (en) Method, computer program product and computer system for analyzing data consisting of a large number of individual messages
KR20230059364A (en) Public opinion poll system using language model and method thereof
CN112926297A (en) Method, apparatus, device and storage medium for processing information
CN114547231A (en) Data tracing method and system
CN114462364B (en) Method and device for inputting information
US12007829B2 (en) Extended dynamic intelligent log analysis tool
US20240143430A1 (en) Extended dynamic intelligent log analysis tool
CN116719919A (en) Text processing method and device
CN117648214A (en) Exception log processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination