CN111178537A - Feature extraction model training method and device

Feature extraction model training method and device

Info

Publication number
CN111178537A
Authority
CN
China
Prior art keywords
feature vector
log
predicted
sample
root cause
Prior art date
Legal status
Granted
Application number
CN201911252002.9A
Other languages
Chinese (zh)
Other versions
CN111178537B (en)
Inventor
覃元元
Current Assignee
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN201911252002.9A
Publication of CN111178537A
Application granted
Publication of CN111178537B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 - Error detection; Error correction; Monitoring
    • G06F 11/36 - Preventing errors by testing or debugging software
    • G06F 11/3668 - Software testing
    • G06F 11/3672 - Test management
    • G06F 11/3692 - Test management for test results analysis
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the application provides a feature extraction model training method and device. The method comprises the following steps: when a computing device trains the feature extraction model, it inputs the log text and the actual problem root cause of a sample to be trained into the feature extraction model to obtain a first log feature vector and a first root cause feature vector, respectively; it then classifies the first log feature vector to obtain a predicted problem category, matches the first log feature vector against the first root cause feature vector to obtain a matching result, and finally adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result. In this scheme, the computing device trains the feature extraction model using both the problem root cause and the problem category information of historical samples, so the method improves the accuracy of the feature extraction model.

Description

Feature extraction model training method and device
Technical Field
The application relates to the technical field of machine learning, in particular to a feature extraction model training method and device.
Background
In processes such as daily software design, software development, and software testing, a computing device continuously generates logs. If an anomaly occurs while the computing device runs a system, the device generates exception information and stores it in a log. This exception information may also be referred to as log text.
The exception information recorded by a computing device is typically only a surface symptom of the current software and/or hardware, not the root cause of the anomaly. Therefore, while running software, and especially when testing large-scale software, if the computing device can automatically delimit and locate the root cause of an anomaly from the log text and provide a solution, the efficiency with which staff analyze and resolve anomalies can be greatly improved.
Existing methods for automatically locating log anomalies are generally implemented with a mathematical model. Maintaining the mathematical model comprises the following steps:
1. Accumulate a large number of historical data samples, where each historical data sample can be represented as a quadruple (log text, problem category, problem root cause, solution); a sketch of this quadruple follows the list below.
2. Train a feature extraction model based on the historical data samples. The feature extraction model can be implemented with a machine learning algorithm such as a neural network, or with manually constructed text features.
3. Perform problem root cause prediction with the feature extraction model. The computing device uses the feature extraction model to extract a log feature vector from the log text to be predicted, classifies the log feature vector to obtain a classification label of the problem category, and determines the problem root cause and solution based on that label.
4. Collect manual feedback and iterate the positioning model. After the feature extraction model goes online for prediction, the computing device can continuously optimize and iterate the positioning module according to subsequently fed-back data samples, so as to improve the accuracy of the obtained problem categories.
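As an illustration of the quadruple above, a historical data sample might be represented as follows (a minimal sketch; the class name, field names, and example values are assumptions for exposition, not defined by the patent):

```python
from dataclasses import dataclass

@dataclass
class HistoricalSample:
    """One historical data sample: the quadruple
    (log text, problem category, problem root cause, solution)."""
    log_text: str    # exception text recorded in the log
    category: str    # problem category summarized by staff
    root_cause: str  # textual description of the problem root cause
    solution: str    # remedy associated with this root cause

# Hypothetical example record.
sample = HistoricalSample(
    log_text="ERROR: connection pool exhausted after 30s",
    category="resource-exhaustion",
    root_cause="connection leak in the retry path",
    solution="close connections in the retry handler",
)
```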
In practice, because actual operating conditions are complex and changeable, the possible root causes of abnormal operation are effectively unlimited, while the number of problem categories summarized by staff is limited; existing feature extraction models are therefore generally trained with problem categories as the classification labels. However, both the model training process and the prediction process that uses the model can then exploit only the problem category information in the historical data samples; the problem root cause information cannot be used effectively. As a result, the feature extraction model makes insufficient use of the root cause information in the historical data samples, the accuracy of the feature extraction model suffers, and the accuracy of the problem root cause ultimately computed from log feature vectors predicted by the model is poor.
Disclosure of Invention
The application provides a feature extraction model training method and device, which train a feature extraction model using both the problem root causes and the problem category information of historical data samples, so as to improve the accuracy of the feature extraction model.
In a first aspect, an embodiment of the present application provides a feature extraction model training method, where the method includes:
A computing device obtains a sample to be trained, where the sample comprises a log text, an actual problem category, and an actual problem root cause. The computing device inputs the log text into the feature extraction model to obtain a first log feature vector, and inputs the actual problem root cause into the feature extraction model to obtain a first root cause feature vector. The computing device then inputs the first log feature vector into a classifier model to obtain a predicted problem category, and matches the first log feature vector against the first root cause feature vector to obtain a matching result. Finally, the computing device adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result.
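The training loop of the first aspect can be sketched as follows. This is an illustrative sketch only: the toy extractor, the linear classifier, the cosine-similarity matcher, and the equal loss weights are all assumptions, not structures prescribed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyExtractor(nn.Module):
    """Stand-in feature extraction model: maps a pre-embedded text vector to
    a feature vector. A real model would encode the raw text itself."""
    def __init__(self, dim_in=64, dim_out=32):
        super().__init__()
        self.fc = nn.Linear(dim_in, dim_out)

    def forward(self, x):
        return self.fc(x)

def train_step(extractor, classifier, log_emb, root_emb, actual_cat, opt):
    v_log = extractor(log_emb)    # first log feature vector
    v_root = extractor(root_emb)  # first root cause feature vector
    logits = classifier(v_log)    # predicted problem category (as logits)
    # Matching result as cosine similarity; the ideal value is 1.
    match = F.cosine_similarity(v_log, v_root, dim=-1)
    l1 = F.cross_entropy(logits, actual_cat)  # first error (categories)
    l2 = (1.0 - match).mean()                 # second error (matching)
    loss = 0.5 * l1 + 0.5 * l2                # joint loss, here with equal weights
    opt.zero_grad()
    loss.backward()
    opt.step()
    return float(loss)

# Usage with random stand-in embeddings.
extractor, classifier = ToyExtractor(), nn.Linear(32, 10)
opt = torch.optim.Adam([*extractor.parameters(), *classifier.parameters()], lr=1e-3)
log_emb, root_emb = torch.randn(8, 64), torch.randn(8, 64)
actual_cat = torch.randint(0, 10, (8,))
train_step(extractor, classifier, log_emb, root_emb, actual_cat, opt)
```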
In the method, the computing device uses both the problem root cause and the problem category information of historical samples to train the feature extraction model. The method therefore improves the accuracy of the feature extraction model, and in turn improves the accuracy of the problem root cause computed from log feature vectors predicted with the feature extraction model.
In one possible design, the computing device may match the first log feature vector with the first root cause feature vector to obtain a matching result in the following manner:
The computing device inputs the first log feature vector into a first feature inference model to obtain a second root cause feature vector matched with the first log feature vector, where the first feature inference model is used to infer the root cause feature vector that matches a log feature vector. The computing device then calculates the similarity between the first root cause feature vector and the second root cause feature vector, and determines the matching result according to that similarity. Optionally, in this design, the matching result may be represented in various forms: for example, as the similarity between the first root cause feature vector and the second root cause feature vector, as a match/mismatch decision, or as a matching level.
Through the above design, the computing device can accurately obtain the matching result between the first log feature vector and the first root cause feature vector.
In one possible design, the computing device may match the first log feature vector with the first root cause feature vector to obtain a matching result in the following manner:
The computing device inputs the first root cause feature vector into a second feature inference model to obtain a second log feature vector matched with the first root cause feature vector, where the second feature inference model is used to infer the log feature vector that matches a root cause feature vector. The computing device then calculates the similarity between the first log feature vector and the second log feature vector, and determines the matching result according to that similarity. Optionally, in this design, the matching result may be represented in various forms: for example, as the similarity between the first log feature vector and the second log feature vector, as a match/mismatch decision, or as a matching level.
Through the above design, the computing device can accurately obtain the matching result between the first log feature vector and the first root cause feature vector.
In one possible design, the computing device may match the first log feature vector with the first root cause feature vector to obtain a matching result in the following manner:
The computing device inputs the first log feature vector and the first root cause feature vector into a fully connected neural network model to obtain a correlation coefficient between the two vectors, where the correlation coefficient represents the degree of correlation between the first log feature vector and the first root cause feature vector. The computing device then determines the matching result according to the correlation coefficient. Optionally, in this design, the matching result may be represented in various forms: for example, as the correlation coefficient itself, as a match/mismatch decision, or as a matching level.
Through the above design, the computing device can accurately obtain the matching result between the first log feature vector and the first root cause feature vector.
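The three matching designs above can be sketched side by side as follows. This is a hedged illustration: cosine similarity and the callables infer_root, infer_log, and fc_net (standing for the first feature inference model, the second feature inference model, and the fully connected neural network model) are assumed forms, not specified by the patent.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def match_design_1(v_log, v_root, infer_root):
    """Infer a second root cause vector from the log vector, then compare."""
    return cosine(v_root, infer_root(v_log))

def match_design_2(v_log, v_root, infer_log):
    """Infer a second log vector from the root cause vector, then compare."""
    return cosine(v_log, infer_log(v_root))

def match_design_3(v_log, v_root, fc_net):
    """Feed both vectors to a fully connected network that outputs a
    correlation coefficient directly."""
    return fc_net(np.concatenate([v_log, v_root]))
```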
In one possible design, the computing device adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result as follows:
The computing device calculates a first error according to the predicted problem category and the actual problem category, and calculates a second error according to the matching result. The processor may then, for example, perform a weighted summation of the first error and the second error to obtain a composite error (which may also be referred to as a joint loss) and adjust the parameters of the feature extraction model based on the composite error; for example, the composite error satisfies the formula L = αL1 + βL2.
Since the first error and the second error both indicate the error condition, or accuracy, of the feature extraction model, adjusting the parameters of the feature extraction model according to both errors improves the accuracy of the adjusted model.
In one possible design, the computing device calculates a first error based on the predicted problem category and the actual problem category, and adjusts the parameters of the classifier model based on the first error. The computing device may train the feature extraction model and the classifier model simultaneously; since the predicted problem category is produced by the classifier model, the difference between the predicted problem category and the actual problem category reflects the error condition or accuracy of the classifier model. In this design, the computing device can therefore adjust the parameters of the classifier model using the predicted problem category and the actual problem category, so as to improve the accuracy of the classifier model.
In one possible design, when the computing device uses any one of the first feature inference model, the second feature inference model, or the fully connected neural network model to match the first log feature vector with the first root cause feature vector, the computing device may also train that mathematical model. Since the matching result of the first log feature vector and the first root cause feature vector is obtained from the output of the mathematical model, the degree of difference between the matching result and the ideal matching result also represents the error condition or accuracy of the mathematical model. In this design, the computing device can therefore adjust the parameters of the mathematical model according to the calculated matching result, so as to improve the accuracy of the mathematical model. In one embodiment, the processor calculates a second error based on the matching result and the ideal matching result, and adjusts the parameters of the mathematical model based on the second error.
In a second aspect, an embodiment of the present application provides a problem root cause prediction method, where the feature extraction model used in the method may be obtained by training according to the scheme of the first aspect. The method comprises the following steps:
The computing device inputs the log text to be predicted into the feature extraction model to obtain a log feature vector to be predicted. The computing device then acquires a plurality of stored historical feature vector samples, where each historical feature vector sample comprises a log feature vector sample and a root cause feature vector sample. The computing device calculates a correlation coefficient between the log feature vector to be predicted and each historical feature vector sample, according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in that historical feature vector sample; this correlation coefficient represents the degree of correlation between the log feature vector to be predicted and the historical feature vector sample. The computing device determines, among the plurality of historical feature vector samples, the target historical feature vector sample with the highest correlation coefficient with respect to the log feature vector to be predicted, and determines the target problem root cause corresponding to the target historical feature vector sample as the problem root cause prediction result for the log text to be predicted.
Since the feature extraction model is obtained with the method of the embodiment shown in fig. 3, the log feature vector to be predicted that it produces is highly accurate. Furthermore, because the computing device calculates the correlation coefficient from the log feature vector to be predicted together with both the log feature vector sample and the root cause feature vector sample of each historical feature vector sample, the method also improves the accuracy of the correlation coefficients, and finally the accuracy of the problem root cause prediction result for the log text to be predicted. In summary, the computing device can effectively use both the problem root cause and the problem category information of the historical sample data during problem root cause prediction, thereby improving the accuracy of the predicted result.
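A sketch of this prediction flow follows (illustrative only; the correlation helper stands for the per-sample correlation coefficient described in the designs below, and the tuple layout of the stored history is an assumption):

```python
def predict_root_cause(extractor, correlation, log_text, history):
    """history: list of (log_vec_sample, root_vec_sample, root_cause) tuples,
    one per stored historical feature vector sample."""
    v_pred = extractor(log_text)  # log feature vector to be predicted
    # Correlation coefficient with each historical feature vector sample.
    scores = [correlation(v_pred, v_log_s, v_root_s)
              for v_log_s, v_root_s, _ in history]
    best = max(range(len(history)), key=lambda i: scores[i])
    # The root cause of the best-matching sample is the prediction result.
    return history[best][2]
```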
In one possible design, the log feature vector sample and the root cause feature vector sample contained in each of the plurality of historical feature vector samples are obtained with the feature extraction model. In this way, the accuracy of the subsequently calculated correlation coefficients between the log feature vector to be predicted and each historical feature vector sample can be ensured.
In one possible design, the computing device calculates the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample, according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in that historical feature vector sample, as follows:
the computing equipment determines log feature vector samples and root cause feature vector samples contained in an ith historical feature vector sample, wherein the ith historical feature vector sample is any one of the plurality of historical feature vector samples;
the computing equipment computes a first correlation coefficient according to the log feature vector to be predicted and the log feature vector sample, wherein the first correlation coefficient is the similarity between the log feature vector to be predicted and the log feature vector sample;
the computing device calculates a second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, where the second correlation coefficient represents the degree of correlation between the log feature vector to be predicted and the root cause feature vector sample;
the computing device takes the weighted sum of the first correlation coefficient and the second correlation coefficient as the correlation coefficient of the log feature vector to be predicted and the ith historical feature vector sample.
Through the design, the computing device can accurately compute the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample.
In one possible design, the computing device may calculate the second correlation coefficient according to the to-be-predicted log feature vector and the root-cause feature vector sample by any one of the following manners:
the first method is as follows: the computing equipment inputs the characteristic vector of the log to be predicted into a first characteristic reasoning model to obtain a reasoning root factor characteristic vector matched with the characteristic vector of the log to be predicted; calculating the similarity between the reasoning root factor characteristic vector and the root factor characteristic vector sample, and taking the similarity as the second correlation coefficient; the first feature reasoning model is used for reasoning and obtaining root factor feature vectors matched with the log feature vectors.
The second method comprises the following steps: the computing equipment inputs the root cause feature vector sample into a second feature reasoning model to obtain a reasoning log feature vector matched with the root cause feature vector sample; calculating the similarity between the inference log feature vector and the to-be-predicted log feature vector, and taking the similarity as the second correlation coefficient; and the second feature reasoning model is used for reasoning to obtain the log feature vector matched with the root factor feature vector.
The third method comprises the following steps: and the computing equipment inputs the characteristic vector of the log to be predicted and the root factor characteristic vector sample into a full-connection neural network model to obtain the second correlation coefficient.
Through this design, the computing device can accurately obtain the second correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample in any one of the above modes.
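Putting the weighted combination and one of the modes together gives a sketch like the following (the cosine metric, the weights w1 and w2, and the use of mode 1 for the second coefficient are assumptions):

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def correlation(v_pred, v_log_sample, v_root_sample, infer_root,
                w1=0.5, w2=0.5):
    # First correlation coefficient: similarity between the log feature
    # vector to be predicted and the log feature vector sample.
    c1 = cosine(v_pred, v_log_sample)
    # Second correlation coefficient, mode 1: infer a root cause vector from
    # the vector to be predicted and compare it with the root cause sample.
    c2 = cosine(infer_root(v_pred), v_root_sample)
    # Weighted sum as the correlation with this historical sample.
    return w1 * c1 + w2 * c2
```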
In one possible design, the computing device may further determine a target problem category corresponding to the target historical feature vector sample as a problem category prediction result of the log text to be predicted; or determining a target solution corresponding to the target historical feature vector sample as a solution prediction result of the log text to be predicted.
Through this design, the computing device can also obtain a problem category prediction result and a solution prediction result, making the prediction results more informative.
In a third aspect, an embodiment of the present application provides a computing device, including means for performing the steps in the first or second aspect above.
In a fourth aspect, embodiments of the present application provide a computing device comprising at least one processing element and at least one memory element, wherein the at least one memory element is configured to store programs and data, and the at least one processing element is configured to perform the method provided in the first aspect or the second aspect of the present application.
In a fifth aspect, the present application further provides a computer program, which when run on a computer, causes the computer to perform the method provided in any one of the above aspects.
In a sixth aspect, the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a computer, the computer is caused to execute the method provided in any one of the above aspects.
In a seventh aspect, an embodiment of the present application further provides a chip, where the chip is configured to read a computer program stored in a memory, and execute the method provided in any one of the above aspects.
In an eighth aspect, an embodiment of the present application further provides a chip system, where the chip system includes a processor, and is used to support a computer device to implement the method provided in any one of the above aspects. In one possible design, the system-on-chip further includes a memory for storing programs and data necessary for the computer device. The chip system may be formed by a chip, and may also include a chip and other discrete devices.
Drawings
FIG. 1A is a schematic diagram of a conventional feature extraction model training process;
FIG. 1B is a schematic diagram of a conventional problem root cause prediction process based on a feature extraction model;
FIG. 2 is a block diagram of a computing device according to an embodiment of the present application;
FIG. 3 is a flowchart of a feature extraction model training method according to an embodiment of the present application;
FIG. 4A is a schematic flowchart of an example of training a feature extraction model according to an embodiment of the present application;
FIG. 4B is a schematic diagram of the calculation flow of a matching module according to an embodiment of the present application;
FIG. 4C is a schematic diagram of the calculation flow of another matching module according to an embodiment of the present application;
FIG. 4D is a schematic diagram of the calculation flow of yet another matching module according to an embodiment of the present application;
FIG. 5 is a flowchart of a problem root cause prediction method according to an embodiment of the present application;
FIG. 6A is a flowchart of an example of problem root cause prediction according to an embodiment of the present application;
FIG. 6B is a schematic diagram of the calculation flow of a correlation coefficient calculation module according to an embodiment of the present application;
FIG. 6C is a schematic diagram of the calculation flow of a second correlation coefficient calculation module according to an embodiment of the present application;
FIG. 6D is a schematic diagram of the calculation flow of another second correlation coefficient calculation module according to an embodiment of the present application;
FIG. 6E is a schematic diagram of the calculation flow of yet another second correlation coefficient calculation module according to an embodiment of the present application;
FIG. 7 is a block diagram of a computing device according to an embodiment of the present application;
FIG. 8 is a block diagram of another computing device according to an embodiment of the present application;
FIG. 9 is a block diagram of yet another computing device according to an embodiment of the present application.
Detailed Description
The application provides a feature extraction model training method and device, which train a feature extraction model using the problem root causes and problem category information of historical samples so as to improve the accuracy of the feature extraction model. The method and the device are based on the same technical conception; because they solve the problem on similar principles, the implementations of the device and the method may refer to each other, and repeated parts are not described again.
In the scheme provided by the embodiment of the application, when training a feature extraction model, a computing device inputs the log text and the actual problem root cause of a sample to be trained into the feature extraction model to obtain a first log feature vector and a first root cause feature vector, classifies the first log feature vector to obtain a predicted problem category, matches the first log feature vector against the first root cause feature vector to obtain a matching result, and finally adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result. In this scheme, the computing device trains the feature extraction model using both the problem root cause and the problem category information of historical samples; the method therefore improves the accuracy of the feature extraction model, and in turn the accuracy of the problem root cause computed from log feature vectors predicted with the model.
Hereinafter, some terms in the present application are explained to facilitate understanding by those skilled in the art.
1) Computing device: a device with computing capability that can run software; for example, a computer, a server, or another physical device. If an anomaly occurs while the computing device runs software, the corresponding log text is stored in a log.
2) Feature extraction model: used to extract log feature vectors from log text and root cause feature vectors from problem root cause text. Optionally, the feature extraction model may be implemented with a machine learning algorithm such as a neural network, or with manually constructed text features.
3) First feature inference model: used to infer the root cause feature vector that matches a log feature vector. Optionally, the first feature inference model may be implemented with a machine learning algorithm such as a neural network.
4) Second feature inference model: used to infer the log feature vector that matches a root cause feature vector. Optionally, the second feature inference model may be implemented with a machine learning algorithm such as a neural network.
5) Fully connected neural network model: used to calculate a correlation coefficient between a log feature vector and a root cause feature vector, and implemented with a fully connected neural network algorithm.
6) Plural means two or more.
7) At least one, means one or more.
8) "and/or" describe the association relationship of the associated objects, indicating that there may be three relationships, e.g., a and/or B, which may indicate: a exists alone, A and B exist simultaneously, and B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship.
In addition, it is to be understood that the terms first, second, etc. in the description of the present application are used for distinguishing between the descriptions and not necessarily for describing a sequential or chronological order.
The training process of the conventional feature extraction model and the problem root prediction process are described in detail below with reference to fig. 1A and 1B.
Referring to fig. 1A, the training process of the feature extraction model is as follows: the computing device inputs the log text of the sample to be trained into the feature extraction model to obtain a log feature vector to be trained; the computing device inputs the log feature vector to be trained into a classifier model to obtain a problem category; the computing device then passes the problem category output by the classifier model and the actual problem category of the training sample through a loss function to obtain an error; finally, the computing device adjusts the feature extraction model according to the calculated error. Optionally, if the classifier model has not finished training, the computing device may also adjust the classifier model according to the calculated error.
Illustratively, a plurality of historical log feature vector samples are stored in the computing device, and each historical log feature vector sample corresponds to one historical data sample, where each historical data sample comprises the log text, the actual problem category, the problem root cause, the solution, and so on. The classifier model can determine the problem category corresponding to a log feature vector by calculating vector similarity. The specific process is as follows: the classifier model reads the plurality of historical log feature vector samples, calculates the similarity between the log feature vector to be trained and each historical log feature vector sample, selects the historical log feature vector sample with the highest similarity to the log feature vector to be trained, and finally outputs the problem category contained in the historical data sample corresponding to that historical log feature vector sample.
Referring to fig. 1B, the problem root cause prediction process is as follows: the computing device obtains the log text to be predicted and inputs it into the feature extraction model to obtain a log feature vector to be predicted; the computing device then inputs the log feature vector to be predicted into the classifier model to obtain a predicted problem category, problem root cause, and solution. As before, the classifier model calculates the similarity between the log feature vector to be predicted and the historical log feature vector samples, determines the historical log feature vector sample most similar to the log feature vector to be predicted, and finally outputs the problem category, problem root cause, and solution of the historical data sample corresponding to that historical log feature vector sample.
In practice, a large number of problem root causes may belong to the same problem category. As the description above shows, only the problem category information in the historical data samples is used during model training and during prediction with the model; the problem root cause information cannot be used effectively. The feature extraction model therefore makes insufficient use of the root cause information in the historical data samples, its accuracy suffers, and the accuracy of the problem root cause computed from the log feature vectors it predicts is ultimately poor.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
Fig. 2 shows a block diagram of a computing device capable of performing the methods provided herein. Referring to fig. 2, the computing device includes: a processor 210, a memory 220, a communication module 230, an input unit 240, a display unit 250, and the like. Those skilled in the art will appreciate that the configuration of the computing device shown in FIG. 2 does not constitute a limitation of the computing device, as the computing device provided herein may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
The various components of the computing device are described in detail below with reference to FIG. 2:
the communication module 230 may be connected to other devices through a wireless connection or a physical connection, so as to implement data transmission and reception of the computer apparatus. Optionally, the communication module 230 may include any one or a combination of a communication interface, a Radio Frequency (RF) circuit, a wireless fidelity (WiFi) module, a bluetooth module, and the like, which is not limited in this embodiment of the present application.
The memory 220 may be used to store program instructions and data. The processor 210 executes various functional applications of the computing device and data processing by executing program instructions stored in the memory 220. Among the program instructions are program instructions that can cause the processor 210 to execute a feature extraction model training method and a problem root prediction method provided in the following embodiments of the present application.
Alternatively, the memory 220 may mainly include a program storage area and a data storage area. The program storage area can store an operating system, various application programs, program instructions, and the like; the data storage area may store a plurality of data models, including the feature extraction model and the classifier model. In addition, the memory 220 may include high-speed random access memory, and may further include non-volatile memory such as a magnetic disk storage device, a flash memory device, or universal flash storage (UFS).
The input unit 240 may be used to receive information such as data or operation instructions input by a user. Optionally, the input unit 240 may include input devices such as a touch panel, function keys, a physical keyboard, a mouse, a camera, and a monitor.
The display unit 250 may implement human-computer interaction for displaying contents such as information input by a user, information provided to the user, and the like through a user interface. The display unit 250 may include a display panel 251. Alternatively, the display panel 251 may be configured in the form of a Liquid Crystal Display (LCD), an organic light-emitting diode (OLED), or the like.
Further, when the input unit 240 includes a touch panel, the touch panel may cover the display panel 251, and when the touch panel detects a touch event on or near the touch panel, the touch panel transmits the touch event to the processor 210 to determine the type of the touch event so as to perform a corresponding operation.
The processor 210 is the control center of the computing device, and is connected to the above components by various interfaces and lines. The processor 210 may implement the feature extraction model training method and the problem root cause prediction method provided by the embodiment of the present application by executing the program instructions stored in the memory 220 and calling the data stored in the memory 220 to complete various functions of software operation, calculation, and the like of the computing device.
Optionally, the processor 210 may include one or more processing units. The processing unit can process the data of the input data model and output a calculation result. When the processor 210 of the computing device implements the feature extraction model training method and the problem root prediction method, the processing unit reads sample data and a mathematical model from the storage data area of the memory 220, and then computes the sample data based on the mathematical model, thereby implementing the method.
In addition, a memory with a buffering function may be further disposed in the processor 210 for storing instructions and data. In some embodiments, the memory in the processor 210 is a cache memory. The memory may hold instructions or data that have just been used or recycled by processor 210. If the processor 210 needs to reuse the instruction or data, it can be called directly from the internal memory. In this way, the processor 210 can be prevented from repeatedly accessing the memory 220, and the waiting time of the processor 210 can be reduced, thereby improving the working efficiency of the processor 210.
It is to be understood that the configuration of the computing device shown in fig. 2 is not to be construed as a limitation of the computing device, and that computing devices capable of performing the methods provided by embodiments of the present application may include more or fewer components than those shown. For example, the computing device may further include a camera, a sensor, an audio collector, and other components, which are not described herein again.
In order to solve the problem that the accuracy of a trained feature extraction model is poor because problem root cause information cannot be used effectively in the conventional feature extraction model training process shown in fig. 1A, an embodiment of the present application provides a feature extraction model training method. The method may be applied to the computing device shown in fig. 2 and executed by a processor in the computing device. The method is described in detail below with reference to fig. 3.
S301: A processor in the computing device obtains a sample to be trained from a memory, where the sample to be trained comprises a log text, an actual problem category, and an actual problem root cause.
The memory stores a plurality of historical data samples, and each historical data sample comprises information such as a log text, an actual problem category, and an actual problem root cause. When training the feature extraction model, the computing device may select any one of the plurality of historical data samples as the sample to be trained.
S302: The processor inputs the log text into the feature extraction model to obtain a first log feature vector, and inputs the actual problem root cause into the feature extraction model to obtain a first root cause feature vector.
In the embodiment of the application, because the log text and the actual problem root cause are both text information, the feature extraction model can extract features from both of them.
S303: The processor inputs the first log feature vector into a classifier model to obtain a predicted problem category.
The classifier model may be the conventional classifier model shown in fig. 1A.
In the process of S303, the classifier model may obtain the predicted problem category by:
the processor reads a plurality of historical log feature vector samples corresponding to the plurality of historical data samples from the memory and inputs the historical log feature vector samples into the classifier model;
The classifier model calculates the similarity between the first log feature vector and each historical log feature vector sample, selects the target historical log feature vector sample with the highest similarity to the first log feature vector, determines the target historical data sample corresponding to the target historical log feature vector sample, and finally outputs the problem category contained in the target historical data sample. The actual problem category contained in the target historical data sample, as output by the classifier model, is the predicted problem category obtained for the first log feature vector.
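An illustrative nearest-neighbour sketch of this classifier behaviour (the cosine metric and flat list storage are assumptions; a real classifier model could take other forms):

```python
import numpy as np

def classify(v_log, history_vectors, history_categories):
    """Return the category of the stored historical log feature vector
    sample most similar to the first log feature vector v_log."""
    sims = [float(v_log @ h / (np.linalg.norm(v_log) * np.linalg.norm(h) + 1e-12))
            for h in history_vectors]
    target = int(np.argmax(sims))      # target historical sample index
    return history_categories[target]  # predicted problem category
```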
S304: The processor matches the first log feature vector with the first root cause feature vector to obtain a matching result.
Optionally, the processor may obtain the matching result through, but is not limited to, the following embodiments.
The first embodiment:
The processor inputs the first log feature vector into a first feature inference model to obtain a second root cause feature vector matched with the first log feature vector, where the first feature inference model is used to infer the root cause feature vector that matches a log feature vector. The processor calculates the similarity between the first root cause feature vector and the second root cause feature vector, and determines the matching result according to the obtained similarity.
Alternatively, the matching result may be represented in various forms.
For example, the matching result may be the similarity between the first root cause feature vector and the second root cause feature vector; that is, the processor may directly use the calculated similarity as the matching result.
As another example, the matching result may be a match or a mismatch. After the processor calculates the similarity between the first root cause feature vector and the second root cause feature vector, it compares the obtained similarity with a similarity threshold: when the similarity is greater than or equal to the threshold, the matching result is determined to be a match; when the similarity is smaller than the threshold, the matching result is determined to be a mismatch.
As another example, the matching result may be a matching level. The computing device may define a plurality of matching levels, each corresponding to a range of similarity values. After calculating the similarity between the first root cause feature vector and the second root cause feature vector, the processor determines which matching level the obtained similarity falls into and takes that matching level as the matching result.
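The three representation forms could be derived from a single similarity (or correlation) value as follows; the threshold and level boundaries are illustrative assumptions, not values from the patent:

```python
def to_match_result(score, form="similarity",
                    threshold=0.8, levels=(0.3, 0.6, 0.9)):
    """Convert a similarity/correlation score into one of the three
    matching-result forms described above."""
    if form == "similarity":  # form 1: the raw score itself
        return score
    if form == "binary":      # form 2: match / mismatch
        return score >= threshold
    if form == "level":       # form 3: index of the matching level
        return sum(score >= boundary for boundary in levels)
    raise ValueError(f"unknown form: {form}")
```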
The second embodiment:
The processor inputs the first root cause feature vector into a second feature inference model to obtain a second log feature vector matched with the first root cause feature vector, where the second feature inference model is used to infer the log feature vector that matches a root cause feature vector. The processor calculates the similarity between the first log feature vector and the second log feature vector, and determines the matching result according to the obtained similarity.
As in the first embodiment, the matching result in the second embodiment may also be represented in three forms, such as, but not limited to, the obtained similarity, a match/mismatch decision, or a matching level; refer to the description above, which is not repeated here.
Third embodiment:
the processor inputs the first log feature vector and the first root cause feature vector into a fully-connected neural network model to obtain a correlation coefficient between the first log feature vector and the first root cause feature vector; the correlation coefficient is used for representing the degree of correlation between the first log feature vector and the first root cause feature vector; and the processor determines the matching result according to the correlation coefficient.
Similar to the matching result in the first embodiment, in the third embodiment, the matching result can also be expressed by, but is not limited to, the following form:
for example, the matching result may be the obtained correlation coefficient.
As another example, the matching result may be a match or a mismatch. After the processor calculates the correlation coefficient, it compares the obtained value with a correlation coefficient threshold: when the correlation coefficient is greater than or equal to the threshold, the matching result is determined to be a match; when it is smaller than the threshold, the matching result is determined to be a mismatch.
As another example, the matching result may be a matching level. The computing device may define a plurality of matching levels, each corresponding to a range of correlation coefficient values. After calculating the correlation coefficient, the processor determines which matching level's value range the obtained coefficient falls into and takes that matching level as the matching result.
It should be noted that the first feature inference model, the second feature inference model, and the fully connected neural network model may each be implemented with a neural network or a similar machine learning algorithm.
S305: and the processor adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category and the matching result.
In one embodiment, the processor may perform S305 by the following three steps:
a1: The processor calculates a first error based on the predicted problem category and the actual problem category.
For example, the processor may input the predicted problem category and the actual problem category into a preset first loss function to obtain the first error.
In the ideal case where the feature extraction model is accurate, the predicted problem category should be the same as the actual problem category. If the two differ, the feature extraction model is inaccurate, and the degree of difference between them indicates the error condition or accuracy of the feature extraction model.
a2: The processor calculates a second error based on the matching result.
The log text and the actual problem root cause in a training sample are, by construction, a perfect match. Therefore, in the ideal case where the feature extraction model is accurate, the first log feature vector and the first root cause feature vector should also match perfectly; that is, the ideal matching result of the first log feature vector and the first root cause feature vector indicates a perfect match. For ease of calculation, in the embodiment of the present application the ideal matching result is expressed in the same form as the calculated matching result.
For example, when the matching result is expressed as a similarity or a correlation coefficient, the value of the ideal matching result is 1; when the matching result is expressed as a match or a mismatch, the ideal matching result is a match; and when the matching result is expressed as a matching level, the ideal matching result is the highest matching level. Therefore, if the calculated matching result differs from the ideal matching result, the feature extraction model is inaccurate, and the degree of difference between the matching result and the ideal matching result indicates the error condition or accuracy of the feature extraction model.
For example, the processor may input the calculated matching result and the ideal matching result into a preset second loss function to obtain the second error.
a3: The processor adjusts the parameters of the feature extraction model according to the first error and the second error.
As described above, the first error and the second error both indicate the error condition or accuracy of the feature extraction model; therefore, adjusting the parameters of the feature extraction model according to the first error and the second error improves the accuracy of the adjusted feature extraction model.
For example, the composite error satisfies the formula L = αL1 + βL2, where L is the composite error, L1 is the first error, L2 is the second error, α is the weight of the first error, β is the weight of the second error, and α and β are real numbers greater than 0.
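In code, steps a1 to a3 might be combined as follows (a minimal sketch; the cross-entropy first loss function, the squared-difference second loss function, and the default weights are assumptions):

```python
import torch
import torch.nn.functional as F

def adjustment_loss(category_logits, actual_category, match_result,
                    alpha=1.0, beta=1.0):
    """Steps a1-a3 combined into the composite error L = alpha*L1 + beta*L2."""
    # a1: first error from the predicted vs. actual problem category.
    l1 = F.cross_entropy(category_logits, actual_category)
    # a2: second error from the matching result; the ideal matching result
    # for a (log text, actual root cause) pair is taken to be 1.
    l2 = ((1.0 - torch.as_tensor(match_result)) ** 2).mean()
    # a3: weighted summation gives the composite error (joint loss).
    return alpha * l1 + beta * l2
```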
In addition, in the embodiment of the application, the computing device may train the feature extraction model and the classifier model at the same time. Since the predicted problem category is produced by the classifier model, the difference between the predicted problem category and the actual problem category represents the error condition or accuracy of the classifier model, and the computing device may therefore adjust the parameters of the classifier model using the predicted problem category and the actual problem category. In one embodiment, the processor calculates a first error based on the predicted problem category and the actual problem category, and adjusts the parameters of the classifier model based on the first error; for example, the processor may input the predicted problem category and the actual problem category into the first loss function to obtain the first error.
Furthermore, in the embodiment of the application, when the computing device performs S304 to calculate the matching result of the first log feature vector and the first root cause feature vector using any one of the first feature inference model, the second feature inference model, or the fully connected neural network model, the computing device may also train that mathematical model. Since the matching result is obtained from the output of the mathematical model, the degree of difference between the matching result and the ideal matching result also represents the error condition or accuracy of the mathematical model, and the computing device may therefore adjust the parameters of the mathematical model according to the calculated matching result. In one embodiment, the processor calculates a second error based on the matching result and the ideal matching result, and adjusts the parameters of the mathematical model based on the second error; for example, the processor may input the calculated matching result and the ideal matching result into the preset second loss function to obtain the second error.
The embodiment of the application provides a feature extraction model training method. In the method, when a computing device trains a feature extraction model, it inputs the log text and the actual problem root cause of a sample to be trained into the feature extraction model to obtain a first log feature vector and a first root cause feature vector, classifies the first log feature vector to obtain a predicted problem category, matches the first log feature vector against the first root cause feature vector to obtain a matching result, and finally adjusts the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result. In this scheme, the computing device trains the feature extraction model using both the problem root cause and the problem category information of historical samples; the method therefore improves the accuracy of the feature extraction model, and in turn the accuracy of the problem root cause computed from log feature vectors predicted with the model.
Based on the embodiment of the feature extraction model training method shown in fig. 3, the embodiment of the present application further provides a feature extraction model training example. The feature extraction model training process in this example is described below with reference to fig. 4A.
After the computing device obtains the sample to be trained, it inputs the log text and the actual problem root cause of the sample into the feature extraction model, obtaining a first log feature vector and a first root cause feature vector respectively.
The computing device then inputs the first log feature vector into the classifier model, which outputs a predicted problem category. For example, the classifier model uses the conventional similarity-based method: it calculates the similarity between the first log feature vector and each historical log feature vector sample, determines the historical log feature vector sample with the highest similarity to the first log feature vector, and outputs the actual problem category contained in the corresponding historical data sample as the predicted problem category.
The computing device matches the first log feature vector with the first root cause feature vector through the matching module to obtain a matching result.
The computing device substitutes the actual problem category of the sample to be trained and the predicted problem category output by the classifier model into the first loss function to obtain a first error L1.
The computing device substitutes the matching result and the ideal matching result into the second loss function to obtain a second error L2.
The computing device calculates a comprehensive error L based on the first error L1 and the second error L2; for example, L may conform to the formula L = αL1 + βL2, where α and β are preset weights.
Finally, the computing device adjusts the parameters of the feature extraction model according to the comprehensive error L.
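A minimal PyTorch sketch of this error combination, assuming cross-entropy as the first loss function, mean squared error against an ideal match score of 1.0 as the second loss function, and α = β = 0.5 (all illustrative assumptions rather than values fixed by the method):

```python
import torch
import torch.nn.functional as F

ALPHA, BETA = 0.5, 0.5  # preset weights alpha and beta (assumed values)

def comprehensive_error(class_logits: torch.Tensor,
                        actual_category: torch.Tensor,
                        match_score: torch.Tensor) -> torch.Tensor:
    # First error L1: predicted problem category vs. actual problem category.
    l1 = F.cross_entropy(class_logits, actual_category)
    # Second error L2: matching result vs. the ideal matching result, which
    # is 1.0 because the log text and root cause come from the same sample.
    l2 = F.mse_loss(match_score, torch.ones_like(match_score))
    # Comprehensive error L = alpha * L1 + beta * L2.
    return ALPHA * l1 + BETA * l2
```

Calling backward() on the returned error and stepping an optimizer over the feature extraction model's parameters would then realize the adjustment described above.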
In addition, optionally, the computing device may further adjust the parameters of the classifier model according to the first error L1, and may adjust the parameters of the mathematical model in the matching module according to the second error L2.
In one embodiment, the matching module may be implemented by a first feature inference model, where the first feature inference model is configured to infer a root cause feature vector matching a log feature vector. Referring to fig. 4B, the computing device inputs the first log feature vector V1 into the first feature inference model to obtain a second root cause feature vector V1'; then calculates the similarity between the second root cause feature vector V1' and the first root cause feature vector V2; finally, the computing device obtains the matching result through the matching analysis module, or directly uses the obtained similarity as the matching result. For example, as shown in fig. 4B, the first feature inference model may multiply the first log feature vector V1 by at least one matrix (Q1 … Qn) to obtain the second root cause feature vector V1'.
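A numpy sketch of this matrix-chain inference, where the number of matrices (n = 3) and the feature dimension (128) are assumed purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
DIM = 128  # assumed feature dimension

# First feature inference model: a chain of matrices Q1 ... Qn applied to
# the first log feature vector V1 to infer a matching root cause vector.
Q_MATRICES = [rng.standard_normal((DIM, DIM)) for _ in range(3)]

def infer_root_cause_vector(v1: np.ndarray) -> np.ndarray:
    v = v1
    for q in Q_MATRICES:
        v = q @ v          # successive matrix multiplications
    return v               # second root cause feature vector V1'

v1 = rng.standard_normal(DIM)   # first log feature vector V1
v2 = rng.standard_normal(DIM)   # first root cause feature vector V2
v1_prime = infer_root_cause_vector(v1)
similarity = float(v1_prime @ v2 /
                   (np.linalg.norm(v1_prime) * np.linalg.norm(v2)))
```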
In another embodiment, the matching module may be implemented by a second feature inference model, where the second feature inference model is used to infer a log feature vector matching a root cause feature vector. Referring to fig. 4C, the computing device inputs the first root cause feature vector V2 into the second feature inference model to obtain a second log feature vector V2'; then calculates the similarity between the first log feature vector V1 and the second log feature vector V2'; finally, the computing device obtains the matching result through the matching analysis module, or directly uses the obtained similarity as the matching result. For example, as shown in fig. 4C, the second feature inference model may multiply the first root cause feature vector V2 by at least one matrix (P1 … Pm) to obtain the second log feature vector V2'.
In both of the above embodiments, the matching analysis module may convert the similarity into the representation form of the matching result. For example, when the matching result is represented as match or mismatch, the matching analysis module compares the similarity with a set similarity threshold: when the similarity is greater than or equal to the similarity threshold, the output matching result is a match; when the similarity is smaller than the similarity threshold, the output matching result is a mismatch. For another example, when the matching result is represented as a matching level, the matching analysis module determines which matching level the obtained similarity value falls into, and outputs that matching level as the matching result.
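The matching analysis module can then be as simple as the following sketch, where the 0.8 threshold and the level boundaries are assumed example values:

```python
def to_match_or_mismatch(similarity: float, threshold: float = 0.8) -> str:
    # Binary form: report a match when the similarity reaches the threshold.
    return "match" if similarity >= threshold else "mismatch"

def to_match_level(similarity: float) -> str:
    # Level form: map the similarity value onto coarse matching levels.
    if similarity >= 0.9:
        return "high"
    if similarity >= 0.6:
        return "medium"
    return "low"
```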
In yet another embodiment, the matching module may be implemented by a fully-connected neural network model. Referring to fig. 4D, the computing device inputs the first log feature vector V1 and the first root cause feature vector V2 into the fully-connected neural network model to obtain a correlation coefficient between them; finally, the computing device obtains the matching result through the matching analysis module, or directly uses the obtained correlation coefficient as the matching result. For example, the fully-connected neural network model may splice the first log feature vector V1 and the first root cause feature vector V2 into a vector V3 through a vector splicing function, and then pass the vector V3 through a correlation coefficient calculation function to obtain the correlation coefficient.
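A PyTorch sketch of such a fully-connected matching model; the hidden width, the sigmoid output, and the 128-dimensional inputs are illustrative assumptions:

```python
import torch
import torch.nn as nn

class FullyConnectedMatcher(nn.Module):
    def __init__(self, dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, 64),   # operates on the spliced vector V3
            nn.ReLU(),
            nn.Linear(64, 1),
            nn.Sigmoid(),             # correlation coefficient in (0, 1)
        )

    def forward(self, v1: torch.Tensor, v2: torch.Tensor) -> torch.Tensor:
        v3 = torch.cat([v1, v2], dim=-1)   # vector splicing function
        return self.net(v3).squeeze(-1)    # correlation coefficient calculation
```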
In the above embodiment, the matching analysis module may likewise convert the correlation coefficient into the representation form of the matching result. For example, when the matching result is represented as match or mismatch, the matching analysis module compares the correlation coefficient with a set correlation coefficient threshold: when the correlation coefficient is greater than or equal to the threshold, the output matching result is a match; when it is smaller than the threshold, the output matching result is a mismatch. For another example, when the matching result is represented as a matching level, the matching analysis module determines which matching level the obtained correlation coefficient value falls into, and outputs that matching level as the matching result.
To address the poor accuracy of the problem root cause calculated from log feature vectors predicted by a feature extraction model, the embodiment of the application further provides a problem root cause prediction method. The feature extraction model used in this method is trained using the method shown in fig. 3. The method may be applied to the computing device shown in fig. 2 and is executed by a processor in the computing device. The problem root cause prediction method provided by the embodiment of the present application is described in detail below with reference to fig. 5.
S501: and a processor in the computing equipment inputs the log text to be predicted into the feature extraction model to obtain the log feature vector to be predicted.
S502: the processor obtains a plurality of stored historical feature vector samples from the memory, wherein each historical feature vector sample comprises a log feature vector sample and a root cause feature vector sample.
To ensure the accuracy of the subsequently calculated correlation coefficient between the log feature vector to be predicted and each historical feature vector sample, optionally, the log feature vector sample and the root cause feature vector sample contained in each of the plurality of historical feature vector samples are obtained by the feature extraction model.
S503: and the processor calculates a correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector sample and the root factor feature vector sample contained in the log feature vector to be predicted and each historical feature vector sample. And the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and each historical feature vector sample.
In one embodiment, the processor may determine the correlation coefficient between the log feature vector to be predicted and the ith historical feature vector sample through the following steps, where the ith historical feature vector sample is any one of the plurality of historical feature vector samples:
b 1: the processor determines log feature vector samples and root feature vector samples contained in the ith historical feature vector sample.
b 2: and the processor calculates a first correlation coefficient according to the log feature vector to be predicted and the log feature vector sample, wherein the first correlation coefficient is the similarity between the log feature vector to be predicted and the log feature vector sample.
b 3: and the processor calculates a second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, wherein the second correlation coefficient is a correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample, and the correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and the root cause feature vector sample.
b 4: the processor takes the weighted sum of the first correlation coefficient and the second correlation coefficient as the correlation coefficient of the log feature vector to be predicted and the ith historical feature vector sample.
Optionally, in the above embodiment, the processor may execute step b3 in any one of the following manners:
Manner 1: The processor inputs the log feature vector to be predicted into a first feature inference model to obtain an inferred root cause feature vector matched with the log feature vector to be predicted; calculates the similarity between the inferred root cause feature vector and the root cause feature vector sample; and takes the similarity as the second correlation coefficient. The first feature inference model is used to infer root cause feature vectors matching log feature vectors.
Manner 2: The processor inputs the root cause feature vector sample into a second feature inference model to obtain an inferred log feature vector matched with the root cause feature vector sample; calculates the similarity between the inferred log feature vector and the log feature vector to be predicted; and takes the similarity as the second correlation coefficient. The second feature inference model is used to infer log feature vectors matching root cause feature vectors.
Manner 3: The processor inputs the log feature vector to be predicted and the root cause feature vector sample into a fully-connected neural network model to obtain the second correlation coefficient.
To improve the accuracy of the calculation results and reduce the memory space required in the computing device to store mathematical models, the mathematical model used by the processor when executing b3 may be the same as the mathematical model used in S304 of the embodiment shown in fig. 3.
S504: the processor determines a target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted in the plurality of historical feature vector samples.
S505: and the processor determines a target problem root corresponding to the target historical characteristic vector sample as a problem root prediction result of the log text to be predicted.
In addition, the embodiment of the present application may further predict the problem category and/or the solution of the log text to be predicted. That is, after the processor determines, in S504, the target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted, the processor may determine the target problem category corresponding to the target historical feature vector sample as the problem category prediction result of the log text to be predicted; and/or determine the target solution corresponding to the target historical feature vector sample as the solution prediction result of the log text to be predicted.
In this embodiment, after determining the target historical data sample corresponding to the target historical feature vector sample, the processor may respectively use the actual problem root cause (i.e., the target problem root cause), the actual problem category (i.e., the target problem category), or the actual solution (i.e., the target solution) contained in the target historical data sample as the problem root cause prediction result, the problem category prediction result, or the solution prediction result of the log text to be predicted.
The embodiment of the application provides a problem root cause prediction method. In the method, the computing device extracts the log feature vector to be predicted corresponding to the log text to be predicted by using the feature extraction model, and selects, from a plurality of historical feature vector samples, the target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted. Finally, the target problem root cause corresponding to the target historical feature vector sample is taken as the problem root cause prediction result of the log text to be predicted. Since the feature extraction model is trained by the method provided in the embodiment shown in fig. 3, the log feature vector to be predicted obtained from it is highly accurate. Furthermore, because the computing device computes each correlation coefficient from both the log feature vector sample and the root cause feature vector sample in a historical feature vector sample, the method also improves the accuracy of the correlation coefficients between the log feature vector to be predicted and the historical feature vector samples, and finally the accuracy of the problem root cause prediction result of the log text to be predicted. In summary, in the method, the computing device can effectively utilize the problem root cause and problem category information in the historical sample data during the problem root cause prediction of the log text to be predicted, thereby improving the accuracy of the prediction result.
Based on the embodiment of the problem root cause prediction method shown in fig. 5, the embodiment of the present application further provides a problem root cause prediction example. The problem root prediction process in this example is described below with reference to fig. 6A.
After obtaining the log text to be predicted, the computing device inputs the log text to be predicted into the feature extraction model to obtain the log feature vector to be predicted.
The computing device reads a plurality of historical feature vector samples from a memory and calculates the correlation coefficient of the log feature vector to be predicted and each historical feature vector sample through a correlation coefficient calculating module. Each historical feature vector sample comprises a log feature vector sample and a root cause feature vector sample, and the log feature vector sample and the root cause feature vector sample are obtained by inputting an actual log text and an actual problem root cause contained in a historical data sample corresponding to the historical feature vector sample into the feature extraction model respectively.
The computing device determines a target historical feature vector sample with the highest correlation coefficient with the to-be-predicted log feature vector from the plurality of historical feature vector samples.
The computing device determines the target historical data sample corresponding to the target historical feature vector sample, and then takes the actual problem root cause contained in the target historical data sample as the problem root cause prediction result. Optionally, the computing device may further use the actual problem category contained in the target historical data sample as the problem category prediction result, and/or use the actual solution contained in the target historical data sample as the solution prediction result.
In one embodiment, the correlation coefficient calculation module may calculate the correlation coefficient between the to-be-predicted log feature vector and the ith historical feature vector sample by a method as shown in fig. 6B, including the following steps:
The computing device calculates the similarity between the log feature vector to be predicted and the log feature vector sample through a first correlation coefficient calculation module, and uses the obtained similarity as the first correlation coefficient P1.
The computing device calculates the correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample through a second correlation coefficient calculation module, and takes the obtained correlation coefficient as the second correlation coefficient P2.
The computing device combines the first correlation coefficient P1 and the second correlation coefficient P2 through a weighted-sum calculation module to obtain the correlation coefficient between the log feature vector to be predicted and the ith historical feature vector sample.
In an embodiment, the second correlation coefficient calculation module may be implemented by a first feature inference model. Referring to fig. 6C, the computing device inputs the log feature vector to be predicted S1 into the first feature inference model to obtain an inferred root cause feature vector S1', and then calculates the similarity between the inferred root cause feature vector S1' and the root cause feature vector sample S2 through a similarity calculation module. Finally, the computing device may use the obtained similarity as the second correlation coefficient P2.
In another embodiment, the second correlation coefficient calculation module may be implemented by a second feature inference model. Referring to fig. 6D, the computing device inputs the root cause feature vector sample S2 into the second feature inference model to obtain an inferred log feature vector S2', and then calculates the similarity between the log feature vector to be predicted S1 and the inferred log feature vector S2' through the similarity calculation module. Finally, the computing device may use the obtained similarity as the second correlation coefficient P2.
In yet another embodiment, the second correlation coefficient calculation module may be implemented by a fully connected neural network model. Referring to fig. 6E, the computing device inputs the to-be-predicted log feature vector S1 and the root cause feature vector sample S2 into the fully-connected neural network model, and obtains a correlation coefficient therebetween. Finally, the computing device may use the obtained correlation coefficient as the second correlation coefficient P2. For example, the fully-connected neural network model may splice the to-be-predicted log feature vector S1 and the root cause feature vector sample S2 into a vector S3 through a vector splicing function, and then calculate the vector S3 through a correlation coefficient calculation function to obtain the correlation coefficient.
Based on the same technical concept, the embodiment of the present application further provides a computing device, where the computing device is configured to implement the feature extraction model training method shown in fig. 3. Referring to fig. 7, the computing device includes: an acquisition unit 701, a calculation unit 702, and an adjustment unit 703. The functions of the various elements of the computing device when implementing the method illustrated in fig. 3 are described below. The computing device may further include a storage unit, where the storage unit stores contents such as a mathematical model and sample data required to implement the above method.
An obtaining unit 701, configured to obtain a sample to be trained, where the sample to be trained includes a log text, an actual problem category, and an actual problem root cause;
a calculating unit 702, configured to input the log text into the feature extraction model to obtain a first log feature vector; input the actual problem root cause into the feature extraction model to obtain a first root cause feature vector; input the first log feature vector into a classifier model to obtain a predicted problem category; and match the first log feature vector with the first root cause feature vector to obtain a matching result;
an adjusting unit 703 is configured to adjust parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result.
In one possible implementation, when the first log feature vector and the first root cause feature vector are matched to obtain a matching result, the calculating unit 702 is specifically configured to:
inputting the first log feature vector into a first feature inference model to obtain a second root cause feature vector matched with the first log feature vector, where the first feature inference model is used to infer root cause feature vectors matching log feature vectors;
calculating the similarity between the first root cause feature vector and the second root cause feature vector;
and determining the matching result according to the similarity.
In one possible implementation, when the first log feature vector and the first root cause feature vector are matched to obtain a matching result, the calculating unit 702 is specifically configured to:
inputting the first root cause feature vector into a second feature inference model to obtain a second log feature vector matched with the first root cause feature vector, where the second feature inference model is used to infer log feature vectors matching root cause feature vectors;
calculating a similarity between the first log feature vector and the second log feature vector;
and determining the matching result according to the similarity.
In one possible implementation, when the first log feature vector and the first root cause feature vector are matched to obtain a matching result, the calculating unit 702 is specifically configured to:
inputting the first log feature vector and the first root cause feature vector into a fully-connected neural network model to obtain a correlation coefficient between the first log feature vector and the first root cause feature vector; the correlation coefficient is used for representing the degree of correlation between the first log feature vector and the first root cause feature vector;
and determining the matching result according to the correlation coefficient.
In a possible implementation manner, when adjusting the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result, the adjusting unit 703 is specifically configured to:
calculating a first error according to the predicted problem category and the actual problem category;
calculating a second error according to the matching result;
and adjusting parameters of the feature extraction model according to the first error and the second error.
Based on the same technical concept, the embodiment of the present application further provides a computing device, where the computing device is configured to implement the problem root cause prediction method shown in fig. 5. Referring to fig. 8, the computing device 800 includes: a prediction unit 801, an acquisition unit 802, a calculation unit 803, and a determination unit 804. The functions of the various elements of the computing device 800 when implementing the method of fig. 5 are described below. The computing device may further include a storage unit, where the storage unit stores contents such as a mathematical model and sample data required to implement the above method.
The prediction unit 801 is used for inputting the log text to be predicted into the feature extraction model to obtain the log feature vector to be predicted;
an obtaining unit 802, configured to obtain a plurality of stored historical feature vector samples, where each historical feature vector sample includes a log feature vector sample and a root cause feature vector sample;
a calculating unit 803, configured to calculate a correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample, where the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample is used to represent the degree of correlation between them; and to determine, from the plurality of historical feature vector samples, the target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted;
a determining unit 804, configured to determine the target problem root cause corresponding to the target historical feature vector sample as the problem root cause prediction result of the log text to be predicted.
In one possible implementation, the log feature vector samples and root cause feature vector samples included in each of the plurality of historical feature vector samples are obtained by the feature extraction model.
In a possible implementation manner, the calculating unit 803, when calculating the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample, is specifically configured to:
determining a log feature vector sample and a root cause feature vector sample contained in an ith historical feature vector sample, wherein the ith historical feature vector sample is any one of the plurality of historical feature vector samples;
calculating a first correlation coefficient according to the log feature vector to be predicted and the log feature vector sample, wherein the first correlation coefficient is the similarity between the log feature vector to be predicted and the log feature vector sample;
calculating a second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, wherein the second correlation coefficient is a correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample, and the correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and the root cause feature vector sample;
and taking the weighted sum of the first correlation coefficient and the second correlation coefficient as the correlation coefficient of the log feature vector to be predicted and the ith historical feature vector sample.
In a possible implementation manner, when the calculating unit 803 calculates the second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, it is specifically configured to:
inputting the log feature vector to be predicted into a first feature inference model to obtain an inferred root cause feature vector matched with the log feature vector to be predicted; calculating the similarity between the inferred root cause feature vector and the root cause feature vector sample, and taking the similarity as the second correlation coefficient, where the first feature inference model is used to infer root cause feature vectors matching log feature vectors; or
inputting the root cause feature vector sample into a second feature inference model to obtain an inferred log feature vector matched with the root cause feature vector sample; calculating the similarity between the inferred log feature vector and the log feature vector to be predicted, and taking the similarity as the second correlation coefficient, where the second feature inference model is used to infer log feature vectors matching root cause feature vectors; or
And inputting the log feature vector to be predicted and the root cause feature vector sample into a fully-connected neural network model to obtain the second correlation coefficient.
In a possible implementation, the determining unit 804 is further configured to:
determining a target problem category corresponding to the target historical feature vector sample as a problem category prediction result of the log text to be predicted; or
And determining a target solution corresponding to the target historical feature vector sample as a solution prediction result of the log text to be predicted.
It should be noted that the division of modules in the above embodiments of the present application is schematic and is only a logical function division; in actual implementation, other division manners are possible. In addition, each functional unit in the embodiments of the present application may be integrated into one processing unit, may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application may be substantially implemented or contributed by the prior art, or all or part of the technical solution may be embodied in a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Based on the same technical concept, the embodiment of the present application further provides a computing device, which is capable of implementing the method shown in fig. 3 or fig. 5. Referring to fig. 9, the computing device 900 includes: a processor 901 and a memory 902. Wherein the processor 901 and the memory 902 are connected to each other.
Optionally, the computing device 900 further includes a communication module 903. The communication module 903 is used for communicating with other devices. The communication module 903 may be any one or a combination of a communication interface, an RF circuit, a WiFi module, a bluetooth module, and the like, which is not limited in this application.
Optionally, the processor 901 and other components are connected to each other through a bus 904. The bus 904 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 9, but this does not indicate only one bus or one type of bus.
The processor 901 is configured to implement the method shown in fig. 3 or fig. 5, which may specifically refer to the description in the above embodiments, and is not described herein again.
The memory 902 is used for storing program instructions, data, and the like. In particular, the program instructions may include program code comprising computer operational instructions, data including various mathematical models, sample data, etc. required to implement the above methods. The memory 902 may include a Random Access Memory (RAM) and may also include a non-volatile memory (non-volatile memory), such as at least one disk memory. The processor 901 executes the program instructions stored in the memory 902 and uses the data stored in the memory 902 to implement the above functions, thereby implementing the methods provided by the above embodiments.
Based on the above embodiments, the present application further provides a computer program, which when running on a computer, causes the computer to execute the method provided by the embodiment shown in fig. 3 or fig. 5.
Based on the above embodiments, the present application also provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a computer, the computer program causes the computer to execute the method provided by the embodiment shown in fig. 3 or fig. 5.
Based on the above embodiments, the embodiments of the present application further provide a chip, where the chip is used to read a computer program stored in a memory, and implement the method provided by the embodiment shown in fig. 3 or fig. 5.
Based on the above embodiments, the present application provides a chip system, which includes a processor, and is used for supporting a computer device to implement the functions related to the computing device in the embodiments shown in fig. 3 or fig. 5. In one possible design, the system-on-chip further includes a memory for storing programs and data necessary for the computer device. The chip system may be constituted by a chip, or may include a chip and other discrete devices.
In summary, the present application provides a method and apparatus for training a feature extraction model. In the scheme, when a computing device trains a feature extraction model, a log text and an actual problem root cause in a sample to be trained are respectively input into the feature extraction model to obtain a first log feature vector and a first root cause feature vector, then the obtained first log feature vector is classified to obtain a predicted problem category, then the first log feature vector is matched with the first root cause feature vector to obtain a matching result, and finally parameters of the feature extraction model are adjusted according to the predicted problem category, the actual problem category and the matching result. In the scheme, the computing equipment can utilize the problem root cause and the problem category information of the historical sample to train the feature extraction model, so that the method can improve the accuracy of the feature extraction model, and further can improve the accuracy of the problem root cause obtained by computing according to the log feature vector predicted by using the feature extraction model.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (24)

1. A method for training a feature extraction model, the method comprising:
acquiring a sample to be trained, wherein the sample to be trained comprises a log text, an actual problem category and an actual problem root cause;
inputting the log text into the feature extraction model to obtain a first log feature vector; inputting the actual problem root cause into the feature extraction model to obtain a first root cause feature vector;
inputting the first log feature vector into a classifier model to obtain a predicted problem category;
matching the first log feature vector with the first root cause feature vector to obtain a matching result;
and adjusting parameters of the feature extraction model according to the predicted problem category, the actual problem category and the matching result.
2. The method of claim 1, wherein matching the first log feature vector and the first root cause feature vector to obtain a matching result comprises:
inputting the first log feature vector into a first feature inference model to obtain a second root cause feature vector matched with the first log feature vector, wherein the first feature inference model is used to infer root cause feature vectors matching log feature vectors;
calculating the similarity between the first root cause feature vector and the second root cause feature vector;
and determining the matching result according to the similarity.
3. The method of claim 1, wherein matching the first log feature vector and the first root cause feature vector to obtain a matching result comprises:
inputting the first root cause feature vector into a second feature inference model to obtain a second log feature vector matched with the first root cause feature vector, wherein the second feature inference model is used to infer log feature vectors matching root cause feature vectors;
calculating a similarity between the first log feature vector and the second log feature vector;
and determining the matching result according to the similarity.
4. The method of claim 1, wherein matching the first log feature vector and the first root cause feature vector to obtain a matching result comprises:
inputting the first log feature vector and the first root cause feature vector into a fully-connected neural network model to obtain a correlation coefficient between the first log feature vector and the first root cause feature vector; the correlation coefficient is used for representing the degree of correlation between the first log feature vector and the first root cause feature vector;
and determining the matching result according to the correlation coefficient.
5. The method of any one of claims 1-4, wherein adjusting the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result comprises:
calculating a first error according to the predicted problem category and the actual problem category;
calculating a second error according to the matching result;
and adjusting parameters of the feature extraction model according to the first error and the second error.
6. A method for problem root cause prediction, the method comprising:
inputting the log text to be predicted into a feature extraction model to obtain a log feature vector to be predicted;
obtaining a plurality of stored historical feature vector samples, wherein each historical feature vector sample comprises a log feature vector sample and a root cause feature vector sample;
calculating a correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample; wherein the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and each historical feature vector sample;
determining, from the plurality of historical feature vector samples, a target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted;
and determining a target problem root cause corresponding to the target historical feature vector sample as a problem root cause prediction result of the log text to be predicted.
7. The method of claim 6, wherein the log feature vector samples and root cause feature vector samples contained in each of the plurality of historical feature vector samples are obtained by the feature extraction model.
8. The method as claimed in claim 6 or 7, wherein calculating the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample comprises:
determining a log feature vector sample and a root cause feature vector sample contained in an ith historical feature vector sample, wherein the ith historical feature vector sample is any one of the plurality of historical feature vector samples;
calculating a first correlation coefficient according to the log feature vector to be predicted and the log feature vector sample, wherein the first correlation coefficient is the similarity between the log feature vector to be predicted and the log feature vector sample;
calculating a second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, wherein the second correlation coefficient is a correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample, and the correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and the root cause feature vector sample;
and taking the weighted sum of the first correlation coefficient and the second correlation coefficient as the correlation coefficient of the log feature vector to be predicted and the ith historical feature vector sample.
9. The method of claim 8, wherein calculating the second correlation coefficient according to the log feature vector to be predicted and the root-cause feature vector sample comprises:
inputting the log feature vector to be predicted into a first feature inference model to obtain an inferred root cause feature vector matched with the log feature vector to be predicted; calculating the similarity between the inferred root cause feature vector and the root cause feature vector sample, and taking the similarity as the second correlation coefficient; wherein the first feature inference model is used to infer root cause feature vectors matching log feature vectors; or
inputting the root cause feature vector sample into a second feature inference model to obtain an inferred log feature vector matched with the root cause feature vector sample; calculating the similarity between the inferred log feature vector and the log feature vector to be predicted, and taking the similarity as the second correlation coefficient; wherein the second feature inference model is used to infer log feature vectors matching root cause feature vectors; or
And inputting the log feature vector to be predicted and the root cause feature vector sample into a fully-connected neural network model to obtain the second correlation coefficient.
10. The method of any one of claims 6-9, further comprising:
determining a target problem category corresponding to the target historical feature vector sample as a problem category prediction result of the log text to be predicted; or
And determining a target solution corresponding to the target historical feature vector sample as a solution prediction result of the log text to be predicted.
11. A computing device, comprising:
an acquisition unit, used for acquiring a sample to be trained, wherein the sample to be trained comprises a log text, an actual problem category, and an actual problem root cause;
a calculation unit, used for inputting the log text into the feature extraction model to obtain a first log feature vector; inputting the actual problem root cause into the feature extraction model to obtain a first root cause feature vector; inputting the first log feature vector into a classifier model to obtain a predicted problem category; and matching the first log feature vector with the first root cause feature vector to obtain a matching result;
and the adjusting unit is used for adjusting the parameters of the feature extraction model according to the predicted problem category, the actual problem category and the matching result.
12. The computing device of claim 11, wherein the computing unit, when matching the first log feature vector and the first root cause feature vector to obtain a matching result, is specifically configured to:
inputting the first log feature vector into a first feature inference model to obtain a second root cause feature vector matched with the first log feature vector, wherein the first feature inference model is used to infer root cause feature vectors matching log feature vectors;
calculating the similarity between the first root cause feature vector and the second root cause feature vector;
and determining the matching result according to the similarity.
13. The computing device of claim 11, wherein the computing unit, when matching the first log feature vector and the first root cause feature vector to obtain a matching result, is specifically configured to:
inputting the first root cause feature vector into a second feature inference model to obtain a second log feature vector matched with the first root cause feature vector, wherein the second feature inference model is used to infer log feature vectors matching root cause feature vectors;
calculating a similarity between the first log feature vector and the second log feature vector;
and determining the matching result according to the similarity.
14. The computing device of claim 11, wherein the computing unit, when matching the first log feature vector and the first root cause feature vector to obtain a matching result, is specifically configured to:
inputting the first log feature vector and the first root cause feature vector into a fully-connected neural network model to obtain a correlation coefficient between the first log feature vector and the first root cause feature vector; the correlation coefficient is used for representing the degree of correlation between the first log feature vector and the first root cause feature vector;
and determining the matching result according to the correlation coefficient.
15. The computing device according to any one of claims 11 to 14, wherein the adjusting unit, when adjusting the parameters of the feature extraction model according to the predicted problem category, the actual problem category, and the matching result, is specifically configured to:
calculating a first error according to the predicted problem category and the actual problem category;
calculating a second error according to the matching result;
and adjusting parameters of the feature extraction model according to the first error and the second error.
16. A computing device, comprising:
a prediction unit, used for inputting a log text to be predicted into a feature extraction model to obtain a log feature vector to be predicted;
an acquisition unit, used for acquiring a plurality of stored historical feature vector samples, wherein each historical feature vector sample comprises a log feature vector sample and a root cause feature vector sample;
a calculation unit, used for calculating a correlation coefficient between the log feature vector to be predicted and each historical feature vector sample according to the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample, wherein the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and each historical feature vector sample; and for determining, from the plurality of historical feature vector samples, a target historical feature vector sample with the highest correlation coefficient with the log feature vector to be predicted;
and a determining unit, used for determining a target problem root cause corresponding to the target historical feature vector sample as a problem root cause prediction result of the log text to be predicted.
17. The computing device of claim 16, wherein the log feature vector samples and root cause feature vector samples contained in each of the plurality of historical feature vector samples are obtained by the feature extraction model.
18. The computing device according to claim 16 or 17, wherein the computing unit, when computing the correlation coefficient between the log feature vector to be predicted and each historical feature vector sample from the log feature vector to be predicted and the log feature vector sample and root cause feature vector sample contained in each historical feature vector sample, is specifically configured to:
determining a log feature vector sample and a root cause feature vector sample contained in an ith historical feature vector sample, wherein the ith historical feature vector sample is any one of the plurality of historical feature vector samples;
calculating a first correlation coefficient according to the log feature vector to be predicted and the log feature vector sample, wherein the first correlation coefficient is the similarity between the log feature vector to be predicted and the log feature vector sample;
calculating a second correlation coefficient according to the log feature vector to be predicted and the root cause feature vector sample, wherein the second correlation coefficient is a correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample, and the correlation coefficient between the log feature vector to be predicted and the root cause feature vector sample is used for representing the degree of correlation between the log feature vector to be predicted and the root cause feature vector sample;
and taking the weighted sum of the first correlation coefficient and the second correlation coefficient as the correlation coefficient of the log feature vector to be predicted and the ith historical feature vector sample.
19. The computing device according to claim 18, wherein the computing unit, when computing the second correlation coefficient based on the log feature vector to be predicted and the root cause feature vector sample, is specifically configured to:
inputting the log feature vector to be predicted into a first feature inference model to obtain an inferred root cause feature vector matched with the log feature vector to be predicted; calculating the similarity between the inferred root cause feature vector and the root cause feature vector sample, and taking the similarity as the second correlation coefficient; wherein the first feature inference model is used to infer root cause feature vectors matching log feature vectors; or
inputting the root cause feature vector sample into a second feature inference model to obtain an inferred log feature vector matched with the root cause feature vector sample; calculating the similarity between the inferred log feature vector and the log feature vector to be predicted, and taking the similarity as the second correlation coefficient; wherein the second feature inference model is used to infer log feature vectors matching root cause feature vectors; or
And inputting the log feature vector to be predicted and the root cause feature vector sample into a fully-connected neural network model to obtain the second correlation coefficient.
20. The computing device of any of claims 16-19, wherein the determination unit is further to:
determining a target problem category corresponding to the target historical feature vector sample as a problem category prediction result of the log text to be predicted; or
And determining a target solution corresponding to the target historical feature vector sample as a solution prediction result of the log text to be predicted.
21. A computing device, comprising:
a memory for storing program instructions and data;
a processor for invoking the program instructions and data stored in the memory for performing the method of any of claims 1-10.
22. A computer program, which, when run on a computer, causes the computer to perform the method of any one of claims 1-10.
23. A computer-readable storage medium, in which a computer program is stored which, when executed by a computer, causes the computer to carry out the method according to any one of claims 1 to 10.
24. A chip for reading a computer program stored in a memory for performing the method according to any one of claims 1 to 10.
CN201911252002.9A 2019-12-09 2019-12-09 Feature extraction model training method and device Active CN111178537B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911252002.9A CN111178537B (en) 2019-12-09 2019-12-09 Feature extraction model training method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911252002.9A CN111178537B (en) 2019-12-09 2019-12-09 Feature extraction model training method and device

Publications (2)

Publication Number Publication Date
CN111178537A true CN111178537A (en) 2020-05-19
CN111178537B CN111178537B (en) 2023-11-17

Family

ID=70651950

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911252002.9A Active CN111178537B (en) 2019-12-09 2019-12-09 Feature extraction model training method and device

Country Status (1)

Country Link
CN (1) CN111178537B (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105653444A (en) * 2015-12-23 2016-06-08 北京大学 Internet log data-based software defect failure recognition method and system
CN105824718A (en) * 2016-04-01 2016-08-03 北京大学 Automatic repairing method and automatic repairing system for software configuration fault based on question and answer website knowledge
US20190057159A1 (en) * 2017-08-15 2019-02-21 Beijing Baidu Netcom Science And Technology Co., Ltd. Method, apparatus, server, and storage medium for recalling for search
CN108205486A (en) * 2017-12-26 2018-06-26 上海中畅数据技术有限公司 A kind of intelligent distributed call chain tracking based on machine learning
US20190260778A1 (en) * 2018-02-19 2019-08-22 Nec Laboratories America, Inc. Unsupervised spoofing detection from traffic data in mobile networks
CN108449342A (en) * 2018-03-20 2018-08-24 北京搜狐互联网信息服务有限公司 Malicious requests detection method and device
CN109347827A (en) * 2018-10-22 2019-02-15 东软集团股份有限公司 Method, apparatus, equipment and the storage medium of attack prediction
CN109343995A (en) * 2018-10-25 2019-02-15 金税信息技术服务股份有限公司 Intelligent O&M analysis system based on multi-source heterogeneous data fusion, machine learning and customer service robot
CN110175272A (en) * 2019-05-21 2019-08-27 中国太平洋保险(集团)股份有限公司 One kind realizing the convergent control method of work order and control device based on feature modeling

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Philip J. Smith et al.: "Representation Aiding to Support Performance on Problem-Solving Tasks", Human Factors and Ergonomics, pages 74-106 *
计炜梁 (Ji Weiliang): "Research on fault prediction algorithms for wireless network logs based on deep learning", China Masters' Theses Full-text Database, Information Science and Technology, pages 136-780 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112087448A (en) * 2020-09-08 2020-12-15 南方电网科学研究院有限责任公司 Security log extraction method and device and computer equipment
CN112087448B (en) * 2020-09-08 2023-04-14 南方电网科学研究院有限责任公司 Security log extraction method and device and computer equipment
CN112560505A (en) * 2020-12-09 2021-03-26 北京百度网讯科技有限公司 Recognition method and device of conversation intention, electronic equipment and storage medium
CN112711665A (en) * 2021-01-18 2021-04-27 武汉大学 Log anomaly detection method based on density weighted integration rule
CN112711665B (en) * 2021-01-18 2022-04-15 武汉大学 Log anomaly detection method based on density weighted integration rule
WO2022266890A1 (en) * 2021-06-23 2022-12-29 华为技术有限公司 Failure reason determination method and apparatus

Also Published As

Publication number Publication date
CN111178537B (en) 2023-11-17

Similar Documents

Publication Publication Date Title
EP3955204A1 (en) Data processing method and apparatus, electronic device and storage medium
US10671933B2 (en) Method and apparatus for evaluating predictive model
CN111178537A (en) Feature extraction model training method and device
US11507963B2 (en) Method and device of analysis based on model, and computer readable storage medium
CN108833458B (en) Application recommendation method, device, medium and equipment
WO2020082734A1 (en) Text emotion recognition method and apparatus, electronic device, and computer non-volatile readable storage medium
CN115082920B (en) Deep learning model training method, image processing method and device
US11580425B2 (en) Managing defects in a model training pipeline using synthetic data sets associated with defect types
CN113222149A (en) Model training method, device, equipment and storage medium
US20180052441A1 (en) Simulation system, simulation method, and simulation program
WO2018036402A1 (en) Method and device for determining key variable in model
CN106156470B (en) Time series abnormity detection and labeling method and system
CN115203556A (en) Score prediction model training method and device, electronic equipment and storage medium
CN111026661A (en) Method and system for comprehensively testing usability of software
CN110955755A (en) Method and system for determining target standard information
CN117390170B (en) Method and device for matching data standards, electronic equipment and readable storage medium
CN113157538B (en) Spark operation parameter determination method, device, equipment and storage medium
CN116127948B (en) Recommendation method and device for text data to be annotated and electronic equipment
CN116881569A (en) User reading material recommendation method, device, equipment and medium
JP2022015275A (en) Setting management device, setting management method, and setting management program
CN111222695A (en) Electric load prediction method and device
CN111144665A (en) Optimized thermal load prediction method and device
CN114595630A (en) Activity effect evaluation model training method and device, computer equipment and medium
CN115952403A (en) Method and device for evaluating performance of object, electronic equipment and storage medium
CN115329081A (en) Model generation and article attribute value normalization method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20220216

Address after: 550025 Huawei cloud data center, jiaoxinggong Road, Qianzhong Avenue, Gui'an New District, Guiyang City, Guizhou Province

Applicant after: Huawei Cloud Computing Technology Co.,Ltd.

Address before: 518129 Bantian HUAWEI headquarters office building, Longgang District, Guangdong, Shenzhen

Applicant before: HUAWEI TECHNOLOGIES Co.,Ltd.

GR01 Patent grant